Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flake8-pyi] Implement autofix and handle nested unions with single element (PYI041 , PYI055) #14214

Closed
wants to merge 34 commits into from

Conversation

sbrugman
Copy link
Contributor

@sbrugman sbrugman commented Nov 8, 2024

Summary

PR to implement autofix for redundant-numeric-union (PYI041) and duplication-union-member (PYI016)
#14185

There appears to be an issue with the nested union detection when the union is unary:

>>> from typing import Union
>>> Union[int | int | float]
typing.Union[int, float]

>>> Union[Union[float, complex]]
typing.Union[float, complex]

This issue also affects at least PYI055.

Edit: the PR grew a bit bigger after finding issues with fix safety. I'll break up the PR into multiple for easier review.

Test Plan

Extension of the test cases and cargo test

#[derive_message_formats]
fn message(&self) -> String {
let (subtype, supertype) = match self.redundancy {
Redundancy::IntFloatComplex => ("int | float", "complex"),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Emitting multiple messages for the same union, such as "Use complex instead of int | complex" and "Use complex instead of float | complex" gets confusing when there is a fix presented:

-    def bad4(self, arg: Union[float | complex, int]) -> None: ...  # PYI041
+    def bad4(self, arg: complex) -> None: ...  # PYI041

With IntFloatComplex we can assign an unique violation for each combination.

let mut has_complex = false;
let mut has_int = false;
fn check_annotation<'a>(checker: &mut Checker, annotation: &'a Expr) {
let mut numeric_flags = NumericFlags::empty();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of has_*, I've used bitflags (which are used in some other parts of the code already).
(They correspond naturally to the order of Complex > Float > Int)

// Traverse the union a second time to construct the fix.
let mut flat_nodes: Vec<&Expr> = Vec::new();

let mut union_type = UnionLike::TypingUnion;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've chosen to do a second loop only when there is a violation, instead of collecting all expressions for any Union and filtering them later to avoid overhead in those cases.

Copy link
Contributor

github-actions bot commented Nov 8, 2024

ruff-ecosystem results

Linter (stable)

ℹ️ ecosystem check detected linter changes. (+0 -1 violations, +74 -0 fixes in 5 projects; 49 projects unchanged)

apache/airflow (+0 -0 violations, +58 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --no-preview --select ALL

- airflow/decorators/__init__.pyi:117:25: PYI041 Use `float` instead of `int | float`
+ airflow/decorators/__init__.pyi:117:25: PYI041 [*] Use `float` instead of `int | float`
- airflow/decorators/__init__.pyi:256:25: PYI041 Use `float` instead of `int | float`
+ airflow/decorators/__init__.pyi:256:25: PYI041 [*] Use `float` instead of `int | float`
- airflow/jobs/job.py:308:39: PYI041 Use `float` instead of `int | float`
+ airflow/jobs/job.py:308:39: PYI041 [*] Use `float` instead of `int | float`
- airflow/metrics/base_stats_logger.py:39:15: PYI041 Use `float` instead of `int | float`
+ airflow/metrics/base_stats_logger.py:39:15: PYI041 [*] Use `float` instead of `int | float`
- airflow/metrics/base_stats_logger.py:50:15: PYI041 Use `float` instead of `int | float`
+ airflow/metrics/base_stats_logger.py:50:15: PYI041 [*] Use `float` instead of `int | float`
- airflow/metrics/base_stats_logger.py:61:15: PYI041 Use `float` instead of `int | float`
+ airflow/metrics/base_stats_logger.py:61:15: PYI041 [*] Use `float` instead of `int | float`
- airflow/metrics/datadog_logger.py:113:16: PYI041 Use `float` instead of `int | float`
+ airflow/metrics/datadog_logger.py:113:16: PYI041 [*] Use `float` instead of `int | float`
- airflow/metrics/otel_logger.py:246:16: PYI041 Use `float` instead of `int | float`
+ airflow/metrics/otel_logger.py:246:16: PYI041 [*] Use `float` instead of `int | float`
- airflow/metrics/statsd_logger.py:117:16: PYI041 Use `float` instead of `int | float`
+ airflow/metrics/statsd_logger.py:117:16: PYI041 [*] Use `float` instead of `int | float`
- airflow/utils/timezone.py:240:26: PYI041 Use `float` instead of `int | float`
+ airflow/utils/timezone.py:240:26: PYI041 [*] Use `float` instead of `int | float`
- airflow/utils/timezone.py:305:16: PYI041 Use `float` instead of `int | float`
+ airflow/utils/timezone.py:305:16: PYI041 [*] Use `float` instead of `int | float`
- performance/src/performance_dags/performance_dag/performance_dag_utils.py:175:27: PYI041 Use `float` instead of `int | float`
+ performance/src/performance_dags/performance_dag/performance_dag_utils.py:175:27: PYI041 [*] Use `float` instead of `int | float`
- performance/src/performance_dags/performance_dag/performance_dag_utils.py:337:31: PYI041 Use `float` instead of `int | float`
+ performance/src/performance_dags/performance_dag/performance_dag_utils.py:337:31: PYI041 [*] Use `float` instead of `int | float`
- providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:273:16: PYI041 Use `float` instead of `int | float`
+ providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:273:16: PYI041 [*] Use `float` instead of `int | float`
- providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:302:56: PYI041 Use `float` instead of `int | float`
+ providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:302:56: PYI041 [*] Use `float` instead of `int | float`
- providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:325:57: PYI041 Use `float` instead of `int | float`
+ providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:325:57: PYI041 [*] Use `float` instead of `int | float`
- providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:516:27: PYI041 Use `float` instead of `int | float`
+ providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:516:27: PYI041 [*] Use `float` instead of `int | float`
- providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:516:47: PYI041 Use `float` instead of `int | float`
+ providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:516:47: PYI041 [*] Use `float` instead of `int | float`
- providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:516:72: PYI041 Use `float` instead of `int | float`
+ providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:516:72: PYI041 [*] Use `float` instead of `int | float`
... 20 additional changes omitted for project

apache/superset (+0 -0 violations, +10 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --no-preview --select ALL

- superset/daos/query.py:68:52: PYI041 Use `float` instead of `int | float`
+ superset/daos/query.py:68:52: PYI041 [*] Use `float` instead of `int | float`
- superset/models/helpers.py:900:39: PYI016 Duplicate union member `str`
+ superset/models/helpers.py:900:39: PYI016 [*] Duplicate union member `str`
- superset/utils/cache.py:150:14: PYI041 Use `float` instead of `int | float`
+ superset/utils/cache.py:150:14: PYI041 [*] Use `float` instead of `int | float`
- superset/utils/core.py:349:24: PYI041 Use `float` instead of `int | float`
+ superset/utils/core.py:349:24: PYI041 [*] Use `float` instead of `int | float`
... 3 additional changes omitted for rule PYI041

bokeh/bokeh (+0 -0 violations, +6 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --no-preview --select ALL

- src/bokeh/core/property/factors.py:51:56: PYI016 Duplicate union member `tuple[str, str]`
+ src/bokeh/core/property/factors.py:51:56: PYI016 [*] Duplicate union member `tuple[str, str]`
- src/bokeh/core/property/factors.py:52:85: PYI016 Duplicate union member `tp.Sequence[tuple[str, str]]`
+ src/bokeh/core/property/factors.py:52:85: PYI016 [*] Duplicate union member `tp.Sequence[tuple[str, str]]`
- src/bokeh/plotting/contour.py:43:88: PYI016 Duplicate union member `ContourColor`
+ src/bokeh/plotting/contour.py:43:88: PYI016 [*] Duplicate union member `ContourColor`

pandas-dev/pandas (+0 -0 violations, +0 -0 fixes)


python/typeshed (+0 -1 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --no-preview --select E,F,FA,I,PYI,RUF,UP,W

- stdlib/statistics.pyi:160:15: PYI041 Use `float` instead of `int | float`

Changes by rule (2 rules affected)

code total + violation - violation + fix - fix
PYI041 67 0 1 66 0
PYI016 8 0 0 8 0

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+0 -5 violations, +74 -0 fixes in 4 projects; 50 projects unchanged)

apache/airflow (+0 -0 violations, +58 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

- airflow/decorators/__init__.pyi:117:25: PYI041 Use `float` instead of `int | float`
+ airflow/decorators/__init__.pyi:117:25: PYI041 [*] Use `float` instead of `int | float`
- airflow/decorators/__init__.pyi:256:25: PYI041 Use `float` instead of `int | float`
+ airflow/decorators/__init__.pyi:256:25: PYI041 [*] Use `float` instead of `int | float`
- airflow/jobs/job.py:308:39: PYI041 Use `float` instead of `int | float`
+ airflow/jobs/job.py:308:39: PYI041 [*] Use `float` instead of `int | float`
- airflow/metrics/base_stats_logger.py:39:15: PYI041 Use `float` instead of `int | float`
+ airflow/metrics/base_stats_logger.py:39:15: PYI041 [*] Use `float` instead of `int | float`
- airflow/metrics/base_stats_logger.py:50:15: PYI041 Use `float` instead of `int | float`
+ airflow/metrics/base_stats_logger.py:50:15: PYI041 [*] Use `float` instead of `int | float`
- airflow/metrics/base_stats_logger.py:61:15: PYI041 Use `float` instead of `int | float`
+ airflow/metrics/base_stats_logger.py:61:15: PYI041 [*] Use `float` instead of `int | float`
- airflow/metrics/datadog_logger.py:113:16: PYI041 Use `float` instead of `int | float`
+ airflow/metrics/datadog_logger.py:113:16: PYI041 [*] Use `float` instead of `int | float`
- airflow/metrics/otel_logger.py:246:16: PYI041 Use `float` instead of `int | float`
+ airflow/metrics/otel_logger.py:246:16: PYI041 [*] Use `float` instead of `int | float`
- airflow/metrics/statsd_logger.py:117:16: PYI041 Use `float` instead of `int | float`
+ airflow/metrics/statsd_logger.py:117:16: PYI041 [*] Use `float` instead of `int | float`
- airflow/utils/timezone.py:240:26: PYI041 Use `float` instead of `int | float`
+ airflow/utils/timezone.py:240:26: PYI041 [*] Use `float` instead of `int | float`
- airflow/utils/timezone.py:305:16: PYI041 Use `float` instead of `int | float`
+ airflow/utils/timezone.py:305:16: PYI041 [*] Use `float` instead of `int | float`
- performance/src/performance_dags/performance_dag/performance_dag_utils.py:175:27: PYI041 Use `float` instead of `int | float`
+ performance/src/performance_dags/performance_dag/performance_dag_utils.py:175:27: PYI041 [*] Use `float` instead of `int | float`
- performance/src/performance_dags/performance_dag/performance_dag_utils.py:337:31: PYI041 Use `float` instead of `int | float`
+ performance/src/performance_dags/performance_dag/performance_dag_utils.py:337:31: PYI041 [*] Use `float` instead of `int | float`
- providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:273:16: PYI041 Use `float` instead of `int | float`
+ providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:273:16: PYI041 [*] Use `float` instead of `int | float`
- providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:302:56: PYI041 Use `float` instead of `int | float`
+ providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:302:56: PYI041 [*] Use `float` instead of `int | float`
- providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:325:57: PYI041 Use `float` instead of `int | float`
+ providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:325:57: PYI041 [*] Use `float` instead of `int | float`
- providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:516:27: PYI041 Use `float` instead of `int | float`
+ providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:516:27: PYI041 [*] Use `float` instead of `int | float`
- providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:516:47: PYI041 Use `float` instead of `int | float`
+ providers/src/airflow/providers/amazon/aws/hooks/batch_client.py:516:47: PYI041 [*] Use `float` instead of `int | float`
... 22 additional changes omitted for project

apache/superset (+0 -0 violations, +10 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

- superset/daos/query.py:68:52: PYI041 Use `float` instead of `int | float`
+ superset/daos/query.py:68:52: PYI041 [*] Use `float` instead of `int | float`
- superset/models/helpers.py:900:39: PYI016 Duplicate union member `str`
+ superset/models/helpers.py:900:39: PYI016 [*] Duplicate union member `str`
- superset/utils/cache.py:150:14: PYI041 Use `float` instead of `int | float`
+ superset/utils/cache.py:150:14: PYI041 [*] Use `float` instead of `int | float`
- superset/utils/core.py:349:24: PYI041 Use `float` instead of `int | float`
+ superset/utils/core.py:349:24: PYI041 [*] Use `float` instead of `int | float`
... 3 additional changes omitted for rule PYI041

bokeh/bokeh (+0 -0 violations, +6 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

- src/bokeh/core/property/factors.py:51:56: PYI016 Duplicate union member `tuple[str, str]`
+ src/bokeh/core/property/factors.py:51:56: PYI016 [*] Duplicate union member `tuple[str, str]`
- src/bokeh/core/property/factors.py:52:85: PYI016 Duplicate union member `tp.Sequence[tuple[str, str]]`
+ src/bokeh/core/property/factors.py:52:85: PYI016 [*] Duplicate union member `tp.Sequence[tuple[str, str]]`
- src/bokeh/plotting/contour.py:43:88: PYI016 Duplicate union member `ContourColor`
+ src/bokeh/plotting/contour.py:43:88: PYI016 [*] Duplicate union member `ContourColor`

python/typeshed (+0 -5 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select E,F,FA,I,PYI,RUF,UP,W

- stdlib/random.pyi:45:31: PYI041 Use `float` instead of `int | float`
- stdlib/random.pyi:52:27: PYI041 Use `float` instead of `int | float`
- stdlib/statistics.pyi:160:15: PYI041 Use `float` instead of `int | float`
- stdlib/turtle.pyi:443:16: PYI041 Use `float` instead of `int | float`
- stdlib/turtle.pyi:444:17: PYI041 Use `float` instead of `int | float`

Changes by rule (2 rules affected)

code total + violation - violation + fix - fix
PYI041 71 0 5 66 0
PYI016 8 0 0 8 0

impl Redundancy {
pub(super) fn from_numeric_flags(numeric_flags: NumericFlags) -> Option<Self> {
match numeric_flags.bits() {
0b0110 => Some(Self::FloatComplex),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There must be a more elegant way of doing this, but I haven't found it yet.


// Generate a [`Fix`] for two or more type expressions, e.g. `int | float | complex`.
fn generate_bit_or_fix(checker: &Checker, nodes: Vec<&Expr>, annotation: &Expr) -> Fix {
debug_assert!(nodes.len() >= 2, "At least two nodes required");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code above handles nodes.len() < 2 separately and thus this function should never encounter these cases. Is debug_assert! ok to denote this?

type_exprs.push(slice);
} else {
other_exprs.push(expr);
let mut has_pep604_union = false;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than using the same Union syntax as the top-level union, I check if any PEP604-style union is encountered. If so, then we use that. (a | b or typing.Union[a, b])

@sbrugman
Copy link
Contributor Author

Raises as separate PRs to reduce the review load in #14280, #14275, #14273, #14272, #14270 and #14268.

@sbrugman sbrugman closed this Nov 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant