Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ruff] Implemented used-dummy-variable (RUF052) #14611

Merged
merged 22 commits into from
Dec 3, 2024

Conversation

Lokejoke
Copy link
Contributor

Summary

This PR implements wps rule unused-variable-accessed; discussed here.

Test Plan

cargo test

Copy link
Contributor

github-actions bot commented Nov 26, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+761 -1 violations, +0 -0 fixes in 19 projects; 36 projects unchanged)

DisnakeDev/disnake (+21 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ disnake/errors.py:80:17: RUF052 [*] Local dummy variable `_errors` is accessed
+ disnake/ext/commands/converter.py:1189:33: RUF052 [*] Local dummy variable `_GenericAlias` is accessed
+ disnake/ext/commands/converter.py:1294:9: RUF052 [*] Local dummy variable `_NoneType` is accessed
+ disnake/ext/commands/help.py:211:37: RUF052 [*] Local dummy variable `_original` is accessed
+ disnake/ext/commands/help.py:217:38: RUF052 [*] Local dummy variable `_original` is accessed
+ disnake/ext/commands/view.py:118:13: RUF052 [*] Local dummy variable `_escaped_quotes` is accessed
+ disnake/guild.py:5159:9: RUF052 [*] Local dummy variable `_id` is accessed
+ disnake/guild.py:5309:9: RUF052 [*] Local dummy variable `_id` is accessed
+ disnake/guild.py:5357:9: RUF052 [*] Local dummy variable `_id` is accessed
+ disnake/guild.py:5405:9: RUF052 [*] Local dummy variable `_id` is accessed
... 11 additional changes omitted for project

RasaHQ/rasa (+84 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ rasa/api.py:42:5: RUF052 [*] Local dummy variable `_endpoints` is accessed
+ rasa/core/actions/action.py:1218:9: RUF052 [*] Local dummy variable `_tracker` is accessed
+ rasa/core/actions/action.py:603:9: RUF052 [*] Local dummy variable `_events` is accessed
+ rasa/core/actions/forms.py:274:9: RUF052 [*] Local dummy variable `_tracker` is accessed
+ rasa/core/actions/forms.py:276:9: RUF052 [*] Local dummy variable `_action` is accessed
+ rasa/core/actions/two_stage_fallback.py:85:9: RUF052 [*] Local dummy variable `_user_clarified` is accessed
+ rasa/core/exporter.py:235:13: RUF052 [*] Local dummy variable `_events` is accessed
+ rasa/core/policies/ted_policy.py:1786:21: RUF052 [*] Local dummy variable `_text_output` is accessed
+ rasa/core/policies/ted_policy.py:1787:21: RUF052 [*] Local dummy variable `_text_sequence_lengths` is accessed
+ rasa/core/policies/ted_policy.py:2144:9: RUF052 [*] Local dummy variable `_logits` is accessed
... 74 additional changes omitted for project

apache/airflow (+238 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

+ airflow/api_connexion/endpoints/pool_endpoint.py:135:9: RUF052 [*] Local dummy variable `_patch_body` is accessed
+ airflow/assets/manager.py:280:5: RUF052 [*] Local dummy variable `_asset_manager_class` is accessed
+ airflow/assets/manager.py:285:5: RUF052 [*] Local dummy variable `_asset_manager_kwargs` is accessed
+ airflow/cli/commands/connection_command.py:152:5: RUF052 [*] Local dummy variable `_connection_types` is accessed
+ airflow/cli/commands/info_command.py:263:9: RUF052 [*] Local dummy variable `_locale` is accessed
+ airflow/cli/commands/task_command.py:456:9: RUF052 [*] Local dummy variable `_dag` is accessed
+ airflow/configuration.py:1300:13: RUF052 [*] Local dummy variable `_section` is accessed
+ airflow/configuration.py:887:9: RUF052 [*] Local dummy variable `_extra_stacklevel` is accessed
+ airflow/dag_processing/processor.py:237:26: RUF052 [*] Local dummy variable `_child_channel` is accessed
+ airflow/dag_processing/processor.py:237:9: RUF052 [*] Local dummy variable `_parent_channel` is accessed
+ airflow/decorators/base.py:483:9: RUF052 [*] Local dummy variable `_MappedOperator` is accessed
+ airflow/decorators/setup_teardown.py:42:19: RUF052 [*] Local dummy variable `_func` is accessed
+ airflow/executors/executor_loader.py:211:13: RUF052 [*] Local dummy variable `_executor_name` is accessed
+ airflow/io/path.py:394:9: RUF052 [*] Local dummy variable `_kwargs` is accessed
+ airflow/io/path.py:408:9: RUF052 [*] Local dummy variable `_kwargs` is accessed
... 223 additional changes omitted for project

apache/superset (+42 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

+ superset/common/query_object_factory.py:47:9: RUF052 [*] Local dummy variable `_datasource_dao` is accessed
+ superset/connectors/sqla/models.py:461:17: RUF052 [*] Local dummy variable `_columns` is accessed
+ superset/db_engine_specs/presto.py:149:13: RUF052 [*] Local dummy variable `_column` is accessed
+ superset/migrations/versions/2018-11-12_13-31_4ce8df208545_migrate_time_range_for_default_filters.py:71:29: RUF052 [*] Local dummy variable `__from` is accessed
+ superset/migrations/versions/2018-11-12_13-31_4ce8df208545_migrate_time_range_for_default_filters.py:72:29: RUF052 [*] Local dummy variable `__to` is accessed
+ superset/models/helpers.py:1648:21: RUF052 [*] Local dummy variable `_sql` is accessed
+ superset/models/helpers.py:1649:21: RUF052 [*] Local dummy variable `_column_label` is accessed
+ superset/models/helpers.py:1896:25: RUF052 [*] Local dummy variable `_since` is accessed
+ superset/models/helpers.py:1896:33: RUF052 [*] Local dummy variable `_until` is accessed
+ superset/utils/date_parser.py:180:5: RUF052 [*] Local dummy variable `_relative_start` is accessed
... 32 additional changes omitted for project

bokeh/bokeh (+21 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

+ src/bokeh/application/handlers/code_runner.py:219:9: RUF052 [*] Local dummy variable `_cwd` is accessed
+ src/bokeh/application/handlers/code_runner.py:220:9: RUF052 [*] Local dummy variable `_sys_path` is accessed
+ src/bokeh/application/handlers/code_runner.py:221:9: RUF052 [*] Local dummy variable `_sys_argv` is accessed
+ src/bokeh/command/subcommands/static.py:79:9: RUF052 [*] Local dummy variable `_allowed_keys` is accessed
+ src/bokeh/core/has_props.py:521:28: RUF052 [*] Local dummy variable `_with_props` is accessed
+ src/bokeh/core/property/bases.py:450:9: RUF052 [*] Local dummy variable `_type_params` is accessed
+ src/bokeh/embed/standalone.py:302:15: RUF052 [*] Local dummy variable `_always_new` is accessed
+ src/bokeh/layouts.py:117:5: RUF052 [*] Local dummy variable `_children` is accessed
+ src/bokeh/layouts.py:152:5: RUF052 [*] Local dummy variable `_children` is accessed
+ src/bokeh/layouts.py:190:5: RUF052 [*] Local dummy variable `_children` is accessed
... 11 additional changes omitted for project

ibis-project/ibis (+34 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ .github/workflows/algolia/upload-algolia-api.py:174:9: RUF052 [*] Local dummy variable `_creator` is accessed
+ ibis/backends/datafusion/__init__.py:699:31: RUF052 [*] Local dummy variable `_conn` is accessed
+ ibis/backends/datafusion/__init__.py:732:33: RUF052 [*] Local dummy variable `_conn` is accessed
+ ibis/backends/datafusion/__init__.py:739:33: RUF052 [*] Local dummy variable `_conn` is accessed
+ ibis/backends/datafusion/__init__.py:746:33: RUF052 [*] Local dummy variable `_conn` is accessed
+ ibis/backends/datafusion/__init__.py:753:40: RUF052 [*] Local dummy variable `_conn` is accessed
+ ibis/backends/datafusion/__init__.py:760:38: RUF052 [*] Local dummy variable `_conn` is accessed
+ ibis/backends/datafusion/__init__.py:767:37: RUF052 [*] Local dummy variable `_conn` is accessed
+ ibis/backends/datafusion/__init__.py:774:37: RUF052 [*] Local dummy variable `_conn` is accessed
+ ibis/backends/datafusion/__init__.py:781:47: RUF052 [*] Local dummy variable `_conn` is accessed
... 24 additional changes omitted for project

latchbio/latch (+6 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ src/latch_cli/services/get_params.py:140:17: RUF052 [*] Local dummy variable `_enum_literal` is accessed
+ src/latch_cli/services/get_params.py:146:21: RUF052 [*] Local dummy variable `_enum_literal` is accessed
+ src/latch_cli/services/launch.py:143:5: RUF052 [*] Local dummy variable `_interface_request` is accessed
+ src/latch_cli/services/launch.py:208:5: RUF052 [*] Local dummy variable `_interface_request` is accessed
+ src/latch_cli/services/register/utils.py:127:5: RUF052 [*] Local dummy variable `_env` is accessed
+ src/latch_cli/services/register/utils.py:138:5: RUF052 [*] Local dummy variable `_serialize_cmd` is accessed

lnbits/lnbits (+5 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ tests/wallets/fixtures/models.py:35:9: RUF052 [*] Local dummy variable `_mock` is accessed
+ tests/wallets/helpers.py:117:13: RUF052 [*] Local dummy variable `_assert` is accessed
+ tests/wallets/helpers.py:80:36: RUF052 [*] Local dummy variable `_test_data` is accessed
+ tests/wallets/test_rpc_wallets.py:159:17: RUF052 [*] Local dummy variable `_mock_class` is accessed
+ tests/wallets/test_rpc_wallets.py:66:9: RUF052 [*] Local dummy variable `_mock_class` is accessed

milvus-io/pymilvus (+16 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ pymilvus/bulk_writer/local_bulk_writer.py:111:9: RUF052 [*] Local dummy variable `_async` is accessed
+ pymilvus/client/abstract.py:237:9: RUF052 [*] Local dummy variable `_dict` is accessed
+ pymilvus/client/abstract.py:818:13: RUF052 [*] Local dummy variable `_start` is accessed
+ pymilvus/client/abstract.py:819:13: RUF052 [*] Local dummy variable `_end` is accessed
+ pymilvus/client/abstract.py:820:13: RUF052 [*] Local dummy variable `_step` is accessed
+ pymilvus/client/abstract.py:90:9: RUF052 [*] Local dummy variable `_dict` is accessed
+ pymilvus/client/grpc_handler.py:1172:9: RUF052 [*] Local dummy variable `_async` is accessed
+ pymilvus/client/grpc_handler.py:962:9: RUF052 [*] Local dummy variable `_async` is accessed
+ pymilvus/decorators.py:56:13: RUF052 [*] Local dummy variable `_timeout` is accessed
+ pymilvus/decorators.py:57:13: RUF052 [*] Local dummy variable `_retry_times` is accessed
... 6 additional changes omitted for project

... Truncated remaining completed project reports due to GitHub comment length restrictions

Changes by rule (2 rules affected)

code total + violation - violation + fix - fix
RUF052 760 760 0 0 0
DOC201 2 1 1 0 0

@Lokejoke Lokejoke marked this pull request as ready for review November 26, 2024 13:58
@dylwil3 dylwil3 added the rule Implementing or modifying a lint rule label Nov 26, 2024
Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. This rule does make sense to me but I'm a bit concerned by the number of violations. What do you think @AlexWaygood?

I think the rule should respect https://docs.astral.sh/ruff/settings/#lint_dummy-variable-rgx as it is the "opposite" of the unused variable rule

I think we should change the rule to report the use of the binding rather than the variable declaration because I noticed when reviewing the ecosystem changes, that I always had to search for the variable use to assess whether the violation is correct. Doing so would match clippy's behavior

We should exclude bindings that are declared as global or nonlocal because it's not in the functions control to change the name https://github.com/RasaHQ/rasa/blob/b8de3b231126747ff74b2782cb25cb22d2d898d7/rasa/core/jobs.py#L22

@MichaReiser MichaReiser added the needs-decision Awaiting a decision from a maintainer label Nov 27, 2024
@AlexWaygood
Copy link
Member

This rule makes sense to me as well. I agree with @MichaReiser's comment here however:

We should exclude bindings that are declared as global or nonlocal because it's not in the functions control to change the name RasaHQ/rasa@b8de3b2/rasa/core/jobs.py#L22

Here's another interesting case: https://github.com/DisnakeDev/disnake/blob/2d0f91ad7042d4b331d448678fc6f88b74973947/disnake/guild.py#L5119. I'm not sure whether the rule should or shouldn't flag this -- they've used the leading underscore here specifically because otherwise the variable would shadow a Python builtin. PEP 8 recommends using a trailing underscore for situations like this rather than a leading underscore (id_ rather than _id), but it still seems to me like this is a pretty common pattern, and users might be confused if we complained about it. One solution might be to use a different error message if we can see that it would otherwise shadow a builtin, suggesting for them to use a trailing underscore rather than a leading underscore? Ideally I think we'd also emit that different error message if it would otherwise shadow any other variable, including globals, locals and nonlocals, not just builtins.

If we want to go that route, you can use this function to detect whether the symbol would shadow a Python builtin without the leading underscore:

/// Returns `true` if the given name is that of a Python builtin.
///
/// Intended to be kept in sync with [`python_builtins`].
pub fn is_python_builtin(name: &str, minor_version: u8, is_notebook: bool) -> bool {
And I believe that there are other methods on the semantic model that you can use to detect if a variables is bound in the current scope any enclosing scopes.

@MichaReiser MichaReiser removed the needs-decision Awaiting a decision from a maintainer label Nov 28, 2024
Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! This is starting to look pretty good

@Lokejoke
Copy link
Contributor Author

It's a pity that the dummy regex, by default, marks every name with a leading underscore as a dummy variable.

@AlexWaygood
Copy link
Member

It's a pity that the dummy regex, by default, marks every name with a leading underscore as a dummy variable.

Oh -- I think @MichaReiser possibly was suggesting for you to do the opposite of what you're doing there right now? It makes sense to me that this rule would flag all variables that match the dummy regex (in addition/instead of flagging all variables that have leading underscores) rather than not flagging all variables that match the dummy regex.

@Lokejoke
Copy link
Contributor Author

Ok, this does make sense.

@MichaReiser
Copy link
Member

We discussed the rule more internally and concluded that, considering the many violations, we should provide an auto fix.

You could reuse some logic from here

if checker.semantic().is_available("AbstractSet") {
diagnostic.try_set_fix(|| {
let semantic = checker.semantic();
let scope = &semantic.scopes[binding.scope];
let (edit, rest) =
Renamer::rename(name, "AbstractSet", scope, semantic, checker.stylist())?;
let applicability = determine_applicability(binding, scope, checker);
Ok(Fix::applicable_edits(edit, rest, applicability))
});
}

We must be careful only to offer a fix if the dummy-variable-regex allows variables starting with an underscore.

@Lokejoke
Copy link
Contributor Author

Lokejoke commented Nov 29, 2024

Do I list all options? For each dummy, we would match against:
_builtin => builtin_ | no fix if shadows
_var => var | var_ (if var is occupied) | no fix if shadows
dummy => no fix

Should we go back to one diagnostic with one fix renaming the variable?

@MichaReiser
Copy link
Member

Should we go back to one diagnostic with one fix renaming the variable?

The fix infrastructure should detect that it already applied the fix (or at least, it detects that the two fixes overlap) and skips it. It then rechecks the file, which should not yield any violations after the rename.

What we should do is set Diagnostic::set_parent so that all violations can be suppressed with a single noqa comment where the variable's defined

builtin => builtin | no fix if shadows
var => var | var (if var is occupied) | no fix if shadows
dummy => no fix

Generatign one fix each should be fine

  • builtins: Change from _builtin to -> bulitin_
  • shadowed: Change from _shadowed to -> shadowed_
  • dummy: Change from _dummy to -> dummy

But we can also start with adding a fix for dummy only, and then check the ecosystem results to see if it covers most variables.

Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @Lokejoke!

Comment on lines 106 to 119
let mut diagnostics: Vec<Diagnostic> = binding
.references
.iter()
.map(|ref_id| {
Diagnostic::new(
DummyVariableAccessed {
name: name.to_string(),
fix_kind: possible_fix_kind,
},
checker.semantic().reference(*ref_id).range(),
)
.with_parent(binding.start())
})
.collect();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MichaReiser I know you earlier said:

I think we should change the rule to report the use of the binding rather than the variable declaration because I noticed when reviewing the ecosystem changes, that I always had to search for the variable use to assess whether the violation is correct. Doing so would match clippy's behavior

I think the disadvantages of that are:

  • The error here is really in the naming of the variable, not the fact that the variable is used. It's the name of the variable that we're suggesting to the user that they should change, and it's the name that we change for the user as part of the autofix. As such, I find it somewhat surprising that we'd issue diagnostics that point to the uses of the name rather than the name
  • If the user uses a dummy variable 5 times, I think this implementation currently means that we'd issue 5 diagnostics regarding the dummy variable. But arguably there's only one mistake here, which is the name of the variable at the point where it's bound.

Issuing a single diagnostic at the point where the variable is bound also has the advantage of being simpler and more performant in terms of our implementation

Copy link
Member

@MichaReiser MichaReiser Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error here is really in the naming of the variable, not the fact that the variable is used.

Not entirely. The error is the combination of the two. There are two possible errors:

  1. You used the wrong variable
  2. The variable is no longer unused

In the long term, I think what I'd want is a diagnostic like clippy where the primary diagnostic range is the use:

warning: used underscore-prefixed binding                                                                                                                                                                                                                                                                                                                                          
  --> crates\ruff_python_semantic\src\analyze\visibility.rs:38:5
   |
38 |     _decorator_list
   |     ^^^^^^^^^^^^^^^
   |
note: binding is defined here
  --> crates\ruff_python_semantic\src\analyze\visibility.rs:37:20
   |
37 | pub fn is_override(_decorator_list: &[Decorator], semantic: &SemanticModel) -> bool {
   |                    ^^^^^^^^^^^^^^^
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#used_underscore_binding
   = note: `-W clippy::used-underscore-binding` implied by `-W clippy::pedantic`
   = help: to override `-W clippy::pedantic` add `#[allow(clippy::used_underscore_binding)]`

warning: `ruff_python_semantic` (lib) generated 1 warning        

and the secondary diagnostic range is where the variable is declared.

The fact that the primary range is the use makes it much easier to review the violation because you can start with the use and editors provide the necessary tools to rename the variable from the use. You can also jump to the declaration if you want. The opposite is also true but finding all references tends to be slower and requires a few more clicks (or the use of a keyboard shortcut).

If the user uses a dummy variable 5 times, I think this implementation currently means that we'd issue 5 diagnostics regarding the dummy variable. But arguably there's only one mistake here, which is the name of the variable at the point where it's bound.

This can be avoided by only emitting a diagnostic for the first use of the binding.

Copy link
Member

@AlexWaygood AlexWaygood Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that the primary range is the use makes it much easier to review the violation because you can start with the use and editors provide the necessary tools to rename the variable from the use. You can also jump to the declaration if you want. The opposite is also true but finding all references tends to be slower and requires a few more clicks (or the use of a keyboard shortcut).

hmm, maybe this difference in perspective comes from the fact that I don't usually use a heavyweight IDE for writing Python: I usually use a lightweight text editor and run Ruff from the command line.

However, I still disagree... the vast majority of the hits we see in the ecosystem aren't due to people using the wrong variable accidentally, or due to a variable no longer being unused. Most hits are due to people naming the variable incorrectly because they're either not aware of this convention, don't care about it, or don't care about it enough to enforce it in their codebase without an automated tool. I think it'll be very odd for those people to have Ruff pointing to the first uses of the problematic variables when they upgrade to the latest version of Ruff, run Ruff (probably from the command line, I would guess, since they'll want to review all the modifications Ruff is making to their code in one go), and see many errors relating to this rule when the real problem is the way the variable is named. I also think it's usually pretty trivial if you are using a heavyweight editor to be able to right click on the variable and scroll through all references to it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel too strongly. So happy to go with your version if you do feel strongly. But two additional considerations:

  • The rule is named unused-variable-accessed. We should rename it if we report it at the declaration side because it doesn't report the use (either way, we should rename it to use)
  • In the long term, showing the first use and the declaration seems easier than showing the declaration and all usages.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But two additional considerations:

* The rule is named `unused-variable-accessed`. We should rename it if we report it at the declaration side because it doesn't report the _use_ (either way, we should rename it to _use_)

* In the long term, showing the first use and the declaration seems easier than showing the declaration and **all** usages.

I agree with both of these. So I think the immediate action point here is to rename the rule (and the file the rule is in). Maybe variable-name-implies-unused? @Lokejoke, could you do that renaming?

Copy link
Member

@MichaReiser MichaReiser Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about allow used-dummy-variable

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'll do that

@AlexWaygood
Copy link
Member

@Lokejoke I wanted to apply some very minor last touchups and then merge, but GitHub won't let me push to your branch (not sure why). The final changes I wanted to apply were these, if you'd be able to apply them. Mostly they're cosmetic, but I switched the use of scope.has() to !semantic.is_available() in two places. This makes the autofix more cautious, as it also checks for the new proposed name of the variable in enclosing scopes as well as just in the local scope:

--- a/crates/ruff_linter/resources/test/fixtures/ruff/RUF052.py
+++ b/crates/ruff_linter/resources/test/fixtures/ruff/RUF052.py
@@ -88,19 +88,39 @@ x = "global"
 
 def fun():
     global x
-    _x = "shadows global" # [RUF052]
+    _x = "shadows global" # [RUF052] (but unfixable)
     return _x
 
 def foo():
   x = "outer"
   def bar():
     nonlocal x
-    _x = "shadows nonlocal" # [RUF052]
+    _x = "shadows nonlocal" # [RUF052] (but unfixable)
     return _x
   bar()
   return x
 
 def fun():
     x = "local"
-    _x = "shadows local" # [RUF052]
+    _x = "shadows local" # [RUF052] (but unfixable)
+    return _x
+
+def fun2():
+    x = "local"
+    x_ = "also local"
+    _x = "shadows local"  # [RUF052] (but unfixable)
     return _x
+
+def fun3():
+    global x_
+    x = "local"
+    _x = "shadows local"  # [RUF052] (but unfixable)
+    return _x
+
+def fun4():
+    x_ = 42
+    def fun5():
+        x = "local"
+        _x = "shadows local"  # RUF052 (but unfixable)
+        print(x_)
+        return _x
diff --git a/crates/ruff_linter/src/rules/ruff/rules/dummy_variable_accessed.rs b/crates/ruff_linter/src/rules/ruff/rules/dummy_variable_accessed.rs
index ca940bde3..e4a8dd28f 100644
--- a/crates/ruff_linter/src/rules/ruff/rules/dummy_variable_accessed.rs
+++ b/crates/ruff_linter/src/rules/ruff/rules/dummy_variable_accessed.rs
@@ -1,7 +1,7 @@
-use ruff_diagnostics::{Applicability, Diagnostic, Fix, FixAvailability, Violation};
+use ruff_diagnostics::{Diagnostic, Fix, FixAvailability, Violation};
 use ruff_macros::{derive_message_formats, ViolationMetadata};
 use ruff_python_ast::helpers::is_dunder;
-use ruff_python_semantic::{Binding, BindingKind, Scope};
+use ruff_python_semantic::{Binding, BindingKind, SemanticModel};
 use ruff_python_stdlib::{
     builtins::is_python_builtin, identifiers::is_identifier, keyword::is_keyword,
 };
@@ -21,6 +21,9 @@ use crate::{checkers::ast::Checker, renamer::Renamer};
 /// the code's intention. A variable marked as "unused" being subsequently used suggests oversight or unintentional use.
 /// This detracts from the clarity and maintainability of the codebase.
 ///
+/// Sometimes leading underscores are used to avoid variables shadowing other variables, Python builtins, or Python
+/// keywords. However, [PEP 8] recommends to use trailing underscores for this rather than leading underscores.
+///
 /// ## Example
 /// ```python
 /// def function():
@@ -40,6 +43,8 @@ use crate::{checkers::ast::Checker, renamer::Renamer};
 ///
 /// ## Options
 /// - [`lint.dummy-variable-rgx`]
+///
+/// [PEP 8]: https://peps.python.org/pep-0008/
 #[derive(ViolationMetadata)]
 pub(crate) struct DummyVariableAccessed {
     name: String,
@@ -96,8 +101,11 @@ pub(crate) fn dummy_variable_accessed(checker: &Checker, binding: &Binding) -> O
     if binding.is_global() || binding.is_nonlocal() {
         return None;
     }
+
+    let semantic = checker.semantic();
+
     // Only variables defined in function scopes
-    let scope = &checker.semantic().scopes[binding.scope];
+    let scope = &semantic.scopes[binding.scope];
     if !scope.kind.is_function() {
         return None;
     }
@@ -105,7 +113,7 @@ pub(crate) fn dummy_variable_accessed(checker: &Checker, binding: &Binding) -> O
         return None;
     }
 
-    let possible_fix_kind = get_possible_fix_kind(name, scope, checker);
+    let possible_fix_kind = get_possible_fix_kind(name, checker);
 
     let mut diagnostic = Diagnostic::new(
         DummyVariableAccessed {
@@ -118,12 +126,10 @@ pub(crate) fn dummy_variable_accessed(checker: &Checker, binding: &Binding) -> O
     // If fix available
     if let Some(fix_kind) = possible_fix_kind {
         // Get the possible fix based on the scope
-        if let Some(fix) = get_possible_fix(name, fix_kind, scope) {
+        if let Some(fix) = get_possible_fix(name, fix_kind, semantic) {
             diagnostic.try_set_fix(|| {
-                let (edit, rest) =
-                    Renamer::rename(name, &fix, scope, checker.semantic(), checker.stylist())?;
-                let applicability = Applicability::Safe;
-                Ok(Fix::applicable_edits(edit, rest, applicability))
+                Renamer::rename(name, &fix, scope, semantic, checker.stylist())
+                    .map(|(edit, rest)| Fix::safe_edits(edit, rest))
             });
         }
     }
@@ -145,7 +151,7 @@ enum ShadowedKind {
 }
 
 /// Suggests a potential alternative name to resolve a shadowing conflict.
-fn get_possible_fix(name: &str, kind: ShadowedKind, scope: &Scope) -> Option<String> {
+fn get_possible_fix(name: &str, kind: ShadowedKind, semantic: &SemanticModel) -> Option<String> {
     // Remove leading underscores for processing
     let trimmed_name = name.trim_start_matches('_');
 
@@ -157,8 +163,8 @@ fn get_possible_fix(name: &str, kind: ShadowedKind, scope: &Scope) -> Option<Str
         ShadowedKind::None => trimmed_name.to_string(),
     };
 
-    // Ensure the fix name is not already taken in the scope
-    if scope.has(&fix_name) {
+    // Ensure the fix name is not already taken in the scope or enclosing scopes
+    if !semantic.is_available(&fix_name) {
         return None;
     }
 
@@ -167,7 +173,7 @@ fn get_possible_fix(name: &str, kind: ShadowedKind, scope: &Scope) -> Option<Str
 }
 
 /// Determines the kind of shadowing or conflict for a given variable name.
-fn get_possible_fix_kind(name: &str, scope: &Scope, checker: &Checker) -> Option<ShadowedKind> {
+fn get_possible_fix_kind(name: &str, checker: &Checker) -> Option<ShadowedKind> {
     // If the name starts with an underscore, we don't consider it
     if !name.starts_with('_') {
         return None;
@@ -189,7 +195,7 @@ fn get_possible_fix_kind(name: &str, scope: &Scope, checker: &Checker) -> Option
         return Some(ShadowedKind::BuiltIn);
     }
 
-    if scope.has(trimmed_name) {
+    if !checker.semantic().is_available(trimmed_name) {
         return Some(ShadowedKind::Some);
     }

If you'd be able to make these final changes, I'm happy to merge now :-)

(Making these changes will also need some updates to the snapshots, since it makes the autofix more cautious -- some things in the fixtures which were previously deemed fixable are now no longer fixable.)

@Lokejoke
Copy link
Contributor Author

Does the semantic.is_available(&fix_name) do what we want, as we are no longer in the traversal process? Isn't the scope used - the last scope in AST?

@AlexWaygood
Copy link
Member

AlexWaygood commented Nov 30, 2024

Does the semantic.is_available(&fix_name) do what we want, as we are no longer in the traversal process? Isn't the scope used - the last scope in AST?

Oh, great catch! I've stepped on that footgun before, as well.

We can fix that by making some small tweaks to the semantic model. You can add the lookup_in_scope function that I propose adding in #13076, if you look at the changes I'm making there in crates/ruff_python_semantic/src/model.rs. (The PR is unfinished, but I thikn the changes I'm making to model.rs are solid.)

@AlexWaygood
Copy link
Member

Does the semantic.is_available(&fix_name) do what we want, as we are no longer in the traversal process? Isn't the scope used - the last scope in AST?

Oh, great catch! I've stepped on that footgun before, as well.

We can fix that by making some small tweaks to the semantic model. You can add the lookup_in_scope function that I propose adding in #13076, if you look at the changes I'm making there in crates/ruff_python_semantic/src/model.rs. (The PR is unfinished, but I thikn the changes I'm making to model.rs are solid.)

we'd then need to create our own version of is_available() in this rule that uses similar logic to SemanticModel::is_available(), but uses SemanticModel::lookup_in_scope() rather than SemanticModel::lookup()

@Lokejoke
Copy link
Contributor Author

Sounds fantastic! Though, I'll leave it for tomorrow.

@AlexWaygood
Copy link
Member

I think something like this should work. It compiles, but I haven't tested it:

diff --git a/crates/ruff_linter/src/rules/ruff/rules/dummy_variable_accessed.rs b/crates/ruff_linter/src/rules/ruff/rules/dummy_variable_accessed.rs
index e4a8dd28f..d90cf4ce5 100644
--- a/crates/ruff_linter/src/rules/ruff/rules/dummy_variable_accessed.rs
+++ b/crates/ruff_linter/src/rules/ruff/rules/dummy_variable_accessed.rs
@@ -1,7 +1,7 @@
 use ruff_diagnostics::{Diagnostic, Fix, FixAvailability, Violation};
 use ruff_macros::{derive_message_formats, ViolationMetadata};
 use ruff_python_ast::helpers::is_dunder;
-use ruff_python_semantic::{Binding, BindingKind, SemanticModel};
+use ruff_python_semantic::{Binding, BindingKind, ScopeId, SemanticModel};
 use ruff_python_stdlib::{
     builtins::is_python_builtin, identifiers::is_identifier, keyword::is_keyword,
 };
@@ -113,7 +113,7 @@ pub(crate) fn dummy_variable_accessed(checker: &Checker, binding: &Binding) -> O
         return None;
     }
 
-    let possible_fix_kind = get_possible_fix_kind(name, checker);
+    let possible_fix_kind = get_possible_fix_kind(name, checker, binding.scope);
 
     let mut diagnostic = Diagnostic::new(
         DummyVariableAccessed {
@@ -126,7 +126,7 @@ pub(crate) fn dummy_variable_accessed(checker: &Checker, binding: &Binding) -> O
     // If fix available
     if let Some(fix_kind) = possible_fix_kind {
         // Get the possible fix based on the scope
-        if let Some(fix) = get_possible_fix(name, fix_kind, semantic) {
+        if let Some(fix) = get_possible_fix(name, fix_kind, binding.scope, semantic) {
             diagnostic.try_set_fix(|| {
                 Renamer::rename(name, &fix, scope, semantic, checker.stylist())
                     .map(|(edit, rest)| Fix::safe_edits(edit, rest))
@@ -151,7 +151,12 @@ enum ShadowedKind {
 }
 
 /// Suggests a potential alternative name to resolve a shadowing conflict.
-fn get_possible_fix(name: &str, kind: ShadowedKind, semantic: &SemanticModel) -> Option<String> {
+fn get_possible_fix(
+    name: &str,
+    kind: ShadowedKind,
+    scope_id: ScopeId,
+    semantic: &SemanticModel,
+) -> Option<String> {
     // Remove leading underscores for processing
     let trimmed_name = name.trim_start_matches('_');
 
@@ -164,7 +169,7 @@ fn get_possible_fix(name: &str, kind: ShadowedKind, semantic: &SemanticModel) ->
     };
 
     // Ensure the fix name is not already taken in the scope or enclosing scopes
-    if !semantic.is_available(&fix_name) {
+    if !semantic.is_available_in_scope(&fix_name, scope_id) {
         return None;
     }
 
@@ -173,7 +178,7 @@ fn get_possible_fix(name: &str, kind: ShadowedKind, semantic: &SemanticModel) ->
 }
 
 /// Determines the kind of shadowing or conflict for a given variable name.
-fn get_possible_fix_kind(name: &str, checker: &Checker) -> Option<ShadowedKind> {
+fn get_possible_fix_kind(name: &str, checker: &Checker, scope_id: ScopeId) -> Option<ShadowedKind> {
     // If the name starts with an underscore, we don't consider it
     if !name.starts_with('_') {
         return None;
@@ -195,7 +200,10 @@ fn get_possible_fix_kind(name: &str, checker: &Checker) -> Option<ShadowedKind>
         return Some(ShadowedKind::BuiltIn);
     }
 
-    if !checker.semantic().is_available(trimmed_name) {
+    if !checker
+        .semantic()
+        .is_available_in_scope(trimmed_name, scope_id)
+    {
         return Some(ShadowedKind::Some);
     }
 
diff --git a/crates/ruff_python_semantic/src/model.rs b/crates/ruff_python_semantic/src/model.rs
index 9be76dc20..ff3a1a10a 100644
--- a/crates/ruff_python_semantic/src/model.rs
+++ b/crates/ruff_python_semantic/src/model.rs
@@ -325,9 +325,15 @@ impl<'a> SemanticModel<'a> {
     }
 
     /// Return `true` if `member` is an "available" symbol, i.e., a symbol that has not been bound
-    /// in the current scope, or in any containing scope.
+    /// in the current scope currently being visited, or in any containing scope.
     pub fn is_available(&self, member: &str) -> bool {
-        self.lookup_symbol(member)
+        self.is_available_in_scope(member, self.scope_id)
+    }
+
+    /// Return `true` if `member` is an "available" symbol in a given scope, i.e.,
+    /// a symbol that has not been bound in that current scope, or in any containing scope.
+    pub fn is_available_in_scope(&self, member: &str, scope_id: ScopeId) -> bool {
+        self.lookup_symbol_in_scope(member, scope_id, false)
             .map(|binding_id| &self.bindings[binding_id])
             .map_or(true, |binding| binding.kind.is_builtin())
     }
@@ -620,10 +626,22 @@ impl<'a> SemanticModel<'a> {
         }
     }
 
-    /// Lookup a symbol in the current scope. This is a carbon copy of [`Self::resolve_load`], but
-    /// doesn't add any read references to the resolved symbol.
+    /// Lookup a symbol in the current scope.
     pub fn lookup_symbol(&self, symbol: &str) -> Option<BindingId> {
-        if self.in_forward_reference() {
+        self.lookup_symbol_in_scope(symbol, self.scope_id, self.in_forward_reference())
+    }
+
+    /// Lookup a symbol in a certain scope
+    ///
+    /// This is a carbon copy of [`Self::resolve_load`], but
+    /// doesn't add any read references to the resolved symbol.
+    pub fn lookup_symbol_in_scope(
+        &self,
+        symbol: &str,
+        scope_id: ScopeId,
+        in_forward_reference: bool,
+    ) -> Option<BindingId> {
+        if in_forward_reference {
             if let Some(binding_id) = self.scopes.global().get(symbol) {
                 if !self.bindings[binding_id].is_unbound() {
                     return Some(binding_id);
@@ -633,7 +651,7 @@ impl<'a> SemanticModel<'a> {
 
         let mut seen_function = false;
         let mut class_variables_visible = true;
-        for (index, scope_id) in self.scopes.ancestor_ids(self.scope_id).enumerate() {
+        for (index, scope_id) in self.scopes.ancestor_ids(scope_id).enumerate() {
             let scope = &self.scopes[scope_id];
             if scope.kind.is_class() {
                 if seen_function && matches!(symbol, "__class__") {

@Lokejoke
Copy link
Contributor Author

Lokejoke commented Dec 1, 2024

Thank you very much @AlexWaygood! I have applied all of your proposed changes, which seemed very intuitive.

@MichaReiser MichaReiser added the preview Related to preview mode features label Dec 2, 2024
@MichaReiser
Copy link
Member

This is great :)

@MichaReiser
Copy link
Member

Considering that the rule is named unused-variable-accessed, should the rule instead use the range of the first access instead of the declaration of the symbol?

I think I'd prefer that because the violation isn't the assignment to the variable, but that I'm using a variable that's marked as intentionally unused. That would also make reviewing the ecosystem changes much easier, because I wouldn't have to scan for the use :)

@AlexWaygood
Copy link
Member

AlexWaygood commented Dec 2, 2024

Considering that the rule is named unused-variable-accessed, should the rule instead use the range of the first access instead of the declaration of the symbol?

I think I'd prefer that because the violation isn't the assignment to the variable, but that I'm using a variable that's marked as intentionally unused. That would also make reviewing the ecosystem changes much easier, because I wouldn't have to scan for the use :)

See my earlier comment at #14611 (comment) @MichaReiser. I think the thing we're recommending to the user that they should change (and the thing that we do change in the autofix) is the naming of the variable: we don't try to remove uses of the variable! So I much prefer it as it is currently, with the diagnostic pointing to the binding rather than the use of the binding.

@Lokejoke Lokejoke changed the title [ruff] Implemented unused-variable-accessed (RUF052) [ruff] Implemented used-dummy-variable (RUF052) Dec 2, 2024
@MichaReiser
Copy link
Member

Thanks for seeing this through!

@MichaReiser MichaReiser merged commit bf0fd04 into astral-sh:main Dec 3, 2024
21 checks passed
@MichaReiser
Copy link
Member

@AlexWaygood I know that you considered making some documentation changes to the rule but I thought I'd merge it and we can make those changes as separate PRs (this also solves the problem that you can't push to this branch 😆)

@AlexWaygood
Copy link
Member

Congrats @Lokejoke — there were a lot of unexpected complications to this one! This is great

dcreager added a commit that referenced this pull request Dec 3, 2024
* main:
  [`ruff`] Extend unnecessary-regular-expression to non-literal strings (`RUF055`) (#14679)
  Minor followups to RUF052 (#14755)
  [red-knot] Property tests (#14178)
  [red-knot] `is_subtype_of` fix for `KnownInstance` types (#14750)
  Improve docs for flake8-use-pathlib rules (#14741)
  [`ruff`] Implemented `used-dummy-variable` (`RUF052`) (#14611)
  [red-knot] Simplify tuples containing `Never` (#14744)
  Possible fix for flaky file watching test (#14543)
  [`flake8-import-conventions`] Improve syntax check for aliases supplied in configuration for `unconventional-import-alias (ICN001)` (#14745)
  [red-knot] Deeper understanding of `LiteralString` (#14649)
  red-knot: support narrowing for bool(E) (#14668)
  [`refurb`] Handle non-finite decimals in `verbose-decimal-constructor (FURB157)` (#14596)
  [red-knot] Re-enable linter corpus tests (#14736)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
preview Related to preview mode features rule Implementing or modifying a lint rule
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants