Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] Anchor relative paths in configurations #15634

Merged
merged 3 commits into from
Jan 23, 2025

Conversation

MichaReiser
Copy link
Member

@MichaReiser MichaReiser commented Jan 21, 2025

Summary

This PR adds support for relative paths in configurations and the CLI.

The challenge with relative paths is that they are relative to a different root based on where the configuration value comes from:

  • CLI: Paths are relative to the current working directory
  • Configuration: Paths are relative to the project's root

While Red Knot already supported relative paths from the CLI, it incorrecty
resolved relative paths in configuration files as relative to the current working directory.
This PR fixes this by implementing path anchoring. It introduces a new RelativePath type that we
resolve to an absolute path when going from "raw" Options to a *Settings struct.

A RelativePath is a combination of the path and a source indicating the origin of the path. This
tuple allows resolving the absolute path when given the project_root and system.

I plan to generalize the value with a source in a follow up PR where we also start
tracking the source span for every value to enable rich diagnostics for configuration errors.

Part of #15491

Alternatives

An alternative to using the thread_local for passing the ValueSource would be
to instead have a Options::apply_source or similar method that changes the source
for all values recursively. I decided against it because it would require a fair
bit of boilerplate (or a macro).

Test Plan

Added integration tests.

@MichaReiser MichaReiser added the red-knot Multi-file analysis & type inference label Jan 21, 2025
@MichaReiser MichaReiser marked this pull request as ready for review January 21, 2025 11:06
Copy link
Contributor

github-actions bot commented Jan 21, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Base automatically changed from micha/options-venv-path to main January 21, 2025 14:10
@sharkdp
Copy link
Contributor

sharkdp commented Jan 22, 2025

I did a coarse-grained review, but would like to understand the problem better before going into more detail.

Let me try to rephrase what you wrote just to see if I understand this correctly: The core problem is with options like extra-search-path which represents a list of paths. We could have one path in the list coming from a configuration file (anchored at the directory where the config file resides) and another path coming from the CLI (anchored at the CWD).

The approach you chose here is to attach that anchor to each and every path in the options struct, which leads to this slightly awkward solution with the thread-local because we create the options struct directly using serde.

Another approach might be to store a global anchor next to the options struct that we get from a particular source. For the scenario above, we would then have something like (/path/to/project/root, <options read from config file>) and (/path/to/cwd, <options read from CLI>) where each of the deserialized options structs would still contain bare relative paths. The obvious problem with this approach is that we can't Combine those, because for every path in the combined options struct, we wouldn't know what its original anchor was. But what if we turned those Options into Settings (with absolute paths) first, and then used Combine on the Settings? Would that be possible?

(Sorry if this is basically what you meant with the alternative solution in the PR description)

(Edit: the problem is even present for a config option with just a single path, not just with lists of paths)

@MichaReiser
Copy link
Member Author

Your summary is accurate and yes, the problem is that options get combined from multiple sources so that arrays end up with values relative to different roots.

What you describe is definetely possible and is roughly what Ruff does, except that it requires one extra representatios:

  • Options: The type we deserialize into. Matches the configuratino structure exactly
  • Configuration: The Options are converted into a Configuration. Multiple Configurations can be combined together.
  • Settings: The "resolved" values. That includes filling in default values or using representation optimized for reading (e.g. a configuration listing allowed-types uses a Vec<String> in Options but a Vec<Type> in `Settings).

This separation has proven to be somewhat painful in Ruff because there's a lot of redundancy: Each structure roughly defines the same fields, only sometimes with slightly different types.

There's a second problem that I only touched on in the summary: We want to validate the user configuration. Some of those validations are local to a single value and could be done when going from Options to Configuration, for example warning about unknown rules. However, other settings require a global view on all provided configuration values and can only be done once all configurations were merged. An example from Ruff is that we want to warn about conflicting rules. It isn't sufficient to only look at a single configuration before combining it because the user might disable one of the conflicting rules using the CLI.

Copy link
Contributor

@sharkdp sharkdp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you — looks great!

I don't love the thread-local solution, but your reasoning makes sense — thank you for writing it down!

@MichaReiser
Copy link
Member Author

I don't love the thread-local solution, but your reasoning makes sense — thank you for writing it down!

Neither do I :( It's a shame that there's no other solution for passing down state to serde.

We'll face a similar problem when implementing the JSON schema for rules because the schema also doesn't allow passing down any additional data but we'll need access to the rule registry

@dcreager
Copy link
Member

Neither do I :( It's a shame that there's no other solution for passing down state to serde.

Is there a way to do it using DeserializeSeed?

@MichaReiser
Copy link
Member Author

Is there a way to do it using DeserializeSeed?

Possibly. I didn't look too much into it, but it seems that you'd have to write your own Deserializer and recognize whenever a RelativePath gets deserialized but maybe this isn't too hard? I can explore that if we want as a follow up PR.

@dcreager
Copy link
Member

Possibly. I didn't look too much into it, but it seems that you'd have to write your own Deserializer and recognize whenever a RelativePath gets deserialized but maybe this isn't too hard? I can explore that if we want as a follow up PR.

If it's a pain, and the thread-local version is working, I'd say it's not worth it.

@MichaReiser MichaReiser merged commit 39e2df7 into main Jan 23, 2025
21 checks passed
@MichaReiser MichaReiser deleted the micha/path-anchoring branch January 23, 2025 09:14
dcreager added a commit that referenced this pull request Jan 24, 2025
* main:
  [red-knot] MDTests: Do not depend on precise public-symbol type inference (#15691)
  [red-knot] Make `infer.rs` unit tests independent of public symbol inference (#15690)
  Tidy knot CLI tests (#15685)
  [red-knot] Port comprehension tests to Markdown (#15688)
  Create Unknown rule diagnostics with a source range (#15648)
  [red-knot] Port 'deferred annotations' unit tests to Markdown (#15686)
  [red-knot] Support custom typeshed Markdown tests (#15683)
  Don't run the linter ecosystem check on PRs that only touch red-knot crates (#15687)
  Add `rules` table to configuration (#15645)
  [red-knot] Make `Diagnostic::file` optional (#15640)
  [red-knot] Add test for nested attribute access (#15684)
  [red-knot] Anchor relative paths in configurations (#15634)
  [`pyupgrade`] Handle multiple base classes for PEP 695 generics (`UP046`) (#15659)
  [`pyflakes`] Treat arguments passed to the `default=` parameter of `TypeVar` as type expressions (`F821`) (#15679)
  Upgrade zizmor to the latest version in CI (#15649)
  [`pyupgrade`] Add rules to use PEP 695 generics in classes and functions (`UP046`, `UP047`) (#15565)
  [red-knot] Ensure a gradual type can always be assigned to itself (#15675)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants