Merge extras in lockfile #5181

Merged 1 commit on Jul 18, 2024

konstin (Member) commented Jul 18, 2024

As a user, you specify a list of extras. Internally, we decompose this into one virtual package per extra. We currently leak this abstraction by writing one entry per extra to the lockfile:

[[distribution]]
name = "foo"
version = "4.39.0.dev0"
source = { editable = "." }
dependencies = [
    { name = "pandas" },
    { name = "pandas", extra = "excel" },
    { name = "pandas", extra = "hdf5" },
    { name = "pandas", extra = "html", marker = "os_name != 'posix'" },
    { name = "pandas", extra = "output-formatting", marker = "os_name == 'posix'" },
    { name = "pandas", extra = "plot", marker = "os_name == 'posix'" },
]

Instead, we should merge the extras into a list of extras, creating a more concise lockfile:

[[distribution]]
name = "foo"
version = "4.39.0.dev0"
source = { editable = "." }
dependencies = [
    { name = "pandas", extra = ["excel", "hdf5"] },
    { name = "pandas", extra = ["html"], marker = "os_name != 'posix'" },
    { name = "pandas", extra = ["output-formatting", "plot"], marker = "os_name == 'posix'" },
]

The base package is now implicitly included, as it is in PEP 508.
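
To illustrate the grouping described above, here is a minimal sketch, not the code from this PR. It assumes that entries are merged when they share the same package name and the same marker. The `FlatDep` and `MergedDep` types and the `merge_extras` function are hypothetical, and markers are modeled as plain strings for simplicity.

```rust
use std::collections::BTreeMap;

/// One entry in the "before" lockfile: at most one extra per entry.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FlatDep {
    pub name: String,
    pub extra: Option<String>,
    pub marker: Option<String>,
}

/// One entry in the "after" lockfile: a sorted list of extras.
/// The base package is implied by the entry itself.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MergedDep {
    pub name: String,
    pub extras: Vec<String>,
    pub marker: Option<String>,
}

/// Group entries by (name, marker) and collect their extras into one list.
pub fn merge_extras(deps: &[FlatDep]) -> Vec<MergedDep> {
    let mut groups: BTreeMap<(String, Option<String>), Vec<String>> = BTreeMap::new();
    for dep in deps {
        let extras = groups
            .entry((dep.name.clone(), dep.marker.clone()))
            .or_default();
        // An entry without an extra only contributes the (implicit) base package.
        if let Some(extra) = &dep.extra {
            if !extras.contains(extra) {
                extras.push(extra.clone());
            }
        }
    }
    groups
        .into_iter()
        .map(|((name, marker), mut extras)| {
            extras.sort();
            MergedDep { name, extras, marker }
        })
        .collect()
}
```

Applied to the six `pandas` entries from the first listing, this sketch yields three merged entries, one per distinct marker, matching the shape of the second listing.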

Fixes #4888

konstin added the labels enhancement (New feature or request) and preview (Experimental behavior) on Jul 18, 2024
konstin requested a review from BurntSushi on July 18, 2024 10:48
let new_dep = Dependency::from_annotated_dist(annotated_dist, marker);
for existing_dep in &mut self.dependencies {
    if existing_dep.distribution_id == new_dep.distribution_id
        && existing_dep.marker == new_dep.marker
Member

This makes add_dependency quadratic, right?

Member

I can't figure out a clean way around this.

Member

This is only called from Lock::from_resolution_graph and it's unexported, so it should be fine to build up some intermediate state that replaces this loop with an O(1) or O(log n) check.
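
One possible shape for such intermediate state is sketched below. It is only an illustration under stated assumptions, not the code in this PR: `DepList`, `Dep`, and this `add_dependency` signature are hypothetical, and both the distribution id and the marker are modeled as plain strings. A `HashMap` from (distribution id, marker) to a position in the dependency vector replaces the linear scan with an expected O(1) lookup.

```rust
use std::collections::{BTreeSet, HashMap};

/// A merged dependency entry. (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Dep {
    pub distribution_id: String,
    pub marker: Option<String>,
    pub extras: BTreeSet<String>,
}

#[derive(Default)]
pub struct DepList {
    dependencies: Vec<Dep>,
    /// Intermediate state: (distribution id, marker) -> position in `dependencies`.
    index: HashMap<(String, Option<String>), usize>,
}

impl DepList {
    /// Add a dependency, merging the extra into an existing entry when one
    /// with the same distribution id and marker is already present.
    pub fn add_dependency(
        &mut self,
        distribution_id: &str,
        marker: Option<&str>,
        extra: Option<&str>,
    ) {
        let key = (distribution_id.to_string(), marker.map(str::to_string));
        // If the key is new, it gets the position of the entry pushed below.
        let next = self.dependencies.len();
        let position = *self.index.entry(key).or_insert(next);
        if position == next {
            self.dependencies.push(Dep {
                distribution_id: distribution_id.to_string(),
                marker: marker.map(str::to_string),
                extras: BTreeSet::new(),
            });
        }
        if let Some(extra) = extra {
            // O(log n) in the number of extras of this entry; duplicates are ignored.
            self.dependencies[position].extras.insert(extra.to_string());
        }
    }

    /// The merged entries, in first-insertion order.
    pub fn dependencies(&self) -> &[Dep] {
        &self.dependencies
    }
}
```

The index would only be needed while building the lock and could be dropped afterwards, which fits the observation that the function is called from a single place.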

Member

Totally agree. I briefly tried but had some trouble making it work.

Member Author

For a large resolution (transformers with all extras), this is the distribution of the number of loop iterations:

[image: distribution of the number of loop iterations]

We run the code inside the loop 2762 times.
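
A measurement like this could be collected with a small counter around the scan. The sketch below is hypothetical and not necessarily how the numbers above were produced: `ScanStats` and `add_with_stats` are made-up names, entries are modeled as plain strings, and the scan is assumed to stop at the first match.

```rust
use std::collections::BTreeMap;

/// Statistics about linear scans. (Hypothetical helper, for illustration only.)
#[derive(Default)]
pub struct ScanStats {
    /// Iterations in one scan -> number of scans with that many iterations.
    pub histogram: BTreeMap<usize, usize>,
    /// Total number of times the loop body ran across all scans.
    pub total_iterations: usize,
}

/// Insert `key` into `items` unless it is already present, using a linear
/// scan and recording how many entries the scan visited.
pub fn add_with_stats(items: &mut Vec<String>, key: &str, stats: &mut ScanStats) {
    let mut iterations = 0;
    let mut found = false;
    for existing in items.iter() {
        iterations += 1;
        if existing == key {
            found = true;
            break;
        }
    }
    if !found {
        items.push(key.to_string());
    }
    *stats.histogram.entry(iterations).or_insert(0) += 1;
    stats.total_iterations += iterations;
}
```

If the histogram is concentrated on small counts, as the comment above suggests for this workload, the quadratic worst case matters little in practice.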

charliermarsh merged commit 5bcdaed into main on Jul 18, 2024 (52 checks passed)
charliermarsh deleted the konsti/merge-extras-in-lockfile branch on July 18, 2024 18:07
Linked issues that merging this pull request may close: Remove extra duplication from lockfile
3 participants