Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unpack metadata-only sdists after downloading them with the BatchDownloader #12197

Merged
merged 6 commits into from
Aug 23, 2023

Conversation

cosmicexplorer
Copy link
Contributor

@cosmicexplorer cosmicexplorer commented Aug 2, 2023

Fixes #11847.

Problem

Following up on @pfmoore's comment at 7ad477e#r1280931823: the reason sdists are in an inconsistent state when we have metadata is because we don't unpack the sdist after downloading it, the way we do with unpack_url() all in one go in the normal code path (see https://github.com/cosmicexplorer/pip/blob/e860f2d0534d2ac6958d6ce34d134c0882a030c3/src/pip/_internal/operations/prepare.py#L593-L602).

Solution

  • Introduce the .needs_unpacked_archive() and .ensure_pristine_source_checkout() methods to InstallRequirement.
  • When we download an sdist via _complete_partial_requirements() with the BatchDownloader, make sure to unpack it after preparing the source directory in ._prepare_linked_requirement().
  • Add a test case test_download_metadata_server() that verifies we only ever download any dist exactly once when metadata is available.

tests/conftest.py Outdated Show resolved Hide resolved
@cosmicexplorer cosmicexplorer force-pushed the fix-metadata-sdist-download branch 2 times, most recently from 2294b81 to cc9e3bd Compare August 3, 2023 10:23
@cosmicexplorer cosmicexplorer changed the title fix #11847 for sdists unpack metadata-only sdists after downloading them with the BatchDownloader Aug 3, 2023
Copy link
Member

@uranusjr uranusjr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change combining with the added comments makes sense to me. I assume some of the weird structure (for example the added if not req.is_wheel block kind of duplicates code) would be cleaned up later.

src/pip/_internal/operations/prepare.py Outdated Show resolved Hide resolved
src/pip/_internal/operations/prepare.py Outdated Show resolved Hide resolved
@pfmoore
Copy link
Member

pfmoore commented Aug 10, 2023

I assume some of the weird structure (for example the added if not req.is_wheel block kind of duplicates code) would be cleaned up later.

By "would be cleaned up later" do you mean "in later commits to this PR" or "in a future PR"? Because I'm wary of the latter - we have enough hard-to-understand legacy code in pip without adding more, so I'd rather not merge this without those fixups being already completed (at least in a follow-up PR that's ready to merge).

Disclaimer: I haven't reviewed this PR (or any of @cosmicexplorer's current work) yet, as it all seems to be changing too fast at the moment for me to feel it's worth trying to keep up.

@cosmicexplorer
Copy link
Contributor Author

Oh, I'm sorry @pfmoore. I have been using the draft status for #12186 to indicate that it's not ready yet, but I did intend the undrafted ones to remain pretty stable. I'm not in a hurry to get this merged but this change in particular I think is quite orthogonal to the rest and specifically resolves the earlier issue you identified.

@cosmicexplorer
Copy link
Contributor Author

This one would also be helpful to get in earlier because it moves the test framework to generate a local pypi index into conftest.py, which is used by #12186 and others, but I'd love to make sure I'm not missing some existing test facility to do that.

@pfmoore
Copy link
Member

pfmoore commented Aug 10, 2023

It's not a problem, I mostly just monitor stuff via the emails that come through, and I haven't had time to keep close track of what's what between the various PRs. The emails don't make "this is a draft" particularly visible, which is also not ideal.

I'll see if I can find some time in the next few days to take a look at this, but don't wait on me doing so. As long as there's a definite action to keep things maintainable, that's my main concern.

@cosmicexplorer
Copy link
Contributor Author

Hmm, @uranusjr I also agree with you here:

(for example the added if not req.is_wheel block kind of duplicates code)

I can try to improve this.

@cosmicexplorer cosmicexplorer force-pushed the fix-metadata-sdist-download branch 6 times, most recently from 6347ea7 to 627c884 Compare August 12, 2023 10:51
@cosmicexplorer
Copy link
Contributor Author

Ok, this solution adds (yet another) attribute .archive_source to InstallRequirement, but it minimizes the changes in prepare.py, so I think it should be better than before.

@cosmicexplorer cosmicexplorer force-pushed the fix-metadata-sdist-download branch 2 times, most recently from 480cda8 to f7dd273 Compare August 12, 2023 14:01
Copy link
Member

@uranusjr uranusjr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@cosmicexplorer
Copy link
Contributor Author

Note that merging this would make it easier to review #12186 and #12208, as they make use of the metadata resolve test framework which was moved into conftest.py in this PR (but not substantially changed).

@cosmicexplorer
Copy link
Contributor Author

This change also conclusively solves an issue which blocked a prior release (and adds tests), without introducing any new behavior, so it should be easier to sign off on than those two. I'll fix up the OP so the commit message is useful.

@uranusjr
Copy link
Member

I’ll let this go in if no-one objects until the end of this weekend (anywhere on earth)

@uranusjr uranusjr merged commit fef9bd4 into pypa:main Aug 23, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 7, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pip downloads packages twice if the data-dist-info-metadata is set
4 participants