Consider using uv as an optional alternate resolver. #2371

Open
benjyw opened this issue Feb 16, 2024 · 23 comments

Comments

@benjyw
Collaborator

benjyw commented Feb 16, 2024

uv is a new resolver and installer written in Rust. It claims to be substantially faster than Pip at resolving (and installing, although that is less relevant to us). It also claims to expose a CLI interface compatible with pip, so it supposedly can be used as a drop-in replacement. It notably does not claim to generate the same resolves as Pip (it uses PubGrub as the underlying solver).

See https://astral.sh/blog/uv for more.

uv makes some weighty claims, but we know from experience that the Python packaging ecosystem is messy, with many edge cases, ad-hoc behaviors, sharp edges, and de-facto standards that are not codified anywhere. It's as yet unclear how truly functional, or pip-compatible, uv is in real-world cases. So uv still requires substantial vetting by the community.

This ticket is to track and discuss the idea of embedding uv in Pex as an optional alternate resolver. We can evaluate potential benefits and drawbacks as uv gets more real-world usage, and if we see practical performance gains, before committing any effort to this.

@jsirois
Member

jsirois commented Feb 16, 2024

Yeah, saw that. It won't be usable for --style universal locks until they tackle multiplatform (or we contribute it), which they call out as a TODO. Ditto --platform and --complete-platform PEXes. So the work here, without support for that stuff, would be injecting new special-case logic to fail fast in the execution paths that see those flags.
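A minimal sketch of what such a fail-fast check might look like, assuming a hypothetical opt-in resolver setting. The function, exception, and parameter names are illustrative only and are not part of Pex; the unsupported options are the ones named in this comment, as of the time it was written.

```python
from typing import Optional, Sequence


class UnsupportedResolverOption(ValueError):
    """Raised when the (hypothetical) uv resolver is combined with an option it cannot honor."""


def check_uv_compatible(
    resolver: str,
    lock_style: Optional[str] = None,
    platforms: Sequence[str] = (),
    complete_platforms: Sequence[str] = (),
) -> None:
    """Fail fast if an opt-in uv resolver is asked to do something it does not support."""
    if resolver != "uv":
        # The default Pip-based path keeps supporting every option.
        return

    problems = []
    if lock_style == "universal":
        problems.append("--style universal")
    if platforms:
        problems.append("--platform")
    if complete_platforms:
        problems.append("--complete-platform")

    if problems:
        raise UnsupportedResolverOption(
            "The uv resolver does not support: {}".format(", ".join(problems))
        )
```

Calling a check like this once, right after option parsing, would keep the special cases in one place instead of scattering them across each execution path.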

It's also nominally 3.8+. The existing PipVersion infra does handle the range of Pythons each Pip version works for, though, so that bump should be manageable.

Honestly though, if they achieve their vision and it remains Apache 2 / MIT, Pex should ~die and perhaps only live on as a package format uv produces. When I get a chance I always lobby Python / PyPA for 1 true tool, and that is what uv may become.

@jsirois
Member

jsirois commented Feb 16, 2024

Maybe by the time uv's rise to the one true tool comes to pass (which is insane: your language's tool written in another language because yours is too damn slow), Mojo will have killed off Python! Hah.

@benjyw
Collaborator Author

benjyw commented Feb 17, 2024

Maybe Python becomes an intermediate scripting language that GenAI transpiles to Rust.

@jsirois
Member

jsirois commented Feb 18, 2024

Lots of people are reacting to this. I'm going to assume that means folks agree Python / Pip / Pex are too slow right now and there is demand for the speed uv offers.

If so, and you're feeling lock resolve pain, this has been hanging there for a long time now:

That's probably fairly involved, but if anyone wants to dive in, that'd be great. I'm a bit async depending on climbing, but I'm happy to answer questions, review PRs, etc.

On the slow PEX build end, which is mainly for larger PEX zips, there is the relatively recent Pex support for --no-pre-install-wheels, which saves a lot of time for PEXes with large distributions like PyTorch (no unzip / re-zip when creating a PEX).

As to other pain points related to speed, I may have lost track and would certainly appreciate folks speaking up here. I could then organize and break out issues for any problems that seem solvable.

@jsirois
Member

jsirois commented Feb 19, 2024

One thing that struck me is most folks are probably ignorant of pip --use-{feature,deprecated} which Pex has used to provide a toggle between the --resolver-version pip-{legacy,2020}-resolver for many years now. In particular I'm betting most folks are unaware of --use-feature fast-deps which attempts to use range requests / exploit zip structure since Pip 20.2 (released July 28th 2020) which means in every version of Pip that Pex supports (i.e.: >=20.3.4). The feature has evolved and I'm not sure how well it works, but I've filed #2375 to at least expose a pass through to the Pip --use-feature option.

@jsirois
Member

jsirois commented Feb 22, 2024

OK, --use-feature can already be passed via a requirements file, and it turns out, via a saga documented in #2375, that fast-deps is slower for now anyhow.

@cosmicexplorer
Contributor

> One thing that struck me is most folks are probably ignorant of pip --use-{feature,deprecated} which Pex has used to provide a toggle between the --resolver-version pip-{legacy,2020}-resolver for many years now. In particular I'm betting most folks are unaware of --use-feature fast-deps which attempts to use range requests / exploit zip structure since Pip 20.2 (released July 28th 2020) which means in every version of Pip that Pex supports (i.e.: >=20.3.4). The feature has evolved and I'm not sure how well it works, but I've filed #2375 to at least expose a pass through to the Pip --use-feature option.

fast-deps will eventually not be a distinct option. After a lot of work from me and pip maintainers over the past two years, metadata has been successfully decoupled from actual downloads so that this can be an implementation detail (see e.g. pypa/pip#12256, or pypa/pip#12258 for even further caching behaviors). Also see pypa/pip#12208, which fixes the wheel metadata impl to be much faster (which I've successfully convinced other tools to steal and reuse, so that the poetry maintainer actually left a review on that pip PR with his own feedback). I expect all of this to get merged in the next few months; the underlying impl has been pretty stable for a while.

@cosmicexplorer
Contributor

As per #2210, pip install --report --dry-run --ignore-installed can be much faster than invoking pip directly due to several optimizations including the ability to avoid downloading any artifacts whatsoever (instead just recording the download URLs and checksums): see pypa/pip#12186 for an example. As @jsirois mentioned previously regarding this strategy as well as above, it's likely to be most immediately useful for incremental resolves as opposed to generating universal lockfiles (which pip's resolver logic would probably need to be modified to generate).
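As a rough illustration of this strategy, the sketch below runs pip in dry-run mode and pulls pins, download URLs, and checksums out of its JSON installation report. The helper names are made up for this example. The report fields used (`install`, `metadata`, `download_info`, `archive_info`) follow pip's documented installation report format, but treat the exact handling as an assumption to verify against the pip version in use.

```python
import json
import subprocess
import sys
from typing import Dict, Iterable, List


def run_pip_report(requirements: Iterable[str]) -> dict:
    """Resolve without installing anything and return pip's JSON installation report."""
    cmd = [
        sys.executable, "-m", "pip", "install",
        "--dry-run", "--ignore-installed",
        "--quiet",        # keep log output from mixing with the JSON on stdout
        "--report", "-",  # "-" writes the report to stdout
        *requirements,
    ]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return json.loads(result.stdout)


def pins_from_report(report: dict) -> List[Dict[str, str]]:
    """Extract name, version, URL and sha256 for each resolved distribution."""
    pins = []
    for item in report.get("install", []):
        metadata = item["metadata"]
        download = item["download_info"]
        archive = download.get("archive_info", {})
        # Newer reports carry a "hashes" mapping; older ones a "sha256=<hex>" string.
        sha256 = archive.get("hashes", {}).get("sha256")
        if sha256 is None:
            legacy = archive.get("hash", "")
            sha256 = legacy[len("sha256="):] if legacy.startswith("sha256=") else ""
        pins.append({
            "name": metadata["name"],
            "version": metadata["version"],
            "url": download["url"],
            "sha256": sha256,
        })
    return pins
```

For example, `pins_from_report(run_pip_report(["requests"]))` would list the resolved distributions for the current interpreter only, which is why this approach fits incremental resolves better than universal locks.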

@cosmicexplorer
Contributor

I'm also interested in adapting uv as an alternate resolver, although with the metadata caching improvements I have been waiting to get merged in pip I would be surprised if uv was able to perform a resolve significantly faster than pip install --report --dry-run --ignore-installed | jq even if it does use Rust. I have been speaking with some people working on uv about this exact performance question though and may be able to investigate this soon.

@brendan-morin

Chiming in: After seeing uv announced (and the enormous hype) it's clear the Astral team has quite effectively put a finger on the pulse of the python community and has created momentum and excitement in a way we don't often see.

It's too early to understand exactly where things will end up, but if uv executes on their vision, it has a legitimate chance to become ubiquitous as a third-party package management tool, in a similar way to how e.g. requests became the de facto standard Python HTTP interface.

In light of this, I think it would make sense for Pants to lean into the zeitgeist and embrace seamless uv integration as a core feature.

@benjyw
Collaborator Author

benjyw commented Mar 20, 2024

Thanks for the input @BrendanJM ! This ticket is specifically about Pex adopting uv, which would be one way Pants could do so, but not the only way.

I've just reopened pantsbuild/pants#20679 to discuss Pants using uv directly. Since your post references Pants, that might be a good place to cross-post it.

@brendan-morin

Doh… I followed this issue from the pants issue, didn’t realize I ended up in a different project 🫠

@astrojuanlu

FWIW, uv pip compile --universal is now a thing

@jsirois
Member

jsirois commented Jul 14, 2024

@astrojuanlu that's good to know, however not enough. uv pip compile just emits hashed version pins (with markers) - it does not also provide distribution metadata which Pex needs in its locks to support subsetting from a lock.

I want to be clear. If someone wants to contribute support they are free to try, but there are many features Pex needs to continue to support (it never breaks existing users), including lock subsetting, --platform, --complete-platform and I'm not sure what others may be challenging with a uv backend. If you can accept these development constraints that Pex imposes on itself and dive in and put the effort into learning all the corners - then thanks in advance.
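To make the subsetting point concrete, here is a toy sketch of why a lock needs dependency metadata and not just pins. The lock structure and the `subset_lock` helper are hypothetical simplifications, not Pex's actual lock format, and the sketch ignores environment markers and extras entirely.

```python
from collections import deque
from typing import Dict, Iterable, Set


def subset_lock(lock: Dict[str, dict], roots: Iterable[str]) -> Dict[str, dict]:
    """Return only the locked projects reachable from the given root requirements.

    `lock` maps a project name to {"version": ..., "requires": [names...]}.
    Without the "requires" edges (the distribution metadata), the only safe
    subset of a lock would be the whole lock.
    """
    needed: Set[str] = set()
    queue = deque(roots)
    while queue:
        name = queue.popleft()
        if name in needed:
            continue
        if name not in lock:
            raise KeyError("{} is not in the lock".format(name))
        needed.add(name)
        # Follow the locked project's direct dependencies.
        queue.extend(lock[name]["requires"])
    return {name: lock[name] for name in sorted(needed)}


# Hypothetical lock contents, for illustration only.
lock = {
    "requests": {"version": "2.31.0", "requires": ["urllib3", "idna"]},
    "urllib3": {"version": "2.2.0", "requires": []},
    "idna": {"version": "3.6", "requires": []},
    "numpy": {"version": "1.26.4", "requires": []},
}
subset = subset_lock(lock, ["requests"])
```

A flat list of hashed pins has no `requires` edges to walk, so a tool consuming it cannot tell that `numpy` is unrelated to `requests`.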

@NiklasRosenstein

NiklasRosenstein commented Jul 15, 2024

If having Pex use UV is opt-in and does not become the new default, having it not support various options from the start is not a breaking change. So adding support for UV in Pex incrementally is an option, no?

@jsirois
Member

jsirois commented Jul 15, 2024

Yes, but I think no one here has any clue what that actually means. I would love someone to step up and wrestle with how much special case code they do or do not need to make this happen. I won't be getting to it personally for quite some time. There are many other items to chew through.

@kuza55

kuza55 commented Aug 10, 2024

@jsirois Could you expand on what distribution metadata uv doesn't emit? Maybe we can ask them to emit that metadata.

@jsirois
Member

jsirois commented Aug 10, 2024

> @jsirois Could you expand on what distribution metadata uv doesn't emit? Maybe we can ask them to emit that metadata.

@kuza55 I think my point is "we" need to dig in. If you want to get Pex using uv as an alternate resolver, you'll need to understand what data Pex gathers from Pip and how it patches Pip at runtime to support things like --platform and --complete-platform. I know people think I sound like an ass when I put things this way, but I'm serious. If you really actually want this, you need to want this enough to do work. Me spoon-feeding you requirements does not help you get this done at all. It gives you an illusion of making forward progress.

@jsirois
Member

jsirois commented Aug 10, 2024

Let me sharpen this even more. IIUC there is not a desire to make Pex faster / better here, there is actually just a desire to get Pants performing better, and it happens to use Pex right now. Is that right? If so, does anyone here actually understand why Pants uses Pex? For example, the problem of sandbox construction latency for 10ks of files and the --layout packed affordance Pex makes for Pants to be able to handle this? Etc ... - I really think Pants folks need to understand what's going on before they can fix Pants let alone Pex.

@kuza55

kuza55 commented Aug 10, 2024

I'm going to go out on a limb and assume most people (like me) who are interested in pants improvements don't know all these details and don't really want to become experts in python packaging or build systems either but are struggling with slow lockfile generation in pants. From my perspective there are a bunch of blockers that have been mentioned, but uv is also making progress on some of these and I am happy to play telephone if that makes this any easier.

Though it sounds like the issue here is not that uv doesn't support things, but that nobody actually wants to do the work here (myself included), which is a different place to leave this discussion rather than what uv does or does not support.

@jsirois
Member

jsirois commented Aug 10, 2024

@kuza55 I think the issue is both, but agreed the bigger issue is no one willing to step up. As I said above, I won't be getting to it any time soon. I have already worked hard enough on pex3 lock {update,sync} features at Pants' behest in the last year that Pants has yet to actually use. I'll refrain from busting my ass further for Pants until it shows it has its act together enough to understand its own issues before asking for features here it may or may not use.

@jsirois
Member

jsirois commented Sep 13, 2024

@benjyw with #2512 you can use any alternate resolver that can plop out the resolve as a set of dists. That's not uv today since it does not yet have Pip fidelity with support for uv pip {download,wheel}. C.F.:

I've pushed back on folks not stepping in to do work. The same goes here, but it's in a different project. Perhaps someone will feel more comfortable wading in over there.
