
uv hangs on install #2300

Closed
hauntsaninja opened this issue Mar 8, 2024 · 10 comments · Fixed by #2373
Assignees: charliermarsh
Labels: bug (Something isn't working), great writeup (A wonderful example of a quality contribution 💜)

Comments

@hauntsaninja (Contributor) commented Mar 8, 2024

uv hangs on various installs for me. The hangs are weirdly deterministic, except for when I really want them to be deterministic and then they stop being so. Anyway, I've got it down to something small that repros reasonably often. The full unminimised case with like 1000 packages always hangs.

In the official Python 3.11 Docker image, with uv --version reporting uv 0.1.16, run the following:

import subprocess
from collections import Counter

# Run a shell command with a 10-second timeout; return the exception type on
# failure or timeout, or None on success.
def run(cmd):
    try:
        subprocess.check_call(cmd, shell=True, timeout=10)
    except Exception as e:
        return type(e)
    return None

# Repeat a clean-cache, fresh-venv install ten times and tally the outcomes.
Counter(run("uv cache clean; rm -rf .venv; uv venv; uv pip install --no-cache-dir --no-deps --compile ddtrace==2.6.2 debugpy==1.8.1 decorator==5.1.1 defusedxml==0.7.1 ecdsa==0.18.0 ecos==2.0.13 editorconfig==0.12.4 einops==0.7.0") for _ in range(10))

You'll see something like Counter({subprocess.TimeoutExpired: 6, None: 3, subprocess.CalledProcessError: 1}) (and larger cases fail more reliably). The lone CalledProcessError is a spurious bytecode compilation failure, e.g.:

error: Failed to bytecode compile /root/tmpv/.venv/lib/python3.11/site-packages
  Caused by: Bytecode compilation failed, expected "/root/tmpv/.venv/lib/python3.11/site-packages/defusedxml/cElementTree.py", received: ""

You can't really reduce the set of packages further without the hangs going away, although you can drop --compile if you want.

Anyway, uv is making much more progress on installing my big work environment from scratch than the last time I tried :-)

@hauntsaninja hauntsaninja changed the title uv hangs, also bytecode compilation fails every now and then uv hangs on install, also sometimes bytecode compilation fails Mar 8, 2024
@konstin konstin added the great writeup A wonderful example of a quality contribution 💜 label Mar 8, 2024
@konstin (Member) commented Mar 8, 2024

The bytecode compilation failures look like #2245/#2278, but those fixes should have been included in 0.1.16.

@hauntsaninja (Contributor, Author) commented Mar 8, 2024

Hm, that output was from a less reproducible environment; it's possible I messed up the uv version (I had a few requirements files that contained uv, so I probably clobbered myself). I'd seen it a few times and so wanted to include it in the issue.

I haven't seen the bytecode crash in my careful docker repro, which is definitely uv 0.1.16. I'll remove the part about bytecode from the title and let you know if I see it again — thanks for fixing!

@hauntsaninja hauntsaninja changed the title uv hangs on install, also sometimes bytecode compilation fails uv hangs on install Mar 8, 2024
@zanieb zanieb added the bug Something isn't working label Mar 8, 2024
@charliermarsh charliermarsh self-assigned this Mar 11, 2024
@charliermarsh (Member)

I think I'm able to reproduce this.

@charliermarsh (Member)

I was able to reduce it to ddtrace==2.6.2 debugpy==1.8.1 ecdsa==0.18.0 editorconfig==0.12.4.

@charliermarsh (Member)

Don't worry, I added a bunch of debug logging and it stopped hanging 😂

@hauntsaninja (Contributor, Author)

Yeah, RUST_LOG=trace was a fix for me as well :D

@charliermarsh (Member)

I've identified the problem. I think it's specific to --no-deps...

@charliermarsh (Member)

Actually, to be more specific, I could see it manifesting without --no-deps, but --no-deps would make it way more likely.

charliermarsh added a commit that referenced this issue Mar 12, 2024
## Summary

When running under `--no-deps`, we don't need to pre-fetch, because
pre-fetching fetches the _distribution_ metadata. But with `--no-deps`,
we only need the package metadata for the top-level requirements. We
never need distribution metadata.

Incidentally, this will fix #2300.
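
As a rough illustration of the gating described above -- a minimal Python sketch, not uv's actual (Rust) code; plan_prefetch and its arguments are hypothetical names:

# Illustrative sketch only; these names do not exist in uv.
def plan_prefetch(candidate_dists, no_deps: bool):
    """Decide which distributions to pre-fetch metadata for."""
    if no_deps:
        # Only the top-level requirements' package metadata is needed;
        # distribution metadata (which may require building a source
        # distribution) is never consulted, so skip pre-fetching entirely.
        return []
    return list(candidate_dists)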

## Test Plan

- `cargo test`
- `./target/debug/uv pip install --verbose --no-cache-dir --no-deps
--reinstall ddtrace==2.6.2 debugpy==1.8.1 ecdsa==0.18.0
editorconfig==0.12.4` in a Python 3.10 Docker container, repeatedly.
@charliermarsh (Member)

Fixed in the next release, thank you!

@hauntsaninja (Contributor, Author)

Nice, thanks for solving it!

charliermarsh added a commit that referenced this issue Mar 12, 2024
## Summary

This is a more robust fix for
#2300.

The basic issue is:

- When we resolve, we attempt to pre-fetch the distribution metadata for
candidate packages.
- It's possible that the resolution completes _without_ those pre-fetch
responses. (In the linked issue, this was mainly because we were running
with `--no-deps`, but the pre-fetch was causing us to attempt to build a
package to get its dependencies. The resolution would then finish before
the build completed.)
- In that case, the `Index` will be marked as "waiting" for that
response -- but it'll never come through.
- If there's a subsequent call to the `Index`, to see if we should fetch
or are waiting for that response, we'll end up waiting for it forever,
since it _looks_ like it's in-flight (but isn't). (In the linked issue,
we had to build the source distribution for the install phase of `pip
install`, but `setuptools` was in this bad state from the _resolve_
phase.)

This PR modifies the resolver to ensure that we flush the stream of
requests before returning. Specifically, we now `join` rather than
`select` between the resolution and request-handling futures.

This _could_ be wasteful, since we don't _need_ those requests, but it
at least ensures that every `.wait` is followed by `.done`. In
practice, I expect this not to have any significant effect on
performance, since we end up using the pre-fetched distributions almost
every time.
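
To make the select-vs-join distinction concrete, here is a minimal asyncio sketch (illustrative Python, not uv's Rust code; every name in it is hypothetical): with "select" semantics the in-flight pre-fetch is dropped and a later waiter blocks forever, while "join" drains it before returning.

# Illustrative sketch only -- models the hang described above, not uv itself.
import asyncio

async def handle_requests(index: dict, ready: asyncio.Event):
    await asyncio.sleep(0.2)            # a slow pre-fetch (e.g. a source build)
    index["setuptools"] = "metadata"    # the response finally lands...
    ready.set()                         # ...and any waiters are notified

async def resolve():
    await asyncio.sleep(0.05)           # resolution finishes before the pre-fetch

async def run(join: bool):
    index, ready = {}, asyncio.Event()
    requests = asyncio.create_task(handle_requests(index, ready))
    resolution = asyncio.create_task(resolve())
    if join:
        await asyncio.gather(resolution, requests)  # "join": drain outstanding requests
    else:
        await resolution                            # "select": return once resolution is done
        requests.cancel()                           # the pre-fetch response never arrives
    # Install phase: wait for metadata that still looks in-flight.
    await asyncio.wait_for(ready.wait(), timeout=1)
    print("install saw", index["setuptools"])

asyncio.run(run(join=True))    # completes
# asyncio.run(run(join=False)) # times out -- the analogue of the hang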

## Test Plan

I ran through the test plan from
#2373, but ran the build 10 times
and ensured it never crashed. (I reverted
#2373, since that _also_ fixes the
issue in the proximate case, by never fetching `setuptools` during the
resolve phase.)

I also added logging to verify that requests are being handled _after_
the resolution completes, as expected.

I also introduced an arbitrary error in `fetch` to ensure that the error
was immediately propagated.