Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recreate Python venv and run pip install less often #912

Merged

Conversation

mcdonc
Copy link
Contributor

@mcdonc mcdonc commented Dec 24, 2023

  • Don't recreate Python venv unless Nix wrapper has changed.

  • Don't run pip install -r /nix/store/xxx-requirements.txt unless requirements have changed.

This does not take into account use of -r in requirements.txt and probably should.

This patch is against the python-rewrite branch.

Supersedes #904

@mcdonc
Copy link
Contributor Author

mcdonc commented Dec 24, 2023

To handle detecting changes in requirements specified via -r, we need to flatten requirements.

Looks like some flattening work was done in #889

@domenkozar
Copy link
Member

Did the Python tests fail before? 🤔

@mcdonc
Copy link
Contributor Author

mcdonc commented Dec 31, 2023

Did the Python tests fail before? 🤔

I'm glad there are so many tests to fail! I haven't looked at each and every one, but most seem unrelated, and similar to failings like the recent Rust PR.

Dumb question. How can I run a subset of the tests locally?

@mcdonc
Copy link
Contributor Author

mcdonc commented Dec 31, 2023

@mcdonc
Copy link
Contributor Author

mcdonc commented Jan 2, 2024

To handle detecting changes in requirements specified via -r, we need to flatten requirements.

Rather than using a Python lib to do this, we could parse the root requirements file and recursively concatenate all of the files referenced by -r into a single requirements.txt and recursively concatenate all the files referenced by -c into a single constraints.txt; this would negate the need to have a lib that was able to deal with envvars or -e or whatnot.

@mcdonc
Copy link
Contributor Author

mcdonc commented Jan 8, 2024

To handle detecting changes in requirements specified via -r, we need to flatten requirements.

Rather than using a Python lib to do this, we could parse the root requirements file and recursively concatenate all of the files referenced by -r into a single requirements.txt and recursively concatenate all the files referenced by -c into a single constraints.txt; this would negate the need to have a lib that was able to deal with envvars or -e or whatnot.

Like this: https://gist.github.com/mcdonc/2653bbfbd33032a12464c8f64c183aff

@bobvanderlinden
Copy link
Contributor

Did the Python tests fail before? 🤔

I suspect that the poetry test is also failing in the python rewrite PR:

https://github.com/cachix/devenv/actions/runs/7312821636/job/19926548605?pr=912#step:8:396

+ poetry run python -c 'import os; print(os.environ['\''LD_LIBRARY_PATH'\''])'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<frozen os>", line 679, in __getitem__
KeyError: 'LD_LIBRARY_PATH'

Copy link
Contributor

@bobvanderlinden bobvanderlinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I understand, the goals of the PR should already have been met:

Don't recreate Python venv unless Nix wrapper has changed.

Using the condition:

if [ "$(${readlink} "$VENV_PATH"/bin/python)" != "$(${readlink} ${package.interpreter}/bin/python)" ]

Don't run pip install -r /nix/store/xxx-requirements.txt unless requirements have changed.

Using the condition:

[ "$(${readlink} "$VENV_PATH"/requirements.txt)" != "$(${readlink} ${if requirements != null then requirements else "$VENV_PATH/requirements.txt"})" ]

In what situation was this not working correctly? Do you have some reproducible steps to see the behaviour failing with the version on main?

${lib.optionalString cfg.poetry.enable ''
[ -f "${config.env.DEVENV_STATE}/poetry.lock.checksum" ] && rm ${config.env.DEVENV_STATE}/poetry.lock.checksum
''}
echo ${package.interpreter} -m venv "$VENV_PATH"
${package.interpreter} -m venv "$VENV_PATH"
echo ${package.interpreter} -m venv --upgrade-deps "$VENV_PATH"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't --upgrade-deps introduce reproducibility issues? .venv will have different versions of pip depending on when you install it. (running it today will install X, running it tomorrow will install Y)

I'm not 100% sure this is a good idea, whether it should be the default and whether it should be behind an option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with it not happening. I just wanted to get rid of the irritating "your pip is out of date" warnings :)

@mcdonc
Copy link
Contributor Author

mcdonc commented Jan 9, 2024

From what I understand, the goals of the PR should already have been met:

Don't recreate Python venv unless Nix wrapper has changed.

Using the condition:

if [ "$(${readlink} "$VENV_PATH"/bin/python)" != "$(${readlink} ${package.interpreter}/bin/python)" ]

This test will always return false currently. See #920 .

Don't run pip install -r /nix/store/xxx-requirements.txt unless requirements have changed.

Using the condition:

[ "$(${readlink} "$VENV_PATH"/requirements.txt)" != "$(${readlink} ${if requirements != null then requirements else "$VENV_PATH/requirements.txt"})" ]

In what situation was this not working correctly? Do you have some reproducible steps to see the behaviour failing with the version on main?

There is never a "requirements.txt" within the venv. There might be one within $DEVENV_ROOT, but it's not guaranteed to be named "requirements.txt". We also shouldn't need to recreate the venv when the requirements change, we should just need to rerun pip install of the requirements.

The current behavior for either means the venv is recreated on every invocation of devenv shell. The reproducible step is to run it once, exit the shell, then run it again without any changes and see the venv get recreated. The PR is against python-rewrite, so the behavior applies only to it.

We have a few bugs interacting:

  • Once we solve Python venv "python3.XX" links to base Python, not the Python binary wrapper  #920, we can put back the simpler readlink check. I only changed it because it always rebuilt, but it won't once we merge that python wrapper fix into nixpkgs.

  • Checking "$VENV_PATH" is just wrong here AFACIT. And if that's fixed, we don't want to recreate the venv, we just want to reinstall requirements. It's also pointless to do a requirements check and reinstall without flattening the requirements ("-r") and taking into account constraints ("-c"). Or maybe not pointless, but would be confusing to anyone who changed a requirement or constraint in an -r or -c included file and whom didn't see the requirements reinstalled. It would be better to just not have the feature, IMO.

@mcdonc
Copy link
Contributor Author

mcdonc commented Jan 9, 2024

Did the Python tests fail before? 🤔

I suspect that the poetry test is also failing in the python rewrite PR:

https://github.com/cachix/devenv/actions/runs/7312821636/job/19926548605?pr=912#step:8:396

+ poetry run python -c 'import os; print(os.environ['\''LD_LIBRARY_PATH'\''])'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<frozen os>", line 679, in __getitem__
KeyError: 'LD_LIBRARY_PATH'

AFAICT, there is no guarantee that LD_LIBRARY_PATH will exist within the shell?

@bobvanderlinden
Copy link
Contributor

Did the Python tests fail before? 🤔

I suspect that the poetry test is also failing in the python rewrite PR:
https://github.com/cachix/devenv/actions/runs/7312821636/job/19926548605?pr=912#step:8:396

+ poetry run python -c 'import os; print(os.environ['\''LD_LIBRARY_PATH'\''])'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<frozen os>", line 679, in __getitem__
KeyError: 'LD_LIBRARY_PATH'

AFAICT, there is no guarantee that LD_LIBRARY_PATH will exist within the shell?

With the current way .so libraries are linked, it is needed for some of the python packages to operate correctly. There probably is going something wrong in the python-rewrite branch that doesn't set the envvar when using poetry.

@mcdonc
Copy link
Contributor Author

mcdonc commented Jan 10, 2024

I'm giving up on the -r/-c requirements.txt flattening due to impurity restrictions exercised by Nix. I've put all I discovered about it in #889. This means the PR stands as-is delta the venv-recreate on interpreter-checking code, which can be simplified when #920 is fixed.

@mcdonc
Copy link
Contributor Author

mcdonc commented Jan 10, 2024

With the current way .so libraries are linked, it is needed for some of the python packages to operate correctly. There probably is going something wrong in the python-rewrite branch that doesn't set the envvar when using poetry.

We have some longstanding bugs/assumptions and multiple branches interacting, I think, which makes this complex and hard to describe, but I don't think that the python-poetry example should be relying on LD_LIBRARY_PATH, even though it happens to work on main.

Assumptions:

  • Something somewhere should be setting LD_LIBRARY_PATH such that DLLs needed by Python libraries can be found (true!)

Bugs:

  • Nothing whatsoever (except the user) sets an LD_LIBRARY_PATH for Python or the shell on the python-rewrite branch.

It looks like the main branch does indeed set an LD_LIBRARY_PATH in some cases, but I think it's incidental. And on the python-rewrite branch, it might exist within the Python process, but if all goes right, there is at least one scenario where it won't.

Rationale:

The python-rewrite branch, unlike main, creates a wrapper script (https://github.com/cachix/devenv/blob/python-rewrite/src/modules/languages/python.nix#L13) for the Python defined in languages.python.version, and has two new additional options: languages.python.libraries and languages.python.manylinux, which are meant to work in concert with the wrapper. This is what sets LD_LIBRARY_PATH for the Python process. This doesn't work currently but if #920 is fixed it will (and it must be, if we want to use a wrapper, and the wrapper is kinda the raison-d'etre of the branch AFACIT).

You can of course set env.LD_LIBRARY_PATH yourself and ignore languages.python.libraries. This will indeed put an LD_LIBRARY_PATH in the shell environment, but this is the only time one would necessarily be present. But if there isn't anything in languages.python.libraries and languages.python.manylinux is false, there won't necessarily be an LD_LIBRARY_PATH in Python's os.environ. There will not necessarily be an LD_LIBRARY_PATH in the shell env because all library path setting is presumed to be handled by the Python wrapper binary.

So I think the right fix is to change the test, because I think it's relying on an artifact of how main is working (which I don't actually understand, but I am pretty confident that the shell, even on main, might not have an LD_LIBRARY_PATH in it unless you set one explcitly).

@mcdonc
Copy link
Contributor Author

mcdonc commented Jan 10, 2024

@bobvanderlinden maybe you can help me understand the test in examples/python-poetry/.test.sh (or really any of the example tests)... is its output checked by some validator or is the test just that all of these commands complete?

@bobvanderlinden
Copy link
Contributor

This test will always return false currently. See #920 .

Hmm, this is probably a recent thing that broke it. It makes sense to fix it, but the current behaviour:

if [ "$(${readlink} "$VENV_PATH"/bin/python)" != "$(${readlink} ${package.interpreter}/bin/python)" ]

does still seem like it should work. Just that there is a bug in nixpkgs that results in a wrong "$VENV_PATH"/bin/python. A workaround in devenv makes sense while the issue is being resolved in nixpkgs.

There is never a "requirements.txt" within the venv. There might be one within $DEVENV_ROOT, but it's not guaranteed to be named "requirements.txt". We also shouldn't need to recreate the venv when the requirements change, we should just need to rerun pip install of the requirements.

Ah sorry, that line is indeed wrong. It should indeed be fixed.

Nothing whatsoever (except the user) sets an LD_LIBRARY_PATH for Python or the shell on the python-rewrite branch.

The main branch sets LD_LIBRARY_PATH for the entire shell environment. python-rewrite sets it inside the python wrapper. See https://github.com/cachix/devenv/pull/745/files#diff-f5bb89f776eda1a758db67c6c75b19ff1665b2f10998d17d1d98479835fc9e2eR16-R26. So if languages.python.libraries is set (which should be done for some of the python libraries that are being tested), LD_LIBRARY_PATH should be available inside python.

The .test.sh files merely need to succeed. The output does not matter, but the output can help quite a bit to get insight why a test fails. It was probably the reason to output LD_LIBRARY_PATH inside the test inside of python. It's likely that removing the line that is failing right now (printing LD_LIBRARY_PATH) will expose that loading zlib and pjsua will fail (because they aren't available in LD_LIBRARY_PATH). The test likely needs to move those packages from packages to languages.python.packages.

I'm not entirely sure why these tests aren't running on the python-rewrite branch. I'll try to create a PR that fixes the poetry test for the python-rewrite branch.

@mcdonc
Copy link
Contributor Author

mcdonc commented Jan 10, 2024

Nothing whatsoever (except the user) sets an LD_LIBRARY_PATH for Python or the shell on the python-rewrite branch.

The main branch sets LD_LIBRARY_PATH for the entire shell environment. python-rewrite sets it inside the python wrapper. See https://github.com/cachix/devenv/pull/745/files#diff-f5bb89f776eda1a758db67c6c75b19ff1665b2f10998d17d1d98479835fc9e2eR16-R26. So if languages.python.libraries is set (which should be done for some of the python libraries that are being tested), LD_LIBRARY_PATH should be available inside python.

#920 is preventing it from working currently, but that's my understanding as well.

The .test.sh files merely need to succeed. The output does not matter, but the output can help quite a bit to get insight why a test fails. It was probably the reason to output LD_LIBRARY_PATH inside the test inside of python. It's likely that removing the line that is failing right now (printing LD_LIBRARY_PATH) will expose that loading zlib and pjsua will fail (because they aren't available in LD_LIBRARY_PATH). The test likely needs to move those packages from packages to languages.python.packages.

I'm not entirely sure why these tests aren't running on the python-rewrite branch. I'll try to create a PR that fixes the poetry test for the python-rewrite branch.

Cool!

@domenkozar domenkozar merged commit 4179686 into cachix:python-rewrite Jan 12, 2024
114 of 177 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants