Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pip install . leaves a build directory that is causing issues for some tools #8165

Closed
omry opened this issue Apr 29, 2020 · 31 comments
Closed
Labels
project: <downstream> When the cause/effect is related to redistributors type: support User Support

Comments

@omry
Copy link
Contributor

omry commented Apr 29, 2020

Environment

  • pip version: 20.1
  • Python version: 3.6+
  • OS: Any

Description
Build-in-place leaves a build directory in the working directory, which can sometimes confuse tools like pytest when they discover pyc files inside.

Expected behavior

How to Reproduce

pip install .
pytest

Output

pytest
================================================================================== test session starts ==================================================================================
platform linux -- Python 3.8.2, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: /home/omry/dev/hydra, inifile: pytest.ini
plugins: snail-0.1.0
collected 899 items / 1 error / 898 selected                                                                                                                                            

======================================================================================== ERRORS =========================================================================================
_______________________________________________________________ ERROR collecting build/lib/hydra/test_utils/test_utils.py _______________________________________________________________
import file mismatch:
imported module 'hydra.test_utils.test_utils' has this __file__ attribute:
  /home/omry/dev/hydra/hydra/test_utils/test_utils.py
which is not the same as the test file we want to collect:
  /home/omry/dev/hydra/build/lib/hydra/test_utils/test_utils.py
HINT: remove __pycache__ / .pyc files and/or use a unique basename for your test file modules
================================================================================ short test summary info ================================================================================
ERROR build/lib/hydra/test_utils/test_utils.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
=================================================================================== 1 error in 0.64s ==
@triage-new-issues triage-new-issues bot added the S: needs triage Issues/PRs that need to be triaged label Apr 29, 2020
@lucasrijllart
Copy link

lucasrijllart commented Apr 29, 2020

Hey @omry, I'm getting a different error message but also coming from the build directory issue. Using version pip 20.1 and pytest 3.6.3. I'm running:

pip install -r requirements.txt
pytest

And getting

ImportMismatchError: ('functional.conftest', '/app/tests/build/lib/interaction/functional/conftest.py', local('/app/tests/interaction/functional/conftest.py'))

It seems to me that version 20.0.2 did not create build directories when doing a pip install, and this behaviour has newly arrived in version 20.1.

@pfmoore
Copy link
Member

pfmoore commented Apr 29, 2020

It seems to me that version 20.0.2 did not create build directories when doing an build-in-place

In-place builds are new in 20.1, the reason 20.0.2 did not create a build directory is because the build was done in a temporary location.

@pfmoore
Copy link
Member

pfmoore commented Apr 29, 2020

Presumably the workaround is simply to use the --ignore=build flag to pytest?

@lucasrijllart
Copy link

Thanks for the quick reply @pfmoore, that's worked on my end!

@omry
Copy link
Contributor Author

omry commented Apr 29, 2020

Presumably the workaround is simply to use the --ignore=build flag to pytest?

Yes, or to use pytest.ini:

[pytest]
norecursedirs = build

@omry
Copy link
Contributor Author

omry commented Apr 29, 2020

It seems to me that version 20.0.2 did not create build directories when doing an build-in-place

In-place builds are new in 20.1, the reason 20.0.2 did not create a build directory is because the build was done in a temporary location.

I think it would make sense to build from the current directory but place the output in a temporary location (assuming whatever is consuming that output can also be adjusted).

@McSinyx
Copy link
Contributor

McSinyx commented Apr 29, 2020

@omry, I've just found out about this and hope you like it too: https://discuss.python.org/t/proposal-adding-a-persistent-cache-directory-to-pep-517-hooks/2303

@rksatyam
Copy link

Is there an option to pip to use the temporary location as in 20.0.2?

@pradyunsg

This comment has been minimized.

@hsharrison
Copy link

It's worth noting that this affects more than pytest -- git, docker, linters, IDEs, other test runners, etc. Of course they all have options to ignore a directory, but it's a hassle. It would be better if there was an option (or default) for pip to clean up the build directory after installing.

@McSinyx
Copy link
Contributor

McSinyx commented Apr 30, 2020

@hsharrison, I hope the discussion on IRC yesterday provide you some extra info:

[13:25:17] <cnx> in continuation of the build dir thing, could we please have pip clean?
[13:26:43] <cnx> afaik atm pip only create build/ during build process, but it'd be easier for users to be able to have `pip install . && pip clean .` or `pip install --auto-clean .`
[13:35:48] <toad_polo> cnx: pip doesn't create that directory.
[13:35:55] <toad_polo> It wouldn't know what to clean.
[13:37:41] <toad_polo> You're probably best off using `.gitignore` to exclude `build/`, then running `git clean` to remove files not tracked by git.

Per my understanding, pip does not know what the build backend does and what they create in the project directory, so it cannot clean them. The previous behavior of copy everything to a tmp dir and build them there might not pollute the dir, however the implementation was not perfect (and there was not yet any standard to guide it), i.e. there are edge cases where it's broken (GH-7555) and multi-stage build was not possible.

@hsharrison
Copy link

Hmm, so I suppose the solution in Dockerfiles is pip install . && rm -r build

@omry
Copy link
Contributor Author

omry commented Apr 30, 2020

Proposed solution:

  1. By default, pip creates the build directory in a temporary dir (just the output, input can still be the source tree).
  2. pip config allow the definition of a persistent build dir for cases (future cases?) when we need incremental build.

@uranusjr
Copy link
Member

@omry Your item 1. in the proposal is exactly what the thread @McSinyx mentioned is about.

@McSinyx
Copy link
Contributor

McSinyx commented Apr 30, 2020

@hsharrison, that should work for you, assuming you're using setuptools as the build backend. However, it's worth noting that different build backends may generate different junks in different places and I think pip will wait until there's a better standard specifying the expected behavior.

@MichaelClerx
Copy link

MichaelClerx commented Apr 30, 2020

This is causing style tests to suddenly fail on two projects I'm involved with, which both use a pip install and then run flake8

It's worth noting that this affects more than pytest -- git, docker, linters, IDEs, other test runners, etc. Of course they all have options to ignore a directory, but it's a hassle. It would be better if there was an option (or default) for pip to clean up the build directory after installing.

Strongly agree with this!

@thetic
Copy link

thetic commented Apr 30, 2020

What is the expected behavior of the --build flag in this situation?

@interifter
Copy link

interifter commented Apr 30, 2020

@thetic Seconded

@hsharrison seconded

this is breaking us for more than just pytest. And we have an entire repo named "build" that doesn't play nicely with this new capability.

And it appears the -b/--build option is not honored during creation of the build/ directory.
Instructions for pip install --help appear out of date, too.
from pip install --help:

 -b, --build <dir>           Directory to unpack packages into and build in. Note that an initial build still takes place in a temporary directory. The location of temporary directories can be controlled by setting the TMPDIR
                              environment variable (TEMP on Windows) appropriately. When passed, build directories are not cleaned in case of failures.

Yet, passing in any -b/--build value doesn't seem to influence anything.

[edit]: Can we get -b/--build to honor input into it? That way we can at least choose the name of the build output directory.

@dholth
Copy link
Member

dholth commented May 1, 2020

So setuptools (setup.py) is making the folder when setup.py bdist_wheel runs. Did pip ever build non isolated in a much older version?

@antocuni
Copy link

antocuni commented May 6, 2020

I found another problem caused by this issue: it breaks manylinux builds which are based on the official manylinux demo project. It creates wheels which can't be installed because of ABI incompatibility!

The demo iterates over many different pythons and runs pip wheel . repeatedly. The problem arises when the following happens:

  1. it runs /opt/python/cp27-cp27m/bin/pip wheel .: this creates a build directory in the cwd

  2. it runs /opt/python/cp27-cp27mu/bin/pip wheel .: now it finds the build dir of the previous step but somehow it fails to recognize that it was created by an ABI-incompatible version of python (m vs mu): the result is that it creates a binary wheel which can't be installed on the python that you just use to compile it!

For a minimal example, look at this gist: it runs pip wheel twice, once with cp27-cp27m and one with cp27-cp27mu, then it tries to install/import the extension it just built. You can run it by doing this:

$ git clone https://gist.github.com/antocuni/a1404859da8a01609c90b1bb76ad5d95
$ cd a1404859da8a01609c90b1bb76ad5d95/
$ docker run --rm -v `pwd`:/foo quay.io/pypa/manylinux1_x86_64 bash /foo/build-wheels.sh

cp27-cp27m
Python 2.7.18
pip 20.1 from /opt/_internal/cpython-2.7.18-ucs2/lib/python2.7/site-packages/pip (python 2.7)
setuptools 44.1.0

creating wheel
[...]
Processing /foo
Building wheels for collected packages: foo
  Building wheel for foo (setup.py): started
  Building wheel for foo (setup.py): finished with status 'done'
  Created wheel for foo: filename=foo-0.0.0-cp27-cp27m-linux_x86_64.whl size=7748 sha256=3ec784c6363005c118d14270f24f961b3d772e10127e13772c94d9feaf9d2a99
  Stored in directory: /tmp/pip-ephem-wheel-cache-etjxV2/wheels/d5/3f/b9/c4e4497118b8a29cdc143f0da42ab610e9ddbd1025b3977721
Successfully built foo

inspecting /wheelhouse/foo-*-cp27-cp27m-*.whl
/tmp/tmp.fJKcRbpE29 /
0000000000000000      D  *UND*	0000000000000000              PyUnicodeUCS2_FromString
[...]
Processing /wheelhouse/foo-0.0.0-cp27-cp27m-linux_x86_64.whl
Installing collected packages: foo
Successfully installed foo-0.0.0
/


cp27-cp27mu
Python 2.7.18
pip 20.1 from /opt/_internal/cpython-2.7.18-ucs4/lib/python2.7/site-packages/pip (python 2.7)
setuptools 44.1.0

creating wheel
[...]
Processing /foo
Building wheels for collected packages: foo
  Building wheel for foo (setup.py): started
  Building wheel for foo (setup.py): finished with status 'done'
  Created wheel for foo: filename=foo-0.0.0-cp27-cp27mu-linux_x86_64.whl size=7751 sha256=3a093be33cb6e4435a60cfd00f9df705f3fe3398a721b767371f40d8d3fef20f
  Stored in directory: /tmp/pip-ephem-wheel-cache-9RXS7a/wheels/d5/3f/b9/c4e4497118b8a29cdc143f0da42ab610e9ddbd1025b3977721
Successfully built foo

inspecting /wheelhouse/foo-*-cp27-cp27mu-*.whl
/tmp/tmp.PLvGnWGz55 /
0000000000000000      D  *UND*	0000000000000000              PyUnicodeUCS2_FromString
[...]
Processing /wheelhouse/foo-0.0.0-cp27-cp27mu-linux_x86_64.whl
Installing collected packages: foo
Successfully installed foo-0.0.0
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: ./foo.so: undefined symbol: PyUnicodeUCS2_FromString

This example (and all the travis-based projects which are based on manylinux-demo) still works with quay.io/pypa/manylinux1_x86_64:2020-04-25-37c204c which is the last to include pip 20.0.2 but it's broken by all subsequent docker images which include pip 20.1.

If you manually downgrade to pip==20.0.2 or remove the build directory between the builds, then the example works again.

I expect that many projects will start to automatically upload broken wheels as soon as they trigger a new travis-based version release, so it's probably better to fix it soon.

@pradyunsg
Copy link
Member

Thanks for all the reports folks! For anyone who hasn't been following this closely, a bunch of related discussion has taken place at #7555 (comment).

If you're wondering why this change was made: pip's copy-to-a-temporary-directory-and-build-there logic has been a source of a lot of bugs (performance issues, correctness issues and more) over the years. We made this change to improve the user experience in this area, to solve those issues by removing that band-aid solution for something much simpler and less fragile. With hindsight, I think we pulled the band-aid out a bit too early/quickly. Although we expected there to be some workflows that'd break, we'd expected the breakages would occur in more niche usecases, and not be as broadly affecting as they have been.

In summary, this change is being lot more disruptive than we had initially anticipated and we'll be reverting the "build in place" behavior change in a pip 20.1.1 release. We'll be pursuing a different avenues to push build tools to stop making the assumption that we've broken with this change (that they can put whatever crap I want in the package's root directory), likely by collaborating on mechanisms like the one proposed in https://discuss.python.org/t/2303/.

@pradyunsg
Copy link
Member

pradyunsg commented May 7, 2020

I'm really curious how folks here, think we should communicate around potentially disruptive stuff like this. We had a beta period for getting feedback on this change, that was publicized an order of magnitude more than we've ever done in the past, and we still got no reports from users that would even hint at how disruptive this could be to their workflows.

We've been bitten by these things in the past (like back in 2018, with pip 10.0) and if you have any thoughts or feedback on how we can do communication around changes, and get feedback on those changes, please chime in at #7628. Everything from "here's where I go to look for information, to I don't expect pip to ever change, to I don't look for information ever" is useful input to us.

A personal note

I still believe that none of the issues that have been flagged here are "pip's fault" -- they're occurring because other tools (setuptools, pytest etc) are making incorrect assumptions about what the contents of a package directory are, and what they can do with that directory. Obviously, practicality beats purity here, and reverting this change and figuring out a significantly smoother approach is the right thing to do. pip isn't generating the build/ directory, setuptools is. Has setuptools always done that, heck yes.

But like, why would tooling not ignore a build directory, that has been used for years by setuptools -- which has been the de-facto standard for years. Are people not putting build/ in their .gitignore anymore? -- GitHub's default certainly has it. Is someone's "Python IDE" or "Python test runner" not configured to ignore build/? How is that a bug with pip, and not their tooling? Are users not willing to add a line to their docker containers, to do the right thing, instead of pushing more complexity into "digital infrastructure" tooling? How is that OK?

This feels a lot like #5599 -- Linux distros made an incorrect assumption, and a change in pip exposed that. What happens next? Lots of users reporting errors to pip, instead of their distro, and many Linux distros still don't fix that assumption). I'm starting to really dislike not being able to make improvements to pip because "other stuff" is making assumptions about how pip's internal processes work.

@pradyunsg pradyunsg added the project: <downstream> When the cause/effect is related to redistributors label May 7, 2020
@pradyunsg pradyunsg added project: setuptools Related to setuptools type: support User Support labels May 7, 2020
@triage-new-issues triage-new-issues bot removed the S: needs triage Issues/PRs that need to be triaged label May 7, 2020
@omry
Copy link
Contributor Author

omry commented May 7, 2020

That's a hard one.
For some changes you can add a deprecation warning for a while when user is doing something that are not (longer) supposed to be doing.
The problem with this one is that there is no way to gradually warn about it.

One possibility is something along the lines of asking people to explicitly opt into the new flow by introducing a command line flag / config file option.
This can be accompanied by an active hint to the user when they run without making a choice.

The warning might look like:

pip is moving toward inplace build method which may be disruptive in some cases.
We are collecting feedback to improve the feature. Please opt in to the new flow using the inplace=true config option or --inplace=true command line flag and report any issues.

We are planning on making this the default in a future version, and eventually deprecate and remove the current approach (--inplace=false).

To turn off this warning, make an explicit selection in your command line or config file.
If you disable this, please let us know via <preferred communication channel>.

@dholth
Copy link
Member

dholth commented May 7, 2020

I am also amazed that the build directory was so surprising.

@MichaelClerx
Copy link

MichaelClerx commented May 7, 2020

As a user, I didn't expect there to be any "building" at all. Just some copying of files into a site-packages dir somewhere. I can understand that there's more going on and that some temporary "build" path is needed, but that still seems like a pip implementation detail to me, and not something users should be aware of? Also I find it very hard to see leaving the dir there instead of deleting it as anything other than a side effect. I'm asking pip to install from a certain source folder into a system: I expect that to modify the system but not the source folder.

Thanks for how you've handled this, by the way. All your work on pip makes my life much easier! So just trying to help out with a user's perspective here

@dholth
Copy link
Member

dholth commented May 7, 2020

There's always building. People don't expect this for their .py files but it happens.

@interifter
Copy link

So, still curious... what is the point of -b? It seems like it does not alter any behavior of the permanent ‘build/‘ artifact. Could it be repurposed to specify the name of that dir?

@dstufft
Copy link
Member

dstufft commented May 7, 2020

I will say, I think that most or all of the breakages we ran into came up during the PEP 517 as things that were likely to be possible sources of breakages (for example). Maybe we need to do a better job of summarizing these discussions or when making changes related to past discussions going back and reviewing past discussions or something.

I've long believed that beta releases for tools like pip are of marginal benefit. When we've tried to do them in the past we've rarely caught many if any issues, even show stopping ones, until a final release was cut. That's why I stopped doing them when I was still releasing pip and instead tried to be available to quickly react to broken releases. Attempting to phase large scale changes in over time is also a good pattern (with many ways to handle it like dark reads, opt in or opt out flags, etc).

I don't think "blame" is the most useful thing to focus on. One could easily argue that everything was working fine until pip broke backwards compatibility by no longer ensuring builds were done in an ephemeral location. Unfortunately Python's packaging has hit Hyrum's Law to a pretty large degree, so it's super common that anytime we make any changes to long standing behavior we're going to be breaking compatibility with something. The only real solution to it is to be more deliberate and careful with rollback plans and provide ways to work round etc the more risky a change is, but to fundamentally accept that these things are going to happen as well.

@pradyunsg pradyunsg removed the project: setuptools Related to setuptools label May 7, 2020
@pradyunsg
Copy link
Member

pradyunsg commented May 7, 2020

--build

That's definitely an oversight in this case that didn't help -- its doc-string should've been updated but didn't get updated. :)

Note that an initial build still takes place in a temporary directory. The location of temporary directories can be controlled by setting the TMPDIR environment variable (TEMP on Windows) appropriately.

should have changed to:

Note that an initial build still happens in-place.

@pradyunsg
Copy link
Member

(coming back to this, after stepping away for a few days)

I don't think "blame" is the most useful thing to focus on.

Agreed. That was perhaps my frustration speaking, more than anything else.

Unfortunately Python's packaging has hit Hyrum's Law to a pretty large degree

TIL there's an internet law, for describing this situation!

The only real solution to it is to be more deliberate and careful with rollback plans and provide ways to work round etc the more risky a change is, but to fundamentally accept that these things are going to happen as well.

Hmph. I agree that this is likely the only solution, but... I'll still occasionally get grumpy that we're in this situation. :)

@pradyunsg
Copy link
Member

The only real solution to it is to be more deliberate and careful with rollback plans and provide ways to work round etc the more risky a change is

So, some background work has been done on this front and #2195 (comment) contains a proposal for us to reintroduce this with a better rollout plan. :)


Closing this issue since the fix was included in 20.1.1 and that's been out for... months. If you want to follow future updates, please look at #7555 and #2195. :)

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 13, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
project: <downstream> When the cause/effect is related to redistributors type: support User Support
Projects
None yet
Development

No branches or pull requests