pkg_resources merrily adds site-packages to the front of sys.path #6

Open
ghost opened this issue May 29, 2013 · 7 comments

ghost commented May 29, 2013

Originally reported by: ncoghlan (Bitbucket: ncoghlan, GitHub: ncoghlan)


If a requirement in __main__.__requires__ is met by a package in site-packages, pkg_resources will happily reinsert site-packages at the front of sys.path.

This breaks the world when working on software where a previous version is installed as a system package.



ghost commented May 31, 2013

Original comment by ncoghlan (Bitbucket: ncoghlan, GitHub: ncoghlan):


OK, I think I've narrowed down the cases where pkg_resources will do this:

  1. Somewhere in the dependency chain is at least one dependency which requires a new sys.path entry
  2. Elsewhere in the dependency chain is at least one dependency which is already available on sys.path and doesn't require a new entry.

The second kind of entry will have its sys.path location copied to the front of sys.path, potentially shadowing other modules. We encountered this when making the Sphinx autodoc based documentation builds for Beaker work on both RHEL 6 and Fedora, since the default versions of Sphinx and CherryPy are different, but some of the Sphinx dependencies still have the same default version on both platforms. This meant site-packages was getting a duplicate sys.path entry in front of the source directories, breaking local doc builds if the Beaker libraries were also installed system wide in site-packages.
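
A minimal sketch of the shadowing effect described above, using throwaway directories rather than a real Beaker checkout. The module name `demo_shadowed_pkg` and the `src` / `site-packages` directories are hypothetical stand-ins. The explicit `sys.path.insert(0, ...)` only imitates the reported symptom; it is not how pkg_resources itself performs the insertion.

```python
import importlib
import os
import sys
import tempfile


def make_module(directory, name, version):
    """Write a one-line module that records which copy it is."""
    os.makedirs(directory, exist_ok=True)
    with open(os.path.join(directory, name + ".py"), "w") as handle:
        handle.write("VERSION = %r\n" % version)


def import_fresh(name):
    """Import `name`, ignoring any previously cached copy."""
    sys.modules.pop(name, None)
    importlib.invalidate_caches()
    return importlib.import_module(name)


root = tempfile.mkdtemp()
src = os.path.join(root, "src")              # stand-in for the source checkout
site = os.path.join(root, "site-packages")   # stand-in for the system install
make_module(src, "demo_shadowed_pkg", "dev")
make_module(site, "demo_shadowed_pkg", "system")

saved_path = list(sys.path)
try:
    # Intended layout: the source directory comes before site-packages.
    sys.path[:] = [src, site] + saved_path
    before = import_fresh("demo_shadowed_pkg").VERSION

    # Imitate the symptom: site-packages gets a duplicate entry at the front.
    sys.path.insert(0, site)
    after = import_fresh("demo_shadowed_pkg").VERSION
finally:
    sys.path[:] = saved_path
    sys.modules.pop("demo_shadowed_pkg", None)

print(before, after)
```

With the duplicate entry in front, the system-wide copy is imported instead of the one from the source directory, which is the failure seen in the local doc builds.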


ghost commented Jun 1, 2013

Original comment by pje (Bitbucket: pje, GitHub: pje):


If this happens only from __main__.__requires__ that would make it likely that the issue is in the conflict-handling logic at the end of pkg_resources. (The "except VersionConflict" block.)

It seems especially likely since that bit's got the least conservative path manipulation in the entire module. ;-)

The trick would be to find a way to change it that still accomplishes its purpose, which is to work around a conflict with a default version of a library. Right now it does it by punting and saying, "screw it, I'll just build a sys.path that exactly lists what I need, and then tack anything else left over at the end." Putting site-packages near the beginning of sys.path kind of defeats this purpose anyway.

Thinking back, this code was probably written before .egg-info packages even existed, and it looks to me like its logic is completely broken for dealing with them. It should strictly separate .egg/.whl distributions from any other sort, and put them first on the new path list it builds, followed by the others. It also needs to arrange any unencapsulated distributions such that the version conflict doesn't persist, if it can.


ghost commented Feb 9, 2014

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


I've moved "the conflict-handling logic at the end of pkg_resources" to classmethods on the WorkingSet class, specifically _build_master for easier referencing and separation of concerns (no longer is the master working_set constructed in multiple places).

I still don't feel like I have a good grasp of the described problem. It seems there are conditions under which site-packages will be added back to sys.path as part of the _build_from_requirements step, which incidentally causes packages in site-packages to be preferred over those that otherwise would have been selected by the .require() call.

@ncoghlan, can you describe how to create a minimal environment that triggers the undesirable behavior?

@pje, thanks for the advice. You've made some good points. Based on your last paragraph, it sounds like what you're proposing is to do two passes over 'dists', first adding only those that are encapsulated, then adding those that are not. I don't see how this will avoid the issue if any of the encapsulated packages are found in site-packages.

Admittedly, I don't yet understand the nuances of the problem, so I hope Nick will be able to put together a concrete example which can be used to characterize better the failure mode.


ghost commented Feb 9, 2014

Original comment by ncoghlan (Bitbucket: ncoghlan, GitHub: ncoghlan):


PJE recently pointed out I never summarised the current state of things on
Fedora/RHEL for distutils-sig (and in particular where it can currently be
a bit awkward to work with), so I'll try to write that up later this week.

To trigger this particular case:

  1. Have a project in a directory that is already on sys.path (e.g.
    TurboGears 1), with a dependency on a project not on sys.path (e.g.
    CherryPy 2)
  2. Have that second project installed in unpacked egg form such that
    pkg_resources can find it.
  3. Use __main__.__requires__ to specify a version constraint on the first
    project (the one that would be available by default, even without
    pkg_resources). The existing sys.path entry will be copied to the front of
    the path.

What seems to happen is a "one in, all in" effect, where if any package
needs a new sys.path entry added, they all get added, even the ones that
are already on sys.path.
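
The "one in, all in" effect can be illustrated with a toy model of the rebuild as it is described in this thread: list the locations of everything required first, then tack the leftover sys.path entries on the end. This is a sketch under that assumption, not the pkg_resources code; `rebuild_path` and all of the paths below are hypothetical.

```python
def rebuild_path(current_path, required_locations):
    """Toy model: required locations first, leftover entries appended."""
    new_path = []
    for location in required_locations:
        if location not in new_path:
            new_path.append(location)
    for entry in current_path:
        if entry not in new_path:
            new_path.append(entry)
    return new_path


# Hypothetical layout: a source checkout ahead of the system site-packages.
current = ["/home/dev/project", "/usr/lib/python/site-packages"]

# Locations of the resolved requirements (both paths are made up).
required = [
    "/usr/lib/python/site-packages",  # first project, already importable
    "/opt/eggs/CherryPy-2.egg",       # second project, needs a new entry
]

result = rebuild_path(current, required)
for entry in result:
    print(entry)
```

In this model site-packages ends up ahead of the source checkout purely because one of the requirements lives there, even though it needed no new entry. Whether the original entry also stays in place further down the path is not modelled here.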


ghost commented Feb 9, 2014

Original comment by pje (Bitbucket: pje, GitHub: pje):


Strictly speaking, what happens is that if any addition to sys.path causes a VersionConflict, the whole thing gets thrown out and started over.

Jason: by "encapsulated" I mean, "contained in its own file or directory, and therefore having a unique sys.path entry value". Thus, "/wherever/site-packages" can never be an "encapsulated" path by this definition, since it can contain encapsulated distributions but is not itself one. If encapsulated distributions are always put first on the updated path, it is impossible for a non-encapsulated (i.e. default) distribution to override it.
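
A rough sketch of the ordering rule proposed here, written as a pure function over location strings. The helper names are hypothetical, and the test for "encapsulated" (the location ends in `.egg`) is a simplifying assumption made for illustration; real code would need to inspect the distribution itself.

```python
def is_encapsulated(location):
    # Assumption: an encapsulated distribution lives in its own .egg
    # file or directory, so its location is unique to it.
    return location.rstrip("/").endswith(".egg")


def order_locations(locations):
    """Put encapsulated locations first, shared directories after them."""
    encapsulated = [loc for loc in locations if is_encapsulated(loc)]
    shared = [loc for loc in locations if not is_encapsulated(loc)]
    ordered = []
    for loc in encapsulated + shared:
        if loc not in ordered:  # drop duplicates, keep first occurrence
            ordered.append(loc)
    return ordered


# Hypothetical input: site-packages was resolved before the egg.
locations = [
    "/wherever/site-packages",
    "/wherever/eggs/CherryPy-2.egg",
    "/wherever/site-packages",
]
ordered = order_locations(locations)
print(ordered)
```

Because a shared directory such as site-packages can never sort ahead of an egg under this rule, a default distribution found there cannot shadow an encapsulated one.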


ghost commented May 5, 2015

Original comment by RonnyPfannschmidt (Bitbucket: RonnyPfannschmidt, GitHub: RonnyPfannschmidt):


Is this by chance related to #2?


ghost commented May 12, 2015

Original comment by ncoghlan (Bitbucket: ncoghlan, GitHub: ncoghlan):


While I investigated both this and #2 around the same time and in the context of the same project (beaker-project.org), they seemed to be unrelated.

ghost added the major and bug labels on Mar 29, 2016
jaraco added a commit that referenced this issue Jul 12, 2020
Allow spawn to accept environment. Avoid monkey-patching global state.