-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue #2478: Topological installation order #2616
Conversation
16c88e4
to
64da790
Compare
I don't understand why we need a topological ordering of installs since it has been agreed that setup_requires won't end up in the target environment ? |
name='TopoRequires4', | ||
version='0.0.1', | ||
packages=['toporequires4'], | ||
install_requires=['TopoRequires2', 'TopoRequires3', 'TopoRequires4'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
per your readme update, the 3rd item should be "TopoRequires", right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nuts,yes.
@xavfernandez this has nothing to do with setup_requires. Topological sorting is an existing feature request predating my work on setup_requires, but which increases the reliability of building and installing things - because there are setup.py's where egg_info can be run, but install/bdist_wheel etc can fail if the dependencies have not been installed. Because we can't have nice things. This is why folk have had to tweak install orders in the past, because pip was installing in 'first-found' or 'last-found' order, neither of which solve this generally. |
This is the goal of I'm afraid implementing topological ordering will send this message to users: "You don't need to declare |
I don't dispute that. But if the pip committers didn't want topological sorting they would have closed that issue WONTFIX, right? |
+1 on toposort. No one should have to manually order their dependencies. |
No one has. People only have to use the tools correctly (i.e. use This is even less useful/desirable with wheel-caching coming: it will be much less painful to have |
Topological sorting only means that all of a package's requirements get installed before that package is installed. It does not have anything to do with setup requirements. |
practically speaking, it does. to say again, this PR intends to resolve #2478, which was motivated by a user wanting a topo sort for the example of "h5py" needing "cython" to run a build extension. A topo install is not needed if "h5py" setup-requires "cython". Although much of the conversation for "setup-requires" has focused around wanting to do module level imports in setup.py itself, this scenario of needing requirements in place for build extensions is something that is literally mentioned in the setuptools docs: " [setup_requires] is needed if you are using distutils extensions as part of your build process". |
To put it another way: Any given pip installation should be seen as atomic - everything requested (and their dependencies) gets installed at once, or not at all. The order pip does the tasks it needs to to achieve that is an implementation detail, and should not matter. If it does, that is a bug in the packages and how they have specified their requirements. All the user should have to care about is that when they do |
... which is not to say that this change won't help users deal with broken package specs, but that's all it does. |
On its own, this is a fix for a regression vs master, as it passes tests. It is needed for topological handling, as we need to build a dependency graph, and can't do that without resolving unnamed -> named dependencies before adding the depended-on requirements.
This is for reuse in the declarative requires patch.
Ok, so I think I've figured out some of the cognitive dissonance here. Coming from a distro background it didn't even occur to me until I got up this morning. Package managers like apt and yum (and their deeper plumbing, dpkg and rpm) try very hard to avoid leaving the users system in a broken state. pip also has this goal - see for instance @dstufft 's requirement that we don't install anything until we know we can resolve all the dependencies satisfactorily. The reason for topological sorting then, is not so that buggy setup_requires will work, but correctness: we should not have packages installed without their dependencies. topological sorting is a first approximation to this goal. A second, closer approximation would involve attempting to maintain axioms across all installed packages e.g. when A and B both have versioned deps on C, don't allow an upgrade or install or B if that would break the versioned dep A holds, even transiently. This can be made easier by a third thing - atomic replacement of packages, which requires making the installation and removal of packages atomic, which its not today. But I don't want to get into perfection here; acting in topological order is clearly much closer to 'all axioms maintained at all times' than installing in either first-discovered or last-discovered orders. |
In particular, a problem that happens is if you do There are of course other ways of handling that, like making An option to be more "atomic", is to have the other PR that @rbtcollins is working on (the auto build wheel PR) build all the wheels first, before installing any wheels. That would move us into a situation where we ensure we can build everything before we attempt to install anything. That isn't particularly relevant to this PR though, which I think is an entirely reasonable thing to do. |
@dstufft actually we re-evaluate the whole dep chain 'pip install thing-with-missing-dep' will install the missing dep. But its better if thing-with-missing-dep not be installed at all, of course :) |
Thanks, that's indeed a good justification for this PR. I thought pip was able to deal with that in an atomic way and rollback the whole thing. Indeed the solution to build all the wheels first before installing anything seems like an ideal goal 👍 |
This is needed for setup-requires, since without it its possible to cause installation to fail in sort-circuit scenarios such as the added functional test case demonstrates.
what? pip handles this now. pip will see |
I'm remembering some of the details wrong, the issue was #2151 but I don't recall what exactly it was now... but I remember that reversing the installation order so that dependencies got installed before the thing that depended on them solved it. |
more generally, we shouldn't partially fulfill a user's install request. all or nothing, i.e. support rollbacks. but I agree, this is an approximation, and like I said, I don't dislike the topo sort, I'm just still not a fan of the wording in the docs and docstrings. it's going to promote people thinking the sort has to be this way just due to the dep chain itself (i.e. that pip can't install A before B if A install-requires B) |
But pip shouldn't install A before B if A install-requires B, because that isn't a valid install of A. What exact prose would you like in the docstring - I don't want to try and guess at what will be acceptable to you. |
Issue #2478: Topological installation order
as noted in IRC, I'll log another PR with ideas on changing the docs and comments |
This fulfils the request in Issue #2478 - install dependencies in topological order.
The finished order is:
Examples:
pip install foo bar, where foo depends on bar will install bar, then foo.
pip install foo bar, where foo depends on baz will install baz, foo, bar.