speed-up by early pruning based on markers #4952

maciejskorski · 2021-12-29T12:20:52Z

I have searched the issues of this repo and believe that this is not a duplicate.
I have searched the documentation and believe that my question is not covered.

Feature Request

As of now poetry spends a lot of time on overrides which doesn't match the environmental version or platform requirements.
They are filtered in the very end of the process, after writing possible scenarios to the lock file. It would be great to have early pruning of such overrides, particularly as more and more packages go for conditional logic in their requriements.
The issue fits the broader call for better support of markers, such as #4670

Consider this example run within a virtual environment with Python 3.8

[tool.poetry.dependencies]
python = "~3.8"
pandas = [
    {version="1.3.*",markers="python_version=='3.9.*'"},
    {version="1.2.*",markers="python_version=='3.7.*'"},
    {version="1.1.*",markers="python_version=='3.10.*'"},
]
....

Petry correctly concludes the empty set of matching packages, however it splits the graph in 6 branches (due to explicit version split and one more implicit in pandas numpy dependency) - and this escalates the more markers we have. It would help to compare the markers early and ignore some of the branches.

NOTE: this is a broader call for being able to support restricted target environments and narrow the lock to prescribed architectures, python versions and so on.

The text was updated successfully, but these errors were encountered:

maciejskorski · 2021-12-30T10:09:15Z

Here are my further thoughts once I got familiar with the code:

Install configs, written to the .lock file are platform agnostic, e.g. cover all possible scenarios. This exponentially impacts the running time in some scenarios (combine pandas, black, and azureml in its modern releases).

Adding pre-filtering to match the target markers seems easy, but it's going to change the way log file looks like so much more work on the tests side - as they test not only the picked up instructions but also the content file (hash included)

radoering · 2021-12-30T19:15:37Z

The example is only a special case of the issue, isn't it? Actually, in the example all of the pandas constraints should be ignored because they don't match the project's python constraint. (I would almost consider it a bug that this currently is not the case.) If the python constraint would be "^3.7" instead of "~3.8", it would be correct to consider all pandas overrides when locking (no matter in which environment).

I think, in the example you can even force the desired behaviour by python instead of markers:

[tool.poetry.dependencies]
python = "~3.8"
pandas = [
    {version="1.3.*",python="3.9.*"},
    {version="1.2.*",python="3.7.*"},
    {version="1.1.*",python="3.10.*"},
]

When using this syntax, the pandas constraints are ignored...

In my opinion, creating one lock file for all platforms and python versions on an arbitrary platform with one of the supported python versions (together with multiple constraint dependencies in the pyproject.toml) is one of the most important features of poetry. Further, once the lock file has been created, installation does not require locking again.

maciejskorski · 2022-01-01T18:39:36Z

The example is only a special case of the issue, isn't it?

Yes, of course. We want to pre-filter more important markers such as OS or architecture. They often appear in modern ML stack (recent releases of pandas, azure, black...). Sorry if that was not clear, I can bring up a more devastating example:-)

I think, in the example you can even force the desired behaviour by python instead of markers:

Not really, the behaviour of python constraints is known to have bugs when markers appear. I really want to support markers.
Like I said, markers appear in data-science or ML stack and supporting them is important.

In my opinion, creating one lock file for all platforms and python versions on an arbitrary platform with one of the supported python versions (together with multiple constraint dependencies in the pyproject.toml) is one of the most important features of poetry. Further, once the lock file has been created, installation does not require locking again.

That's the case of mature packages. I want to support coders which develop internal dependencies and debug intensively, within a team, switching and updating dependencies often. Then updating or recreating happens often and is really time-consuming.

Yes this is great, that the lock is system agnostic:-)
However, this makes poetry slow due to exponentially many possibilities when building bigger packages.
Use-cases are in between: say we develop for restricted platforms or restricted architectures, restricted python versions or other subset of markers. This is a call for constraining the target environment for valid reasons, could be done in the toml file as in the PR #4956. Of course, as an option. Let other enjoy the poetry being environment agnostic :-)

hungbie · 2022-06-02T02:20:48Z

+1 for this. We have a big project in our team, each time poetry solves the dependencies it already takes more than 10 minutes (each try). However, we know beforehand the version we want as well as the platform, so retries to have a solution that valid everywhere is redundant and a big waste of resources. Currently in our project we have to retries around 20 times, making the whole process super slow, sometimes can be upto 4 hours.

jfyu · 2022-08-16T11:43:14Z

+1 for this as well. pandas and numpy are ubiquitous to data science and this issue is slowing down our microservice project build.

maciejskorski added kind/feature Feature requests/implementations status/triage This issue needs to be triaged labels Dec 29, 2021

radoering mentioned this issue Dec 31, 2021

Ignore dependencies not matching the project's python constraint... #4958

Closed

2 tasks

maciejskorski mentioned this issue Dec 31, 2021

[optimize] early-ignore dependencies not matching chosen markers #4956

Open

2 tasks

maciejskorski changed the title ~~early prune based on markers~~ speed-up by early pruning based on markers Dec 31, 2021

dimbleby mentioned this issue Jul 17, 2022

Multiple constraints on same package cause O(exp(N)) checks #5121

Open

dimbleby mentioned this issue Aug 4, 2022

Platform as a global configuration #6118

Closed

2 tasks

ralbertazzi mentioned this issue May 5, 2023

Instructions for installing PyTorch #6409

Open

1 task

dimbleby mentioned this issue Nov 18, 2023

Parse wheel requirements Requires-Dist with platform markers | Torch 2.1.1+cu121 installation problem #8690

Closed

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

speed-up by early pruning based on markers #4952

speed-up by early pruning based on markers #4952

maciejskorski commented Dec 29, 2021 •

edited

Loading

maciejskorski commented Dec 30, 2021 •

edited

Loading

radoering commented Dec 30, 2021

maciejskorski commented Jan 1, 2022 •

edited

Loading

hungbie commented Jun 2, 2022

jfyu commented Aug 16, 2022

speed-up by early pruning based on markers #4952

speed-up by early pruning based on markers #4952

Comments

maciejskorski commented Dec 29, 2021 • edited Loading

Feature Request

maciejskorski commented Dec 30, 2021 • edited Loading

radoering commented Dec 30, 2021

maciejskorski commented Jan 1, 2022 • edited Loading

hungbie commented Jun 2, 2022

jfyu commented Aug 16, 2022

maciejskorski commented Dec 29, 2021 •

edited

Loading

maciejskorski commented Dec 30, 2021 •

edited

Loading

maciejskorski commented Jan 1, 2022 •

edited

Loading