Improved workflows for Torch (and Tensorflow?) #18965
As I pointed out to @tgolsson offline, a multi-platform lock example (Pants does not support these) for this case looks like so:
In real life you'd almost certainly want to use a pair of
The result is a lock file containing 2 locked resolves:
{
"allow_builds": true,
"allow_prereleases": false,
"allow_wheels": true,
"build_isolation": true,
"constraints": [],
"locked_resolves": [
{
"locked_requirements": [
{
"artifacts": [
{
"algorithm": "sha256",
"hash": "50fd9bf85c578c871c28f1cb0ace9dfc6024401c7f399b174fb0f370899f4454",
"url": "https://download.pytorch.org/whl/cpu/torch-1.11.0-cp39-none-macosx_10_9_x86_64.whl"
}
],
"project_name": "torch",
"requires_dists": [
"typing-extensions"
],
"requires_python": ">=3.7.0",
"version": "1.11.0"
},
{
"artifacts": [
{
"algorithm": "sha256",
"hash": "fb33085c39dd998ac16d1431ebc293a8b3eedd00fd4a32de0ff79002c19511b4",
"url": "https://files.pythonhosted.org/packages/31/25/5abcd82372d3d4a3932e1fa8c3dbf9efac10cc7c0d16e78467460571b404/typing_extensions-4.5.0-py3-none-any.whl"
}
],
"project_name": "typing-extensions",
"requires_dists": [],
"requires_python": ">=3.7",
"version": "4.5.0"
}
],
"platform_tag": [
"cp39",
"cp39",
"macosx_10_9_x86_64"
]
},
{
"locked_requirements": [
{
"artifacts": [
{
"algorithm": "sha256",
"hash": "544c13ef120531ec2f28a3c858c06e600d514a6dfe09b4dd6fd0262088dd2fa3",
"url": "https://download.pytorch.org/whl/cpu/torch-1.11.0%2Bcpu-cp39-cp39-linux_x86_64.whl"
}
],
"project_name": "torch",
"requires_dists": [
"typing-extensions"
],
"requires_python": ">=3.7.0",
"version": "1.11.0+cpu"
},
{
"artifacts": [
{
"algorithm": "sha256",
"hash": "fb33085c39dd998ac16d1431ebc293a8b3eedd00fd4a32de0ff79002c19511b4",
"url": "https://files.pythonhosted.org/packages/31/25/5abcd82372d3d4a3932e1fa8c3dbf9efac10cc7c0d16e78467460571b404/typing_extensions-4.5.0-py3-none-any.whl"
}
],
"project_name": "typing-extensions",
"requires_dists": [],
"requires_python": ">=3.7",
"version": "4.5.0"
}
],
"platform_tag": [
"cp39",
"cp39",
"linux_x86_64"
]
}
],
"path_mappings": {},
"pex_version": "2.1.136",
"pip_version": "23.1.2",
"prefer_older_binary": false,
"requirements": [
"torch==1.11.0+cpu; platform_system != \"Darwin\"",
"torch===1.11.0; platform_system == \"Darwin\""
],
"requires_python": [],
"resolver_version": "pip-2020-resolver",
"style": "strict",
"target_systems": [],
"transitive": true,
"use_pep517": null
}
Pex has the smarts to pick the best fitting lock - if any - at use time.
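As a rough illustration of what picking a lock at use time could look like, the sketch below scans a lock shaped like the one above and returns the first locked resolve whose wheels all match a set of compatible tags. This is not Pex's actual selection logic: `wheel_tags` and `select_resolve` are hypothetical helpers, and the sketch only understands wheel artifacts.

```python
import itertools
from urllib.parse import unquote, urlparse


def wheel_tags(url):
    """Expand the compatibility tags encoded in a wheel file name (hypothetical helper)."""
    filename = unquote(urlparse(url).path.rsplit("/", 1)[-1])
    if not filename.endswith(".whl"):
        return set()  # sdists carry no tags; this sketch ignores them
    # The last three dash-separated fields are interpreter, ABI and platform.
    interpreters, abis, platforms = filename[: -len(".whl")].split("-")[-3:]
    return {
        "-".join(tag)
        for tag in itertools.product(
            interpreters.split("."), abis.split("."), platforms.split(".")
        )
    }


def select_resolve(lock, compatible_tags):
    """Return the first locked resolve installable with compatible_tags, else None."""
    supported = set(compatible_tags)
    for resolve in lock["locked_resolves"]:
        if all(
            any(wheel_tags(artifact["url"]) & supported for artifact in req["artifacts"])
            for req in resolve["locked_requirements"]
        ):
            return resolve
    return None
```

Called with the parsed lock above and tags such as `["cp39-cp39-linux_x86_64", "py3-none-any"]`, this would return the resolve containing `torch` 1.11.0+cpu; with tags that match neither resolve it returns `None`, which corresponds to the "if any" case.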
@jsirois Thank you for the clarification! It even works when not requiring local versions.
Which still gives the proper selection, and doesn't require equality matching. With multiple indexes it'd require per-resolve indexes though, but I think it'd be nice. Having to use
{
"allow_builds": true,
"allow_prereleases": false,
"allow_wheels": true,
"build_isolation": true,
"constraints": [],
"locked_resolves": [
{
"locked_requirements": [
{
"artifacts": [
{
"algorithm": "sha256",
"hash": "50fd9bf85c578c871c28f1cb0ace9dfc6024401c7f399b174fb0f370899f4454",
"url": "https://download.pytorch.org/whl/cpu/torch-1.11.0-cp39-none-macosx_10_9_x86_64.whl"
}
],
"project_name": "torch",
"requires_dists": [
"typing-extensions"
],
"requires_python": ">=3.7.0",
"version": "1.11.0"
},
{
"artifacts": [
{
"algorithm": "sha256",
"hash": "fb33085c39dd998ac16d1431ebc293a8b3eedd00fd4a32de0ff79002c19511b4",
"url": "https://files.pythonhosted.org/packages/31/25/5abcd82372d3d4a3932e1fa8c3dbf9efac10cc7c0d16e78467460571b404/typing_extensions-4.5.0-py3-none-any.whl"
}
],
"project_name": "typing-extensions",
"requires_dists": [],
"requires_python": ">=3.7",
"version": "4.5.0"
}
],
"platform_tag": [
"cp39",
"cp39",
"macosx_10_9_x86_64"
]
},
{
"locked_requirements": [
{
"artifacts": [
{
"algorithm": "sha256",
"hash": "544c13ef120531ec2f28a3c858c06e600d514a6dfe09b4dd6fd0262088dd2fa3",
"url": "https://download.pytorch.org/whl/cpu/torch-1.11.0%2Bcpu-cp39-cp39-linux_x86_64.whl"
}
],
"project_name": "torch",
"requires_dists": [
"typing-extensions"
],
"requires_python": ">=3.7.0",
"version": "1.11.0+cpu"
},
{
"artifacts": [
{
"algorithm": "sha256",
"hash": "fb33085c39dd998ac16d1431ebc293a8b3eedd00fd4a32de0ff79002c19511b4",
"url": "https://files.pythonhosted.org/packages/31/25/5abcd82372d3d4a3932e1fa8c3dbf9efac10cc7c0d16e78467460571b404/typing_extensions-4.5.0-py3-none-any.whl"
}
],
"project_name": "typing-extensions",
"requires_dists": [],
"requires_python": ">=3.7",
"version": "4.5.0"
}
],
"platform_tag": [
"cp39",
"cp39",
"linux_x86_64"
]
}
],
"path_mappings": {},
"pex_version": "2.1.136",
"pip_version": "23.1.2",
"prefer_older_binary": false,
"requirements": [
"torch<1.12.0,>=1.11.0"
],
"requires_python": [],
"resolver_version": "pip-2020-resolver",
"style": "strict",
"target_systems": [],
"transitive": true,
"use_pep517": null
}
The question is if we can "shorthand" the --platform arguments as well? I'm thinking about onboarding/ease-of-adoption...
Well, the problem is worse. As I mentioned,
The only way to do this without platforms is to allow building up a lock instead of creating it all at once. Basically lock on 1 machine, now lock on another, now combine. In that sort of procedure you never need to spell out the complete platform details since you're running the lock operation natively on the target platform. For that convenience though, you buy the thorny problem of combining the results.
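To make the "lock on one machine, lock on another, combine" idea concrete, here is a minimal sketch that merges lock dictionaries shaped like the JSON shown earlier in this thread. It is purely hypothetical: `combine_locks` is not a Pex or Pants feature, and it sidesteps the thorny parts by simply refusing to merge locks created from different requirements.

```python
def combine_locks(*locks):
    """Merge single-platform locks into one lock with several locked resolves.

    Hypothetical helper; assumes every lock has "requirements" and
    "locked_resolves" keys like the lock files shown in this thread.
    """
    if not locks:
        raise ValueError("need at least one lock to combine")
    base = locks[0]
    combined = dict(base, locked_resolves=[])
    seen_platforms = set()
    for lock in locks:
        # Locks built from different inputs cannot be merged safely.
        if lock["requirements"] != base["requirements"]:
            raise ValueError("locks were created from different requirements")
        for resolve in lock["locked_resolves"]:
            platform_key = tuple(resolve.get("platform_tag") or ())
            if platform_key in seen_platforms:
                continue  # keep the first resolve seen for each platform
            seen_platforms.add(platform_key)
            combined["locked_resolves"].append(resolve)
    return combined
```

A real implementation would also have to reconcile differing resolver settings, tool versions and constraints across the inputs, which is where most of the difficulty lies.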
See "Rationale" item 2 for one idea of how to handle combining multiple locks. The idea expressed in this rejected PEP is to use a lock directory with many files instead of a single lock file with many locked resolves like PEX currently does it: https://discuss.python.org/t/pep-665-specifying-installation-requirements-for-python-projects/9911
@jsirois Gotcha. I was naively thinking that some information could be derived from the interpreter constraints, but that's only one axis (the others being ISA and OS). But as far as I can tell there's already support for
@tgolsson this is exactly what is needed to resolve for Python:
1 is easy. 2 and 3 are not. Here's what Pex does for 2 when you only give it
Now you seem to be grumbling about this "abbreviated" form! And this form is as "nice" as it gets, and totally insufficient and should actually be killed. I have at least made resolves that use abbreviated platforms raise whenever they encounter unset environment marker values (they used to silently "succeed" with wrong resolve results). The complete platform is really and truly the only way to go. You can use Pex's
{
"marker_environment": {
"implementation_name": "cpython",
"implementation_version": "3.11.0",
"os_name": "nt",
"platform_machine": "AMD64",
"platform_python_implementation": "CPython",
"platform_release": "10",
"platform_system": "Windows",
"platform_version": "10.0.22621",
"python_full_version": "3.11.0",
"python_version": "3.11",
"sys_platform": "win32"
},
"compatible_tags": [
"cp311-cp311-win_amd64",
"cp311-abi3-win_amd64",
"cp311-none-win_amd64",
"cp310-abi3-win_amd64",
"cp39-abi3-win_amd64",
"cp38-abi3-win_amd64",
"cp37-abi3-win_amd64",
"cp36-abi3-win_amd64",
"cp35-abi3-win_amd64",
"cp34-abi3-win_amd64",
"cp33-abi3-win_amd64",
"cp32-abi3-win_amd64",
"py311-none-win_amd64",
"py3-none-win_amd64",
"py310-none-win_amd64",
"py39-none-win_amd64",
"py38-none-win_amd64",
"py37-none-win_amd64",
"py36-none-win_amd64",
"py35-none-win_amd64",
"py34-none-win_amd64",
"py33-none-win_amd64",
"py32-none-win_amd64",
"py31-none-win_amd64",
"py30-none-win_amd64",
"py311-none-any",
"py3-none-any",
"py310-none-any",
"py39-none-any",
"py38-none-any",
"py37-none-any",
"py36-none-any",
"py35-none-any",
"py34-none-any",
"py33-none-any",
"py32-none-any",
"py31-none-any",
"py30-none-any"
]
}
N.B.: Windows has the smallest compatible tags lists. Linux tends towards ~700 tags for modern glibc machines. The command to get that:
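For orientation only, and not as a substitute for that command: the `marker_environment` values in the JSON above correspond to the environment markers defined in PEP 508, which can be read from the standard library roughly as follows. The function name `marker_environment` is just a label chosen for this sketch, and the `compatible_tags` list is not covered here.

```python
import os
import platform
import sys


def marker_environment():
    """Collect PEP 508 environment marker values for the running interpreter."""
    info = sys.implementation.version
    implementation_version = "{0.major}.{0.minor}.{0.micro}".format(info)
    if info.releaselevel != "final":
        # e.g. "3.12.0b1" for a beta interpreter
        implementation_version += info.releaselevel[0] + str(info.serial)
    return {
        "implementation_name": sys.implementation.name,
        "implementation_version": implementation_version,
        "os_name": os.name,
        "platform_machine": platform.machine(),
        "platform_python_implementation": platform.python_implementation(),
        "platform_release": platform.release(),
        "platform_system": platform.system(),
        "platform_version": platform.version(),
        "python_full_version": platform.python_version(),
        "python_version": ".".join(platform.python_version_tuple()[:2]),
        "sys_platform": sys.platform,
    }
```

Because these values are read from the interpreter that runs the code, they only describe the current machine, which is why a complete platform description has to be captured on (or for) each target.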
As far as using the existing platforms (and complete_platforms) support in Pants, yes - that's the data needed to use In Pex parlance it's The The universal style only works via interpreter constraints and optional
Alright, trying to keep up here!
Yepp indeed! It looks like both
Makes sense. I assume when you say I think there's a bit of a conflict between what I expect from a tool like Pants in terms of abstraction level of workflows - high gain for low effort - and pex maybe focusing more on explicitness and correctness. And that's not an easy thing to resolve. FWIW from my perspective, I think there's a fundamental disconnect between what I'm configuring ("use torch") and the result ("only works on Linux") when I add an extra index elsewhere. Whatever solution we go for, I think having the option to use "non-exact" version constraints is also an important part, as in my I added the
# pants.toml
[python-repos]
indexes = [
    "https://pypi.org/simple/",
    "https://download.pytorch.org/whl/cpu/",
]
# BUILD
python_requirement(
    name="torch",
    requirements=["torch>=1.11.0,<1.12"],
    resolve="cpu",
)
This would lock
# pants.toml
[python-repos]
indexes = [
    "https://pypi.org/simple/",
    "https://download.pytorch.org/whl/cpu/",
    "https://download.pytorch.org/whl/cu115/",
]
Because I think at a minimum it'd have to look like:
# BUILD
python_requirement(
    name="torch",
    requirements=["torch>=1.11.0,<1.12,!=1.11.*+cu115"],
    resolve="cpu",
)
python_requirement(
    name="torch",
    requirements=["torch>=1.11.0,<1.12,!=1.11.*+cpu"],
    resolve="gpu",
)
Alternatively:
# BUILD
python_requirement(
    name="torch",
    requirements=[
        'torch==1.11.*+cpu ; platform_system != "Darwin"',
        'torch==1.11.* ; platform_system == "Darwin"',
    ],
    resolve="cpu",
)
python_requirement(
    name="torch",
    requirements=[
        'torch==1.11.*+cu115 ; platform_system != "Darwin"',
        'torch==1.11.* ; platform_system == "Darwin"',
    ],
    resolve="gpu",
)
But I think that's even worse (though less exponential). I'm also not sure if star-versions like that work with local versions.
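To illustrate why a plain range such as `torch>=1.11.0,<1.12` ends up selecting a local version once an extra index is configured, here is a deliberately simplified stand-in for PEP 440 ordering. It handles only dotted release numbers with an optional `+local` label, and `version_key` and `best_match` are hypothetical names, not resolver code. The point it shows: a version with a local label still satisfies the range and sorts above the same release without one.

```python
def version_key(version):
    """Simplified PEP 440 sort key for 'X.Y.Z' or 'X.Y.Z+local' strings only."""
    public, _, local = version.partition("+")
    release = tuple(int(part) for part in public.split("."))
    if not local:
        return release, ()  # no local label sorts before any local label
    segments = local.replace("-", ".").replace("_", ".").split(".")
    # Numeric local segments sort above alphanumeric ones.
    local_key = tuple(
        (1, int(seg), "") if seg.isdigit() else (0, 0, seg.lower())
        for seg in segments
    )
    return release, local_key


def best_match(candidates, lower, upper):
    """Highest candidate with lower <= release < upper; local labels are ignored for the range."""
    matching = [c for c in candidates if lower <= version_key(c)[0] < upper]
    return max(matching, key=version_key) if matching else None


cpu_index = ["1.10.2", "1.11.0", "1.11.0+cpu", "1.12.0"]
print(best_match(cpu_index, (1, 11, 0), (1, 12)))  # 1.11.0+cpu
print(best_match(cpu_index + ["1.11.0+cu115"], (1, 11, 0), (1, 12)))  # 1.11.0+cu115
```

Under these assumptions, adding the cu115 index changes the winner for every resolve that uses the plain range, which is why the examples above have to exclude the unwanted local versions explicitly.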
All changes:
- https://github.com/pantsbuild/pex/releases/tag/v2.1.153
- https://github.com/pantsbuild/pex/releases/tag/v2.1.154
- https://github.com/pantsbuild/pex/releases/tag/v2.1.155
Highlights:
- `--no-pre-install-wheels` (and `--max-install-jobs`) that likely helps with:
  - #15062
  - (the root cause of) #20227
  - _maybe_ arguably #18293, #18965, #19681
- improved shebang selection, helping with #19514, but probably not the full solution (#19925)
- performance improvements
https://github.com/pantsbuild/pex/releases/tag/v2.1.156 Continuing from #20347, this brings additional performance optimisations, particularly for large wheels like PyTorch, and so may help with #18293, #18965, #19681
This thread is intended to focus on actionable solutions for #18293, potentially multi-platform Pex locks, maybe with other components as well.
Is your feature request related to a problem? Please describe.
The big neural network libraries TensorFlow and PyTorch have significant hurdles in usage based on how they're versioned. In general, the basic wheels published to PyPI aren't usable by default, and any important usage requires custom indexes and compute-API-tagged builds. In the case of `tensorflow` these are differently named packages, and in the case of `torch` they are differentiated by local versions. These specific packages also differ in what platforms they support, and the local versions may not exist for all platforms.
Currently, the universal lock provided by Pex specifies a maximum set of {Linux, Mac} but makes no guarantee either platform will be supported. This compounds the complexity of version specification, as local versions are preferred. In practice, one has to jump through hoops to ensure a basic resolve works. For example:
This seems like we'd end up with one lock that works on both Mac (CPU) and Linux (CPU + random CUDA), and one for Linux with only CPU support. However, due to the PEP 440 requirements both locks end up picking +cpu, and having no install candidates for Mac. The proper requirement is to specify it like this:
This means that the more different compute variants you need, the more specific the generic constraints have to be. If we allowed Pex to separate the versions between Mac and Linux (and Windows?) we would have an easier time ensuring something reasonable gets picked. However, even so it'd be required to exclude all available versions in each resolve as we'd not want a massive `+cuda` result in the generic for Linux.
Describe the solution you'd like
The thread on Slack discusses multi-platform locks as a potential solution, but I think that still runs afoul of some footguns with the local-version specifiers as one still needs to exclude them. It might therefore be necessary to combine them with other solutions such as per-resolve extra-indexes.
Additional context
https://pantsbuild.slack.com/archives/C046T6T9U/p1683187396031649