PEX_PATH transitivity #1423

Closed
stuhood opened this issue Aug 27, 2021 · 20 comments


stuhood commented Aug 27, 2021

When building a graph of PEXes, each of which contains distributions for a single requirement (*mostly: disregard the pluggy_zipp_importlib-metadata_pytest_attrs cluster), it appears that the embedded pex_path in the PEX-INFO is not consumed. I'm wondering whether:

  1. these failures are expected
  2. providing a flattened transitive PEX_PATH at runtime, instead of embedding only the direct pex_path deps at build time, would give me the behavior I want

As an example: a single-requirement PEX containing humbug, with a pex_path pointing at its requirement PEXes, might have a PEX-INFO that looks like:

{
  "always_write_cache": false,
  "build_properties": {
    "class": "CPython",
    "pex_version": "2.1.44",
    "platform": "macosx_10_16_x86_64",
    "version": [
      3,
      7,
      7
    ]
  },
  "code_hash": "da39a3ee5e6b4b0d3255bfef95601890afd80709",
  "distributions": {
    "humbug-0.2.6-py3-none-any.whl": "e772468167e997e051e8c20d71551a11bba19f26"
  },
  "emit_warnings": false,
  "ignore_errors": false,
  "includes_tools": false,
  "inherit_path": "false",
  "interpreter_constraints": [],
  "pex_hash": "53fc35393a34baa9b5665c00dc19f27a02a3ea07",
  "pex_path": "__reqs/requests.pex:__reqs/types-requests.pex:__reqs/setuptools.pex:__reqs/urllib3.pex:__reqs/certifi.pex:__reqs/idna.pex:__reqs/charset-normalizer.pex:__reqs/pluggy_zipp_importlib-metadata_pytest_attrs.pex:__reqs/typing-extensions.pex:__reqs/packaging.pex:__reqs/iniconfig.pex:__reqs/py.pex:__reqs/toml.pex:__reqs/six.pex:__reqs/pyparsing.pex",
  "requirements": [
    "humbug"
  ],
  "strip_pex_env": true,
  "unzip": false,
  "venv": false,
  "venv_bin_path": "false",
  "venv_copies": false,
  "zip_safe": false
}

But consuming it will fail to find the requests distribution embedded in the __reqs/requests.pex PEX:

The failure occurs below either PexBuilder.set_script or bin.pex.seed_cache:

  ...
  File ".deps/pex-2.1.44-py2.py3-none-any.whl/pex/pex.py", line 118, in resolve
    for dist in env.resolve():
  File ".deps/pex-2.1.44-py2.py3-none-any.whl/pex/environment.py", line 608, in resolve
    self._resolved_dists = self.resolve_dists(all_reqs)
  File ".deps/pex-2.1.44-py2.py3-none-any.whl/pex/environment.py", line 695, in resolve_dists
    "{items}".format(pex=self._pex, platform=self._platform, items="\n".join(items))
pex.environment.ResolveError: Failed to resolve requirements from PEX environment @ /private/var/folders/bg/_r10hqp14kjcpv68yzdk5svc0000gn/T/process-executioneHVQxG/__reqs/humbug.pex.
Needed macosx_10_16_x86_64-cp-37-cp37m compatible dependencies for:
 1: requests
    Required by:
      humbug 0.2.6
    But this pex had no 'requests' distributions.

jsirois commented Aug 27, 2021

Both 1 and 2 are correct. Pex could be modified to walk a graph of pex_path metadata in PEXEnvironment, but, IIUC, that feature would just save you a line or two of code implementing a hack that will be replaced anyhow if the hack bears fruit. Is that about right?
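
For illustration, a minimal sketch of what walking that graph could look like, assuming unpacked PEX directories that each contain a PEX-INFO file (this is not Pex's actual implementation):

import json
import os

def transitive_pex_path(pex_dir, seen=None):
    # Recursively collect every pex_path entry reachable from pex_dir.
    seen = set() if seen is None else seen
    with open(os.path.join(pex_dir, "PEX-INFO")) as fp:
        pex_info = json.load(fp)
    for entry in filter(None, pex_info.get("pex_path", "").split(":")):
        if entry not in seen:
            seen.add(entry)
            transitive_pex_path(entry, seen)
    return seen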


jsirois commented Aug 27, 2021

I actually may not understand what's going on here. I'll take a look tomorrow. It would be useful if you could sketch this all out first with just Pex command lines, but I can do that too to create a repro case that doesn't have Pants code entangled or details missing.


stuhood commented Aug 27, 2021

It would be useful if you could sketch this all out first with just Pex command lines, but I can do that too to create a repro case that doesn't have Pants code entangled or details missing.

Sure, sorry.

Here is a small repro (which, confusingly, uses requests as the root, rather than as the missing thing):

$ for i in certifi charset_normalizer idna urllib3; do ~/src/venvs/pex-2.1.45/bin/pex "${i}" -o "${i}.pex"; done
$ ~/src/venvs/pex-2.1.45/bin/pex requests --no-transitive --pex-path=certifi.pex:charset_normalizer.pex:idna.pex:urllib3.pex -o requests.pex
$ ./requests.pex
Traceback (most recent call last):
  File ".../requests.pex/.bootstrap/pex/pex.py", line 482, in execute
  File ".../requests.pex/.bootstrap/pex/pex.py", line 139, in activate
  File ".../requests.pex/.bootstrap/pex/pex.py", line 126, in _activate
  File ".../requests.pex/.bootstrap/pex/environment.py", line 428, in activate
  File ".../requests.pex/.bootstrap/pex/environment.py", line 784, in _activate
  File ".../requests.pex/.bootstrap/pex/environment.py", line 608, in resolve
  File ".../requests.pex/.bootstrap/pex/environment.py", line 695, in resolve_dists
pex.environment.ResolveError: Failed to resolve requirements from PEX environment @ .../requests.pex.
Needed macosx_10_16_x86_64-cp-36-cp36m compatible dependencies for:
 1: urllib3<1.27,>=1.21.1
    Required by:
      requests 2.26.0
    But this pex had no 'urllib3' distributions.
 2: certifi>=2017.4.17
    Required by:
      requests 2.26.0
    But this pex had no 'certifi' distributions.
 3: charset-normalizer~=2.0.0; python_version >= "3"
    Required by:
      requests 2.26.0
    But this pex had no 'charset-normalizer' distributions.
 4: idna<4,>=2.5; python_version >= "3"
    Required by:
      requests 2.26.0
    But this pex had no 'idna' distributions.

But the following does work (question 2 from the opening: sorry, I should have done the research) (EDIT: it does not work; see below.):

$ PEX_PATH=certifi.pex:charset_normalizer.pex:idna.pex:urllib3.pex ./requests.pex
Python 3.6.10 (default, May 21 2020, 18:35:53)
[GCC 4.2.1 Compatible Apple LLVM 11.0.3 (clang-1103.0.32.59)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>>


jsirois commented Aug 27, 2021

Thanks for the repro case.

I don't repro the PEX_PATH env var success:

$ python -mvenv pex.venv
$ pex.venv/bin/pip install pex==2.1.45
$ for i in certifi charset_normalizer idna urllib3; do pex.venv/bin/pex "${i}" -o "${i}.pex"; done
$ pex.venv/bin/pex requests --no-transitive --pex-path=certifi.pex:charset_normalizer.pex:idna.pex:urllib3.pex -o requests.pex
$ ./requests.pex 
Traceback (most recent call last):
  File "/tmp/issues-1423/requests.pex/.bootstrap/pex/pex.py", line 482, in execute
  File "/tmp/issues-1423/requests.pex/.bootstrap/pex/pex.py", line 139, in activate
  File "/tmp/issues-1423/requests.pex/.bootstrap/pex/pex.py", line 126, in _activate
  File "/tmp/issues-1423/requests.pex/.bootstrap/pex/environment.py", line 428, in activate
  File "/tmp/issues-1423/requests.pex/.bootstrap/pex/environment.py", line 784, in _activate
  File "/tmp/issues-1423/requests.pex/.bootstrap/pex/environment.py", line 608, in resolve
  File "/tmp/issues-1423/requests.pex/.bootstrap/pex/environment.py", line 692, in resolve_dists
pex.environment.ResolveError: Failed to resolve requirements from PEX environment @ /tmp/issues-1423/requests.pex.
Needed manylinux_2_33_x86_64-cp-39-cp39 compatible dependencies for:
 1: urllib3<1.27,>=1.21.1
    Required by:
      requests 2.26.0
    But this pex had no 'urllib3' distributions.
 2: certifi>=2017.4.17
    Required by:
      requests 2.26.0
    But this pex had no 'certifi' distributions.
 3: charset-normalizer~=2.0.0; python_version >= "3"
    Required by:
      requests 2.26.0
    But this pex had no 'charset-normalizer' distributions.
 4: idna<4,>=2.5; python_version >= "3"
    Required by:
      requests 2.26.0
    But this pex had no 'idna' distributions.
$ PEX_PATH=certifi.pex:charset_normalizer.pex:idna.pex:urllib3.pex ./requests.pex -c 'import certifi, idna, urllib3, charset_normalizer, requests'
Traceback (most recent call last):
  File "/tmp/issues-1423/requests.pex/.bootstrap/pex/pex.py", line 482, in execute
  File "/tmp/issues-1423/requests.pex/.bootstrap/pex/pex.py", line 139, in activate
  File "/tmp/issues-1423/requests.pex/.bootstrap/pex/pex.py", line 126, in _activate
  File "/tmp/issues-1423/requests.pex/.bootstrap/pex/environment.py", line 428, in activate
  File "/tmp/issues-1423/requests.pex/.bootstrap/pex/environment.py", line 784, in _activate
  File "/tmp/issues-1423/requests.pex/.bootstrap/pex/environment.py", line 608, in resolve
  File "/tmp/issues-1423/requests.pex/.bootstrap/pex/environment.py", line 692, in resolve_dists
pex.environment.ResolveError: Failed to resolve requirements from PEX environment @ /tmp/issues-1423/requests.pex.
Needed manylinux_2_33_x86_64-cp-39-cp39 compatible dependencies for:
 1: urllib3<1.27,>=1.21.1
    Required by:
      requests 2.26.0
    But this pex had no 'urllib3' distributions.
 2: certifi>=2017.4.17
    Required by:
      requests 2.26.0
    But this pex had no 'certifi' distributions.
 3: charset-normalizer~=2.0.0; python_version >= "3"
    Required by:
      requests 2.26.0
    But this pex had no 'charset-normalizer' distributions.
 4: idna<4,>=2.5; python_version >= "3"
    Required by:
      requests 2.26.0
    But this pex had no 'idna' distributions.

Looking at the code, this makes sense. The Pex runtime does create a PEXEnvironment for the PEX file in question and for each element of the pex_path, but each of these environments is activated in isolation, one at a time. Activation involves resolving all needed dists for the current runtime interpreter from within the hermetic PEX in question; i.e., even during activation, each PEX is isolated and does not see the others. The sys.path is only extended to include each element of the pex_path after that element successfully resolves on its own.
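
As a toy model of that behavior (not Pex's code; the dicts stand in for PEXes and their embedded dists), the isolation looks roughly like:

import sys

def resolve_in_isolation(pex):
    # Stand-in for the per-PEXEnvironment resolve: it can only see the
    # dists embedded in this one PEX, never those of its pex_path peers.
    missing = [req for req in pex["requirements"] if req not in pex["distributions"]]
    if missing:
        raise RuntimeError("this pex had no {} distributions".format(missing))
    return pex["distributions"].values()

def activate(main_pex, pex_path):
    for pex in [main_pex] + pex_path:
        # sys.path is only extended after a successful isolated resolve.
        sys.path.extend(resolve_in_isolation(pex))

# The repro above in miniature: requests.pex needs urllib3 but does not
# embed it, and the urllib3.pex sitting on the pex_path is never consulted,
# so activation fails much like the ResolveError shown earlier.
activate(
    {"requirements": ["requests", "urllib3"], "distributions": {"requests": "requests.whl/"}},
    [{"requirements": ["urllib3"], "distributions": {"urllib3": "urllib3.whl/"}}],
)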

So, there should probably be expanded CLI help here:

--pex-path PEX_PATH   A colon separated list of other pex files to merge into the runtime environment. (default: None)

That hints at all this but by no means spells it out clearly.

As to what you're trying to do here, I think supporting a --spread mode is the way to go and is not much more code than trying to get PEX_PATH to first gather all dists from all PEXes and then do one resolve against that full set. Outlined here: #1424


stuhood commented Aug 27, 2021

I don't repro the PEX_PATH env var success:

Shoot... you're right. Looks like I accidentally ran that case without --no-transitive.

As to what you're trying to do here, I think supporting a --spread mode is the way to go and is not much more code than trying to get PEX_PATH to first gather all dists from all PEXes and then do one resolve against that full set. Outlined here: #1424

Got it: thanks. Will respond there.


jsirois commented Nov 25, 2021

Closing since this was answered and superseded by #1424, which has since shipped and, IIUC, been used to good effect in Pants.


stuhood commented Dec 22, 2021

FWIW: recursive composition of PEXes (as alluded to here and described in the second half of this comment) came up again in the context of pantsbuild/pants#10864. "subset per (test) root" has been workable with optimizations like #1534, but it seems clear that also subsetting for inner nodes (i.e. per file, or even per tens of files) won't scale.

I'm not sure that it will be a priority any time soon, but I think that it would be worth re-opening this issue at some point such that we can externally/recursively compose PEXes via the PEX_PATH.


jsirois commented Dec 22, 2021

I'm not sure that it will be a priority any time soon, but I think that it would be worth re-opening this issue at some point such that we can externally/recursively compose PEXes via the PEX_PATH.

Why are you using Pex at all at that point? I feel like I'm missing something. The remaining slowness in subsetting is in logic you'll need to implement outside of Pex unless you know some magic I don't. You'll need to be able to evaluate whether a distribution satisfies a requirement, and run Marker.evaluate(...) too, to compose the right subset.
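
For concreteness, both evaluations can be done with the packaging library; a minimal sketch using one of the requirements from the repro above:

from packaging.requirements import Requirement
from packaging.version import Version

req = Requirement('charset-normalizer~=2.0.0; python_version >= "3"')

# Does the requirement even apply to the current interpreter/environment?
applies = req.marker is None or req.marker.evaluate()

# Does a candidate distribution's version satisfy the specifier?
satisfied = applies and Version("2.0.4") in req.specifier

print(req.name, applies, satisfied)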


jsirois commented Dec 22, 2021

Maybe this is simpler to think about: what advantage do you think you have and can leverage in Pants that Pex doesn't have?


stuhood commented Dec 27, 2021

What advantage do you think you have and can leverage in Pants that Pex doesn't have?

I think that the primary difference is that subsetting requires creating and then tearing down a sandbox for the subsetting. We should probably apply immutable_inputs there to reduce the overhead of materializing the repository PEX though.

Why are you using Pex at all at that point? I feel like I'm missing something.

AFAIK, spread PEXes are composed of "zipped installed wheels", which are sort of like eggs, but not exactly. We could re-invent that format externally to PEX, but we'd rather not.

If the relevant subset of the "zipped installed wheels" were placed directly on the python path (packed without a venv, I think?), there would be no external per-invoke overhead: just the overhead of Python loading modules from zipfiles on the pythonpath. That's not as optimized as a warm venv (as shown on #1438), but it is much faster for the cold case, which seems like the only viable way to invoke a python interpreter once per file with transitive dependencies (where most invokes are expected to succeed, and so never re-run and hit the warm case).


jsirois commented Dec 27, 2021

You've failed to address my point about the remaining overhead in subsetting. That's the same overhead Pants would need to incur using PEX_PATH, namely calculating the subset of PEXes to compose. In my profiling case that calculation took the remaining 2s of what was initially a 4s subset operation. I killed 2s of that by not re-hashing distributions.

I think that the primary difference is that subsetting requires creating and then tearing down a sandbox for the subsetting.

Ok, so Pants has a slowness problem there but it sounds like it also has an undeployed workaround for that bit.

AFAIK, spread PEXes are composed of "zipped installed wheels", which are sort of like eggs, but not exactly. We could re-invent that format externally to PEX, but we'd rather not.

If the relevant subset of the "zipped installed wheels" were placed directly on the python path (packed without a venv, I think?), there would be no external per-invoke overhead: just the overhead of Python loading modules from zipfiles on the pythonpath.

The change in Pex to support creating a venv by using symlinks solves any perf issue in creating venvs for large dists IIUC.


jsirois commented Dec 27, 2021

Going along with your optimal cold case tack: in order to avoid the Xms it takes to build a venv (I'll circle back and fill in this number), you need to be using a scrubbed Python sys.path. The only way I'm aware of to do this performantly (Pex's implementation of this takes O(100ms) IIRC), and in a way that works across various Python distributions (the obvious python -S unfortunately doesn't work for all of these), is to first build an empty venv for the desired interpreter and then invoke that interpreter with a PYTHONPATH that includes the calculated subset of required pre-installed distributions. So that can be done, and clearly the number of unique needed Python interpreters is << the number of unique subsets in the vast majority of cases. So Pex could do that as a variant of the --venv mode. That variant would run into problems with badly behaved ns-packages and certain libs that assume a single site-packages dir, like we recently hit with Pants' own malformed ns-packages and Pylint respectively; so it could not be used universally.
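
A rough sketch of that shape (the venv location and the subset of pre-installed wheel directories are made up for illustration):

import os
import subprocess
import venv

# One empty venv per needed interpreter, reused across invocations, purely
# to get a scrubbed sys.path without relying on python -S.
venv_dir = "/tmp/empty-cp37.venv"
if not os.path.isdir(venv_dir):
    venv.EnvBuilder(with_pip=False).create(venv_dir)

# The calculated subset of pre-installed distribution directories.
subset = [
    "/pex_root/installed_wheels/abc123/requests-2.26.0-py2.py3-none-any.whl",
    "/pex_root/installed_wheels/def456/urllib3-1.26.7-py2.py3-none-any.whl",
]

# Launch the scrubbed interpreter with the subset on PYTHONPATH.
env = dict(os.environ, PYTHONPATH=os.pathsep.join(subset))
subprocess.run(
    [os.path.join(venv_dir, "bin", "python"), "-c", "import requests"],
    env=env,
    check=True,
)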


jsirois commented Dec 28, 2021

Finally, re the cold case: this is still ineffective, since the 2s subset calculation time is still there and the venv creation time is less than that (closer to 1s max, IIRC). So if we really want to support fast cold subsetting, the 0th-order problem is the subset calculation speed.


jsirois commented Dec 28, 2021

Besides switching to a new language to implement the PEX CLI in, the only obvious way to speed up subset calculation would be to avoid the expensive parts, which break down into parsing requirements in two places. The easiest way to avoid that is to do it just once. That means a Pex daemon. Pants doesn't support those generically yet though; so either it would need to grow persistent-worker support paired with a Pex daemon mode, or Pex would need to have a transparent per-PEX_ROOT daemon mode that Pants could use without doing anything except passing a flag.


jsirois commented Dec 28, 2021

A simpler way to handle subsetting perf improvement is probably to introduce a new --seed-resolve option to go along with --venv seeding. In seed mode you know something critical: the seeding is for local execution using the local interpreters resolved for the PEX being created. In this mode each local interpreter in play can evaluate markers, etc. once and output a data structure something like:

[
  {
    "interpreter": "...",
    "requirements": {
      "foo>1.2": "fecdbc8b63360ced5e6d0ab2819586145057a47c/foo-1.3--py2.py3-none-any.whl"
    },
    "dependencies": {
      "fecdbc8b63360ced5e6d0ab2819586145057a47c/foo-1.3--py2.py3-none-any.whl": [
        "fecaabff0e1528da693cb73c3e2b4e50af74c045/isort-5.6.4-py3-none-any.whl",
        "fe3c088f452c335d15e29a3bd29cb550feeba62e/google_auth-1.35.0-py2.py3-none-any.whl"
      ],
      "fecaabff0e1528da693cb73c3e2b4e50af74c045/isort-5.6.4-py3-none-any.whl": [],
      "fe3c088f452c335d15e29a3bd29cb550feeba62e/google_auth-1.35.0-py2.py3-none-any.whl": [
        "fecaabff0e1528da693cb73c3e2b4e50af74c045/isort-5.6.4-py3-none-any.whl"
      ]
    }
  }
]

That data could be loaded and parsed quickly and used to calculate a subset ~as fast as possible for any of the local interpreters used to build the PEX. If --platforms were included, it would not work for those: no entry would be output. Pants' use case, though, is exactly and only local interpreters. As such, I think this additional seed mode makes sense for Pex standalone, and it would satisfy Pants' need to cut down on subsetting time. That time is then amortized up front in one ~2s step (sticking with the 4s subset test I've been referring to) to output this file. Loading the file and forming a subset from it should be roughly O(10ms), not having written any code yet.
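
For a sense of the consumer side, a minimal sketch (the file name, interpreter selection, and requirement roots are assumptions) of loading such a file and walking the dependencies map to form a subset:

import json
from collections import deque

with open("seed-resolve.json") as fp:   # hypothetical output file name
    seed = json.load(fp)[0]             # assume the first entry matches our interpreter

roots = ["foo>1.2"]                     # the requirements this subset is for

# Breadth-first walk over the pre-evaluated dependency adjacency list.
to_visit = deque(seed["requirements"][root] for root in roots)
subset = set()
while to_visit:
    wheel = to_visit.popleft()
    if wheel not in subset:
        subset.add(wheel)
        to_visit.extend(seed["dependencies"][wheel])

print(sorted(subset))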

This has a distinct advantage over PEX_PATH transitivity, since PEX_PATH requires running all the PEX resolve code, environment marker evaluation, etc. on each boot. The approach above allows all of that to be skipped if a venv is assembled from the subset, or if a PYTHONPATH is constructed (with all the caveats mentioned about that not working for bad ns-pkg dists out in the wild and dists like Pylint).


jsirois commented Dec 28, 2021

@stuhood this is all a very complicated set of perf / compatibility trade-offs. I think seeding is the last key to making subsets, whether materialized in venvs or on a PYTHONPATH or however, as fast as possible in Python. That said, this could all use an issue to explore the goal instead of the mechanism, namely fast-as-possible subsetting performance for local-use PEXes. If that makes sense to you, I'll break one out and tame this mess there with some pared-down perf numbers for today's slow bits and the ideas for solving these holistically, i.e.: how to get a subset calculated fast, how to assemble a subset fast robustly, how to assemble a subset fast less robustly, and how to scrub sys.path performantly. All the bits needed to actually create and execute a subset with the standard PEX guarantees of hermeticity.


stuhood commented Jan 5, 2022

@stuhood this is all a very complicated set of perf / compatibility trade-offs. I think seeding is the last key to making subsets, whether materialized in venvs or on a PYTHONPATH or however, as fast as possible in Python. That said, this could all use an issue to explore the goal instead of the mechanism, namely fast-as-possible subsetting performance for local-use PEXes.

Yea. The goal in this case is probably "being able to dynamically create a venv that is a subset of a lockfile as fast as possible in order to support fine-grained invokes of tools (such as recursive compilation with mypy, or even faster pytest usage)" (note the lockfile aspect, since we hope to enable them by default in Pants 2.10.x).

A simpler way to handle subsetting perf improvement is probably to introduce a new --seed-resolve option to go along with --venv seeding. In seed mode you know something critical: the seeding is for local execution using the local interpreters resolved for the PEX being created.

How different is this from the proposed lockfile format? But yea, this seemingly amounts to exporting the fully resolved graph.

If that makes sense to you, I'll break one out and tame this mess there with some pared-down perf numbers for today's slow bits and the ideas for solving these holistically, i.e.: how to get a subset calculated fast, how to assemble a subset fast robustly, how to assemble a subset fast less robustly, and how to scrub sys.path performantly. All the bits needed to actually create and execute a subset with the standard PEX guarantees of hermeticity.

It does make sense, thanks. I'm doing some triage of Pants performance issues this week in order to figure out where we should invest here: I don't think the relative priority of this issue is as high right now as 1) finishing lockfile support and 2) finishing enabling the Pants rust client by default, but maybe a design sketch in another ticket would be convincing enough to prioritize.


stuhood commented Jan 6, 2022

A simpler way to handle subsetting perf improvement is probably to introduce a new --seed-resolve option to go along with --venv seeding. In seed mode you know something critical: the seeding is for local execution using the local interpreters resolved for the PEX being created.

How different is this from the proposed lockfile format? But yea, this seemingly amounts to exporting the fully resolved graph.

Relatedly: how likely is it that producing the "--seed-resolve" (or lockfile) could skip actually building/installing any wheels, such that a consumer of the --seed-resolve could lazily build/install wheels (once, atomically inside the pex_root) at consumption time?

That would 1) speed up changes to the lockfile, and 2) implicitly parallelize the work of actually building wheels into consumers of the lockfile.


jsirois commented Jan 7, 2022

How different is this from the proposed lockfile format?

A lot. That format includes unevaluated environment markers. The whole point here is that those need to be evaluated ahead of time, once, to save time later. That can only be done for interpreters in hand. In the --seed scenario, that is exactly what we happen to have.

The lockfile conflation is definitely off. For lockfile generation there is already no building of wheels or installing of them. The lockfile generation is implemented using only pip download .... That, necessarily, only builds dist metadata (via setup.py egg_info old-school, or other methods in PEP-517) for sdists. No wheels are built, no dists are installed. In fact, for multi-platform locks (#1402), which is apparently what most people will use, only one set of dists is downloaded. All other available dists are inferred from that download's pip verbose logs.

Relatedly: how likely is it that producing the "--seed-resolve" (or lockfile) could skip actually building/installing any wheels, such that a consumer of the --seed-resolve could lazily build/install wheels (once, atomically inside the pex_root) at consumption time?

So, I'm not sure what angle you've got here. Are you trying to milk parallelism for those cases where the Pex --jobs are configured lower than the Pants parallelism? Pex already builds and installs in parallel using --jobs; so it's true this is eager and not lazy, but, AFAICT, it's never wasted work. Pants will always need the built / installed wheels.

Answering the question though: right now wheel metadata is what is used to calculate the active dependency graph for the current interpreter given a set of requirement roots. I can't see this sanely budging, since that would mean Pex would need to learn how to extract metadata from sdists; i.e. running setup.py egg_info old-school or PEP-517 new-school (which requires falling back to just plain building the wheel, since metadata-only build methods are optional in PEP-517). So, wheels are required; installing them is not. That said, installing the wheels is what is required to actually build any sort of PEX today. To take that step out (again, so why?... so Pants could eke out more parallelism for the --jobs < Pants parallelism case?) would require the Pex runtime learning how to install wheels. That either means adding the full bulk of Pip to the runtime (it's only used in the build-time tool today and is 2.3MB compressed vs the rest of the runtime, which is ~400KB compressed) or else writing a bunch of code.


stuhood commented Jan 7, 2022

but, AFAICT, it's never wasted work. Pants will always need the built / installed wheels.

You're right if you're running all of the tests in a repository (the CI case). But if we're going to be subsetting a whole-repo resolve, the effect right now is that all wheels for the entire resolve need to be built in order to run a single test, even if that single test only needs a small subset of the resolve.

Answering the question though: right now wheel metadata is what is used to calculate the active dependency graph for the current interpreter given a set of requirement roots. I can't see this sanely budging, since that would mean Pex would need to learn how to extract metadata from sdists; i.e. running setup.py egg_info old-school or PEP-517 new-school (which requires falling back to just plain building the wheel, since metadata-only build methods are optional in PEP-517). So, wheels are required; installing them is not. That said, installing the wheels is what is required to actually build any sort of PEX today. To take that step out (again, so why?... so Pants could eke out more parallelism for the --jobs < Pants parallelism case?) would require the Pex runtime learning how to install wheels. That either means adding the full bulk of Pip to the runtime (it's only used in the build-time tool today and is 2.3MB compressed vs the rest of the runtime, which is ~400KB compressed) or else writing a bunch of code.

I don't think I mean that wheel building/installing should be deferred until runtime, per se... I had been thinking that in our current world, where we 1) fully execute/build wheels for a large resolve, 2) subset that resolve into a PEX, and 3) build a venv from the PEX, then if we moved the wheel building from step 1 to step 2 we'd introduce good laziness for the "I'm running a single test" case.

But if step 2 were to go away, and we were instead directly building a venv from a lockfile/seed-resolve, then lazily building the wheels would mean essentially ... resolving the subset directly into a venv from the lockfile/seed-resolve, I suppose. You're right that that is odd... it would essentially be asking PEX the tool not to actually bother to create a PEX file, and to just create a venv. It seems very close to the "Installer" role of PEP-665, but with the (added?) ability to install only a subset of the lockfile.
