Removing "activation" time from PEX #1115

Closed
stuhood opened this issue Nov 25, 2020 · 19 comments · Fixed by #1153

stuhood commented Nov 25, 2020

As shown in #930, in some cases the activation of a PEX takes 50% of the total runtime, primarily on the task of "resolving" the dependencies to use within that run. This resolution is necessary for cross-platform and portable PEX files which might contain multiple copies of certain wheels.

Two potential ways to eliminate this time might be:

  1. Adding explicit support for building non-portable PEX files. Resolution is not necessary if the PEX is only meant to "work on my machine" (frequently the case in Pants). When compared to a virtualenv, a PEX would still have the advantage of being relocatable on the filesystem.
  2. Preserving the resolutions computed at PEX construction time (in the PEX-INFO, perhaps) for use verbatim at runtime. This would involve doing a fuzzy dictionary lookup of the target platforms for the PEX to determine which set of contained wheels to use.

In both cases, one challenge would be giving good error messages if a PEX had been moved to an incompatible platform. If the PEX explicitly embedded the platform(s) it was built to target, it could quickly fail if it was invoked on an incompatible platform (without consulting its list of wheels).
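
A minimal sketch of what option 2 could look like, to make the "fuzzy dictionary lookup" concrete. Everything here is an assumption for illustration: the `resolved_wheels` key is not a real PEX-INFO field, and `select_wheels` and `IncompatiblePlatform` are hypothetical names. The "fuzziness" is modeled by trying the running interpreter's supported tags in priority order until one matches a platform recorded at build time.

```python
# Hypothetical PEX-INFO fragment: "resolved_wheels" is an assumed key, not a
# real PEX-INFO field. It maps each build-time target to its resolved wheels.
PEX_INFO = {
    "resolved_wheels": {
        "cp38-cp38-manylinux2014_x86_64": [
            "numpy-1.19.4-cp38-cp38-manylinux2014_x86_64.whl",
            "six-1.15.0-py2.py3-none-any.whl",
        ],
        "cp38-cp38-macosx_10_15_x86_64": [
            "numpy-1.19.4-cp38-cp38-macosx_10_15_x86_64.whl",
            "six-1.15.0-py2.py3-none-any.whl",
        ],
    }
}


class IncompatiblePlatform(Exception):
    """Raised when no build-time target matches the running interpreter."""


def select_wheels(pex_info, runtime_tags):
    """Return the wheels recorded at build time for the first matching tag.

    runtime_tags lists the running interpreter's supported tags, most
    specific first, so no per-wheel resolution happens at runtime.
    """
    resolved = pex_info["resolved_wheels"]
    for tag in runtime_tags:
        if tag in resolved:
            return resolved[tag]
    # Fail fast using only the embedded targets, without inspecting wheels.
    raise IncompatiblePlatform(
        "This PEX was built for {}; none match the current interpreter.".format(
            sorted(resolved)
        )
    )
```

With this shape, the incompatible-platform error falls out of the same lookup that selects the wheels, which is the fast failure described above.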

Member

jsirois commented Nov 25, 2020

> PEX takes 50% of the total runtime, primarily on the task of "resolving" the dependencies to use within that run.

How confident are you of your interpretation of the speedscope? FWICT, more than 50% of "resolving" is importing, which was the initial focus of #930. We cannot get around importing, multi-platform PEX or not, except by eliminating expensive imports from the bootstrap.

Author

stuhood commented Nov 25, 2020

Somewhat confident. Were you able to get it to load?

The activate phase of the profile does not actually involve any importing, afaict: the majority of the time under activate is spent in resolve->...->_compute_dependencies->...->parse_requirement->.... You can also see a large amount of time in _find_and_load, but that is independent of the time spent in activate.

I used py-spy to capture the profile, like:

py-spy record -f speedscope -o pytest.prof -- python3.5 ${PEXFILE}

Member

jsirois commented Nov 25, 2020

This is what I'm looking at:

[image: screenshot of the profile]

Author

stuhood commented Nov 25, 2020

@jsirois : That's not the callstack: i.e. activate doesn't call _find_and_load, which is what (primarily) calls import_module.

In this screenshot (a flamegraph), the left 25% of the script is under _find_and_load, the middle 50% is under activate, and the right 25% is under pytest itself.
[image: flamegraph screenshot, "Screen Shot 2020-11-24 at 5 01 51 PM"]

Also, speedscope overrides the "Find-in-Page" implementation in a really useful way, and so "Find-in-Page" for import_module shows that it's all outside of activate.

Member

jsirois commented Nov 25, 2020

Gotcha. So there are 11 instances of _activate (/home/vagrant/slow-pytest/process-execution3RBeWT/pytest.pex/.bootstrap/pex/environment.py:439) in that section. That function is called once per PEXEnvironment activation and that implies your example has 11 PEXes conjoined via PEX_PATH. Is that true? ... Ah, nm - the line changes.

Member

jsirois commented Nov 25, 2020

Ok, so in this example 27% of the total runtime is spent resolving amongst 3 PEX files conjoined via PEX_PATH.

Member

jsirois commented Nov 25, 2020

The resolve is delegated to pkg_resources.{Environment,WorkingSet} which do more than we need. We just need:
https://github.com/pantsbuild/pex/blob/16a4b3a4980008fe47a509afc3b24381a6649a95/pex/environment.py#L232-L255

And that looks like it takes 5% of the runtime only. So this means ditching PEXEnvironment inheritance from Environment / use of WorkingSet in favor of just ripping through all distributions in the PEX and evaluating them with can_add directly before adding them to sys.path.
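
As a rough illustration of "ripping through all distributions" in a single pass, here is a sketch that does not use pkg_resources at all. It is not the code at the link above: `wheel_tags`, this `can_add`, and `activate` are stand-ins written for this example, and they assume each distribution path ends in a standard wheel file name whose last three dash-separated fields are the Python, ABI, and platform tags.

```python
import sys


def wheel_tags(wheel_name):
    """Expand the compressed tag set encoded in a wheel file name."""
    stem = wheel_name[: -len(".whl")]
    pys, abis, plats = stem.split("-")[-3:]
    return {
        (py, abi, plat)
        for py in pys.split(".")
        for abi in abis.split(".")
        for plat in plats.split(".")
    }


def can_add(wheel_name, supported_tags):
    """Stand-in check: does any tag of the wheel match the interpreter?"""
    return not wheel_tags(wheel_name).isdisjoint(supported_tags)


def activate(wheel_paths, supported_tags, path=None):
    """Append each compatible distribution to path (sys.path by default)."""
    path = sys.path if path is None else path
    added = []
    for wheel_path in wheel_paths:
        name = wheel_path.rsplit("/", 1)[-1]
        if can_add(name, supported_tags) and wheel_path not in path:
            path.append(wheel_path)
            added.append(wheel_path)
    return added
```

The loop is linear in the number of distributions in the PEX and never parses requirement strings, which is where the profile above showed the time going.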

Member

jsirois commented Dec 2, 2020

Another tack would be to implement running PEXes as normal applications as proposed in #962. In that approach there is a one-time venv setup cost (unmeasured at this point), after which every execution re-execs into the venv (PEX_ROOT/venvs/...) in the same manner as --unzip re-execs into PEX_ROOT/unzipped_pexes/... today. The --unzip re-exec currently costs ~40ms on my machine (within a few ms of the startup overhead of the python interpreter itself). In this approach runs 2 through N should have ~40ms of overhead over a pure venv, which would be the natural baseline to measure Pex performance against.

I have venv creation implemented as a runtime tool that can be manually run (build PEX file using new --include-tools option, run PEX file with PEX_TOOLS=1 <pex file> venv <venv dir>). I'll get that and other Pex tools out for review - which are useful on their own - and then try adding an --execution-mode/PEX_EXECUTION_MODE {venv|unzipped|pex} option to get timings on this.
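
The create-once, re-exec-afterwards flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Pex's implementation: `venv_command` and `run` are hypothetical names, keying the venv directory by a hash of the PEX is an assumption, and `create_venv` is a caller-supplied callable standing in for whatever populates the venv.

```python
import os


def venv_command(pex_root, pex_hash, create_venv):
    """Return the argv to re-exec into, creating the venv only if missing.

    The PEX_ROOT/venvs/<hash> layout is assumed for illustration.
    """
    venv_dir = os.path.join(pex_root, "venvs", pex_hash)
    python = os.path.join(venv_dir, "bin", "python")
    if not os.path.exists(python):
        create_venv(venv_dir)  # One-time setup cost, paid by run 1 only.
    return [python, os.path.join(venv_dir, "__main__.py")]


def run(pex_root, pex_hash, create_venv, args):
    """Re-exec into the venv; this replaces the current process."""
    argv = venv_command(pex_root, pex_hash, create_venv) + list(args)
    os.execv(argv[0], argv)
```

On runs 2 through N only the `os.path.exists` check and the exec remain, which is why the expected overhead is close to bare interpreter startup.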

Author

stuhood commented Dec 2, 2020

> In that approach there is a one-time venv setup cost (unmeasured at this point), after which every execution re-execs into the venv (PEX_ROOT/venvs/...) in the same manner as --unzip re-execs into PEX_ROOT/unzipped_pexes/... today.

Interesting!

A fairly important benchmark is the one in the description: the two pexes are 1) pytest, 2) user requirements, with loose sources sitting alongside. The loose sources are edited the most, and the two pexes should be relatively more stable, and thus possible to reuse. I think that it is also entirely possible for that to be one pex, because I don't think we gain much benefit from it being two: just from the sources being loose.

If we assume that the goal of a pex is always to run it multiple times, then "creating pexes is fast" is a lot less important than "running pexes is fast". So particularly if more of the preparation of the venv can be frontloaded to construction time (which was the thrust of my "move the calculation of what we will be activating from pex runtime to pex construction time" comments in the description), then that sounds interesting.

Member

jsirois commented Dec 2, 2020

> So particularly if more of the preparation of the venv can be frontloaded to construction time

N.B.: Even if the venv construction was only done on demand at runtime, Pants could run PEX_INTERPRETER=1 ./the.pex -c '' in a "construction time" rule to pre-create the venv without running the user code in the PEX file.

Author

stuhood commented Dec 2, 2020

> > So particularly if more of the preparation of the venv can be frontloaded to construction time
>
> N.B.: Even if the venv construction was only done on demand at runtime, Pants could run PEX_INTERPRETER=1 ./the.pex -c '' in a "construction time" rule to pre-create the venv without running the user code in the PEX file.

Yea, true. But it would be great if even the "first run" of a PEX was faster. Because the benefit of the extracted/reusable venv can only be realized if you have support for mutable caches (unless the venv itself is relocatable?)

Member

jsirois commented Dec 2, 2020

The venv is relocatable as a whole and its output path is also controllable. My current experiment that gets venv working has this CLI syntax: PEX_TOOLS=1 <pex file> venv <venv dir>. So that too could be used to pre-seed to an arbitrary location (the sandbox to be CASed).

jsirois self-assigned this Dec 5, 2020
jsirois added a commit to jsirois/pex that referenced this issue Dec 7, 2020
Add a new `--include-tools` option to include any pex.tools in generated
PEX files. These tools are activated by running PEX files with
PEX_TOOLS=1. The `Info` tool seeds the tool set and simply dumps the
effective PEX-INFO for the given PEX.

Work towards pex-tool#962 and pex-tool#1115
Author

stuhood commented Dec 7, 2020

Rewinding a bit to discuss where this is headed: the goal right now is PEX_TOOLS=1 ./$pex venv to support extracting a PEX into a venv? It might be good to run through a list of different potential usecases to ensure that this is maximally useful.

In particular, one question I have is: in which cases would someone want to:

  1. build-a-pex + run-the-pex
  2. build-a-pex + extract-the-pex-to-a-venv
  3. directly build a venv

Cases two and three are particularly interesting. I can definitely see this being useful from a normalization perspective (Pants uses PEX internally, but can export a venv), but are there benefits to two over three for the pytest usecase?

Member

jsirois commented Dec 8, 2020

@kwlzn opined on item 1 here: #962 (comment)

> I suspect an inverted flag to indicate e.g. "--force-zip" (or "--zip-safe=False" becoming the default) might be nice to leave around for tight-quartered execution of large pex envs where expansive space consumption may not be desirable (like a pex that contains a large but otherwise zip-safe library executing on a PySpark worker, etc) - otherwise, this sounds great to me as a default mode.
>
> pex execution overhead is a major UX issue for us particularly with O(GB) pex envs for DS/ML - and for e.g. local tool use cases.

Item 2 would be immediately useful to Pants clearly.

Item 3 is not a thing for Pex since Pex fundamentally supports two features:

  1. Ship a zipapp
  2. Do so for potentially multiple platforms

There is already a tool for item 3 and that's python -mvenv && ...pip install. If Pants wants to do that it has been free to all along.

You left out item 4, which is build-a-pex-that-self-extracts-to-a-venv + run-a-pex. That will build upon item 2 and close #962.

jsirois added a commit that referenced this issue Dec 8, 2020
Add a new `--include-tools` option to include any pex.tools in generated
PEX files. These tools are activated by running PEX files with
PEX_TOOLS=1. The `Info` tool seeds the tool set and simply dumps the
effective PEX-INFO for the given PEX.

Work towards #962 and #1115
Author

stuhood commented Dec 8, 2020

> There is already a tool for item 3 and that's python -mvenv && ...pip install. If Pants wants to do that it has been free to all along.

Would you recommend that Pants do that for pytest? Or is item 2 a better fit for that usecase?

> You left out item 4, which is build-a-pex-that-self-extracts-to-a-venv + run-a-pex. That will build upon item 2 and close #962.

I was assuming that if that was sufficiently fast, it would be the default implementation of item 1, so I didn't include it as a separate item. Should it be?

Member

jsirois commented Dec 8, 2020

> Would you recommend that Pants do that for pytest? Or is item 2 a better fit for that usecase?

I'm not sure yet, but my recommendation is not really relevant. The only relevant thing is the performance comparison, which we'll have shortly.

> I was assuming that if that was sufficiently fast, it would be the default implementation of item 1, so I didn't include it as a separate item. Should it be?

Again, not sure yet. Need timings, which we'll have shortly. The only timings I have so far are for case 2 (build-a-pex + extract-the-pex-to-a-venv + run the venv). That case is equal to raw venv speed +/- 1ms (noise) for runs 2+. For run 1, the summed time of build-a-pex + extract-the-pex-to-a-venv + run the venv is 70 to 140 ms slower in my current test cases on my machine.

Bear in mind that this tools / venv approach is needed regardless of this issue (see #962); there are also other problems caused by PEX's custom venv solution that the tools / venv approach will fix, which will be a win for some users who can't use PEX at all to bundle their app today. IOW, Pants perf concerns have no trumping influence on the need for this approach, only on the prioritization of shared resources.

Member

jsirois commented Dec 8, 2020

OK, timings for just activating a 47MB pex with 114 distributions:

Old style:

  • Cold cache:
$ time PEX_INTERPRETER=1 ./tc.pex -c ''

real	0m3.695s
user	0m3.024s
sys	0m0.628s
  • Warm cache:
$ time PEX_INTERPRETER=1 ./tc.pex -c ''

real	0m1.366s
user	0m1.238s
sys	0m0.111s

New style:

  • Cold cache:
$ time python -mpex.tools ./tc.pex venv --collisions-ok tc.venv
/home/jsirois/dev/pantsbuild/jsirois-pex/pex/tools/commands/venv.py:62: PEXWarning: Failed to overwrite /home/jsirois/dev/pantsbuild/jsirois-pex/tc.venv/bin/jsondiff with /home/jsirois/.pex/installed_wheels/dd30b02f25723bd5700c4a247dbea045be038b07/jsondiff-1.2.0-py3-none-any.whl/bin/jsondiff: [Errno 17] File exists: '/home/jsirois/.pex/installed_wheels/dd30b02f25723bd5700c4a247dbea045be038b07/jsondiff-1.2.0-py3-none-any.whl/bin/jsondiff' -> '/home/jsirois/dev/pantsbuild/jsirois-pex/tc.venv/bin/jsondiff'
  pex_warnings.warn(

real	0m1.984s
user	0m1.686s
sys	0m0.289s
$ time sh -c "echo '' | ./tc.venv/pex"
Python 3.9.0 (default, Oct  7 2020, 23:09:01) 
[GCC 10.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> >>> 
now exiting InteractiveConsole...

real	0m0.036s
user	0m0.026s
sys	0m0.009s
  • Warm cache:
$ time sh -c "echo '' | ./tc.venv/pex"
Python 3.9.0 (default, Oct  7 2020, 23:09:01) 
[GCC 10.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> >>> 
now exiting InteractiveConsole...

real	0m0.035s
user	0m0.027s
sys	0m0.008s
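
To put the numbers above side by side, here is a small worked calculation of cumulative wall-clock time over repeated runs. It uses only the "real" figures from the timings above and takes the cold/warm labels at face value; the function names are made up for this sketch, and the venv-creation run may have benefited from an already populated ~/.pex cache, so the comparison is indicative rather than exact.

```python
# Wall-clock ("real") seconds copied from the timings above.
OLD_COLD, OLD_WARM = 3.695, 1.366
VENV_CREATE, NEW_FIRST_RUN, NEW_WARM = 1.984, 0.036, 0.035


def old_total(runs):
    """Cumulative time for `runs` activations of the old-style PEX."""
    return OLD_COLD + OLD_WARM * (runs - 1)


def new_total(runs):
    """Cumulative time for `runs` venv runs, including venv creation."""
    return VENV_CREATE + NEW_FIRST_RUN + NEW_WARM * (runs - 1)


for runs in (1, 10, 100):
    print(
        "{:>3} runs: old {:8.3f}s  new {:6.3f}s".format(
            runs, old_total(runs), new_total(runs)
        )
    )

print("warm-run speedup: {:.0f}x".format(OLD_WARM / NEW_WARM))
```

On these figures the venv path is already ahead on the first run (about 2.0s versus about 3.7s), and each warm run is roughly 39 times faster.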

Author

stuhood commented Dec 8, 2020

Really, really awesome stuff.

jsirois added a commit to jsirois/pex that referenced this issue Dec 11, 2020
This fixes binary canonicalization to handle virtual environments
created with virtualenv instead of pyvenv. It also adds support for
resolving the base interpreter used to build a virtual environment.

The ability to resolve a virtual environment interpreter will be used to
fix pex-tool#1031 where virtual environments created with
`--system-site-packages` leak those packages through as regular sys.path
entries otherwise undetectable by PEX.

Work towards pex-tool#962 and pex-tool#1115.
This was referenced Dec 11, 2020
jsirois added a commit that referenced this issue Dec 11, 2020
This fixes binary canonicalization to handle virtual environments
created with virtualenv instead of pyvenv. It also adds support for
resolving the base interpreter used to build a virtual environment.

The ability to resolve a virtual environment interpreter will be used to
fix #1031 where virtual environments created with
`--system-site-packages` leak those packages through as regular sys.path
entries otherwise undetectable by PEX.

Work towards #962 and #1115.
jsirois added a commit that referenced this issue Dec 14, 2020
Add a `venv` tool to create a virtual environment from a PEX file. The
virtual environment is seeded with just the PEX user code and
distributions applicable to the selected interpreter for the local
machine. The virtual environment does not have Pip installed by default
although that can be requested with `--pip`.

The virtual environment comes with a `__main__.py` at the root of the 
venv to emulate a loose pex that can be run with `python venv.dir` just
like a loose pex. This entry point supports all the behavior of the
original PEX file not related to interpreter selection, namely support
for PEX_SCRIPT, PEX_MODULE, PEX_INTERPRETER and PEX_EXTRA_SYS_PATH.

A sibling `pex` script is linked to `__main__.py` to provide the
maximum performance entrypoint that always avoids interpreter
re-execing and thus yields equivalent performance to a pure virtual
environment.

Work towards #962 and #1115.
This was referenced Dec 14, 2020
jsirois added a commit that referenced this issue Dec 24, 2020
The new --venv execution mode builds a PEX file that includes pex.tools
and extracts itself into a venv under PEX_ROOT upon 1st execution or any
execution that might select a different interpreter than the default.

In order to speed up the local build and execute case, --seed mode is
added to seed the PEX_ROOT caches that will be used at runtime. This is
important for --venv mode since venv seeding depends on the selected
interpreter and one is already selected during the PEX file build
process.

Fixes #962
Fixes #1097
Fixes #1115
Member

jsirois commented Apr 13, 2022

There was still overhead left in the PEX zip python bootstrap code running just enough to check if its venv was already present in the PEX_ROOT before re-exec'ing into it in the warm case. Pants worked around this with a fairly elaborate set of rules / shim scripts. The investigation in #1540 (comment) removes that overhead (collapses it from ~50ms to ~1ms) and will remove the need for all the Pants internal hacks.

jsirois added a commit that referenced this issue Apr 14, 2022
Allow users to choose `sh` as the boot mechanism for their PEXes. Not
only is `/bin/sh` probably more widely available than any given Python
shebang, but it's also much faster.

Relates to #1115 and #1540