Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lazy import allows wheel to execute code on install. #13079

Closed
1 task done
calebbrown opened this issue Nov 14, 2024 · 11 comments · Fixed by #13085
Closed
1 task done

Lazy import allows wheel to execute code on install. #13079

calebbrown opened this issue Nov 14, 2024 · 11 comments · Fixed by #13085
Labels
type: bug A confirmed bug or unintended behavior type: security Has potential security implications
Milestone

Comments

@calebbrown
Copy link
Contributor

calebbrown commented Nov 14, 2024

Description

Versions of pip since 24.1b1 allow someone to run arbitrary code after a specially crafted bdist whl file is installed.

When installing wheel files pip does not constrain the directories the wheel contents are written into, except for checks that ensure traversal is only within the destination directories (e.g, purelib, platlib, data, etc) (see #4625)

This means a wheel is able to place files into existing modules that belong to other packages, such as pip, setuptools, etc.

If the installer lazily imports a module after the wheel is installed it is possible for the wheel to overwrite the module with its own code, which is then imported unintentionally by the installer.

For pip, this has been true since 24.1b1 when a change was introduced that dynamically loads the pip._internal.self_outdated_check module after running a command to check if pip needs upgrading.

Because this module is loaded after a package has been installed, a wheel can overwrite {purelib}/pip/_internal/self_outdated_check.py and have the code within it automatically executed when pip install {wheel} is run.

Expected behavior

This behavior is surprising. My understanding is that most Python users expect wheels can't run code during installation.

For example, the recent blog post on command jacking demonstrates this expectation:

Python wheels (.whl files) have become increasingly prevalent due to their performance benefits in package installation. However, they present a unique challenge for attackers

While both .tar.gz and .whl files may contain a setup.py file, .whl files don’t execute setup.py during installation. This characteristic has traditionally made it more difficult for attackers to achieve arbitrary code execution during the installation process when using .whl files.

That said, the wheel spec says nothing about security, or avoiding on-install code execution.

pip version

24.1b1

Python version

v3.11 later

OS

any

How to Reproduce

  1. Download wheelofdespair-0.0.1-py3-none-any.zip
  2. mv wheelofdespair-0.0.1-py3-none-any.zip wheelofdespair-0.0.1-py3-none-any.whl
  3. python3 -m venv env
  4. . env/bin/activate
  5. pip install --upgrade pip
  6. pip install wheelofdespair-0.0.1-py3-none-any.whl

Output

Collecting wheelofdespair
  Downloading wheelofdespair-0.0.1-py3-none-any.whl.metadata (201 bytes)
Downloading wheelofdespair-0.0.1-py3-none-any.whl (1.5 kB)
Installing collected packages: wheelofdespair
Successfully installed wheelofdespair-0.0.1
PoC: Wheel-of-Despair code execution.

Code of Conduct

@calebbrown calebbrown added S: needs triage Issues/PRs that need to be triaged type: bug A confirmed bug or unintended behavior labels Nov 14, 2024
@calebbrown
Copy link
Contributor Author

Just a couple more additions:

  • this behavior is related to pip installing into the same location that pip is running from

  • this may have security implications based on the usage of pip by users. For example, pip install --only-binary :all: could be used in a trusted context, before using the installed packages in an untrusted context (e.g. different stages in a build pipeline).

@calebbrown calebbrown changed the title Lazy import allows malicious wheel to execute code on install. Lazy import allows wheel to execute code on install. Nov 14, 2024
@ichard26 ichard26 added type: security Has potential security implications and removed S: needs triage Issues/PRs that need to be triaged labels Nov 14, 2024
@ichard26
Copy link
Member

I'll note that there are other ways to compromise pip. A malicious wheel could replace a key file used by pip, which is then picked up on the next invocation. Or they could replace the pip script on PATH. Etc.

But yeah, this does make it easier to achieve arbitrary code execution as it only requires one invocation. We already eagerly import the self-check module when upgrading pip (to avoid crashes). It would be reasonable to always import the module eagerly in the install command module.

if modifying_pip:
# Eagerly import this module to avoid crashes. Otherwise, this
# module would be imported *after* pip was replaced, resulting in
# crashes if the new self_outdated_check module was incompatible
# with the rest of pip that's already imported.
import pip._internal.self_outdated_check # noqa: F401

Feel free to send a PR. Thanks for investigating and letting us know!

P.S. I haven't looked at this in detail, but I suspect there are other lazy imports in the codebase. Not sure if they're suspectible to ACE or not.

@calebbrown
Copy link
Contributor Author

Thanks @ichard26 for the quick triage.

Looking at strace during pip install, the only other import I can see is pip._internal.utils.entrypoints but that appears to be imported through pip._internal.self_outdated_check.

I'll create a PR for this, but would you still like to keep the lazy loading except for install (i.e. remove the if modifying_pip condition but keep the import where it is), or would you prefer to make it non-lazy globally and import at the top of pip._internal.cli.index_command?

@ichard26
Copy link
Member

ichard26 commented Nov 14, 2024

The import was made lazy in order to avoid importing the entire network and index (HTML) parsing stack. This improves start-up time for the commands that don't need these components. For example, pip list is an index command, but usually does not access the network at all and thus should not perform a self-check or import the machinery needed for the self-check. The tricky part is that a command like pip list --outdated does require the network and can perform a self-check. This makes an eager import at the top of cli.index_command unacceptable.

(i.e. remove the if modifying_pip condition but keep the import where it is)

It'd probably be more robust to simply import the self-check at the top of commands.install.

@di
Copy link
Member

di commented Nov 14, 2024

Would definitely be great to fix this if possible, but I'm curious about setting a precedent here: is this behavior pip would be willing to guarantee even if the wheel spec does not specifically address it? Or is this only a best-effort fix?

If the goal is to guarantee the behavior, maybe @calebbrown you would be willing to help write a test here that would prevent a future regression, and this could be documented as well?

@pfmoore
Copy link
Member

pfmoore commented Nov 14, 2024

I don't think we'd want to guarantee this.

The fact that a wheel can install files for an arbitrary import package is a feature, not a bug1 - pillow installs PIL, setuptools installs pkg_resources, etc. The fact that pip allows a wheel to install files that overwrite those of an existing package is a known issue, and #4625 is tracking this. As you'll notice if you read that issue, it's not a trivial problem to fix. The fact that "lazy" imports2 are affected if you alter the contents of sys.path while the program is running is a feature of Python's import system.

So while I'd be fine with a change that removes this specific issue, and as a result reduces the risk of problems, I don't think it's something we should try to guarantee. Users need to understand that when they install a wheel, it can affect the behaviour of both programs they subsequently run, and currently running programs. That isn't just pip - to give another example, if you have a service running from a Python environment and you install something new in that environment, the service can be affected. Ultimately, it is the user's responsibility to ensure that they only install trusted packages.

If someone wanted to write a section for the packaging user guide covering the trust and threat models for Python packaging, I'm sure that would be extremely useful.

Footnotes

  1. Although it's a feature that's open to abuse, and we could consider changing it, if anyone had the stomach for addressing the backward compatibility issues.

  2. They aren't technically "lazy", they just aren't done at program startup.

@di
Copy link
Member

di commented Nov 14, 2024

At the risk of getting quoted if/when this gets used by a bad actor: I would argue that we shouldn't fix things we don't plan to keep fixed. If this is just a subclass of #4625 and would be resolved there, seems like this would be considered a duplicate of that issue, even if it's a novel path to reproduce it.

@pfmoore
Copy link
Member

pfmoore commented Nov 14, 2024

I think the vector of attacking the running instance of pip is unexpected enough that we should cover ourselves against it. There's no point making things easier for attackers. I just don't think we should guarantee it, as that suggests that users don't need to worry about this. And honestly, I think that anyone who is genuinely concerned about Python security should be made more aware that it's their responsibility to do due dilligence, rather than assuming that volunteer projects are willing to cover that for them.

To that end, I'd strongly favour adding a security section to the packaging user guide, as I mentioned above. But I have neither the expertise nor the time to write such a thing myself.

@calebbrown
Copy link
Contributor Author

As an outsider, a potential solution that would solve this for pip would be to prevent any package other than pip from updating pip.

This would leave #4625 unaddressed, but protect pip.

@pfmoore
Copy link
Member

pfmoore commented Nov 14, 2024

I can come up with attack vectors that would get past this (drop a .pth file into site-packages that modifies sys.path to put an override to pip ahead of pip itself, for example). So my position is unchanged - I think it's OK to protect what we can, but we shouldn't give anyone the impression that we guarantee that no harm is possible.

@ichard26 ichard26 added this to the 25.0 milestone Nov 17, 2024
@ichard26
Copy link
Member

I'm going to tack this onto the 25.0 milestone so we don't forget to address this trivially fixable vulnerability. Larger discussions on pip security can occur later.

calebbrown added a commit to calebbrown/pip that referenced this issue Nov 18, 2024
Fixes pypa#13079.

Signed-off-by: Caleb Brown <calebbrown@google.com>
ichard26 pushed a commit that referenced this issue Dec 7, 2024
The comment was preserved as it is still relevant, but a note about
preventing arbitrary code execution was added. See #13079 for the
security bug report.

Signed-off-by: Caleb Brown <calebbrown@google.com>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 7, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
type: bug A confirmed bug or unintended behavior type: security Has potential security implications
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants