Switch to using a virtual environment for app dependencies #253
Comments
One thing I forgot to add: switching to venvs is now only possible because pip 22.3 added support for a new `--python` / `PIP_PYTHON` option.
App dependencies are now installed into a virtual environment (aka venv or virtualenv) instead of into a custom user site-packages location. This:

1. Avoids user site-packages compatibility issues with some packages when using relocated Python (see #253)
2. Improves parity with how dependencies will be installed when using Poetry in the future (since Poetry doesn't support `--user`)
3. Unblocks being able to move pip into its own layer (see #254)

This approach is possible since pip 22.3+ supports a new `--python` / `PIP_PYTHON` option, which can be used to make pip operate against a different environment to the one in which it is installed. This allows us to continue keeping pip in a separate layer to the app dependencies (currently the Python layer, but in a later PR pip will be moved to its own layer).

Now that app dependencies are installed into a venv, we no longer need to make the system site-packages directory read-only to protect against later buildpacks installing into the wrong location.

This has been split out of the Poetry PR for easier review.

See also:
- https://docs.python.org/3/library/venv.html
- https://pip.pypa.io/en/stable/cli/pip/#cmdoption-python

Closes #253.
GUS-W-16616226.
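As a rough illustration of the `--python` mechanism described above, the following sketch creates a venv that contains no pip of its own and then builds a pip command that targets it from outside. The function names and directory layout are hypothetical and are not taken from the buildpack (which is not written in Python); only the `--python` option and the stdlib `venv` module are real.

```python
import os
import sys
import venv
from pathlib import Path


def create_app_venv(venv_dir: Path) -> Path:
    """Create a venv without its own pip and return the path to its interpreter."""
    # with_pip=False keeps pip out of the venv, so pip can live elsewhere
    # (for example in a separate layer).
    venv.EnvBuilder(with_pip=False).create(venv_dir)
    if os.name == "nt":
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"


def pip_install_command(venv_python: Path, requirements: Path) -> list[str]:
    """Build a pip invocation that installs into the venv, not pip's own environment."""
    # `--python` is a general pip option (pip 22.3+), so it goes before the subcommand.
    return [
        sys.executable, "-m", "pip",
        "--python", str(venv_python),
        "install", "-r", str(requirements),
    ]
```

The command list could then be passed to `subprocess.run`. Setting the `PIP_PYTHON` environment variable instead of passing `--python` should have the same effect.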
The Python package manager Poetry is now supported for installing app dependencies: https://python-poetry.org

To use Poetry, apps must have a `poetry.lock` lockfile, which can be created by running `poetry lock` locally, after adding Poetry config to `pyproject.toml` (which can be done either manually or by using `poetry init`). Apps must only have one package manager file (either `requirements.txt` or `poetry.lock`, but not both) otherwise the buildpack will abort the build with an error (which will help prevent some of the types of support tickets we see in the classic buildpack with users unknowingly mixing and matching pip + Pipenv).

Poetry is installed into a build-only layer (to reduce the final app image size), so is not available at run-time. The app dependencies are installed into a virtual environment (the same as for pip after #257, for the reasons described in #253), which is on `PATH` so does not need explicit activation when using the app image. As such, use of `poetry run` or `poetry shell` is not required at run-time to use dependencies in the environment.

When using Poetry, pip is not installed (possible thanks to #258), since Poetry includes its own internal vendored copy that it will use instead (for the small number of Poetry operations for which it still calls out to pip, such as package uninstalls).

Both the Poetry and app dependencies layers are cached, however, the Poetry download/wheel cache is not cached, since using it is slower than caching the dependencies layer (for more details see the comments on `poetry_dependencies::install_dependencies`).

The `poetry install --sync` command is run using `--only main` so as to only install the main `[tool.poetry.dependencies]` dependencies group from `pyproject.toml`, and not any of the app's other dependency groups (such as test/dev groups, eg `[tool.poetry.group.test.dependencies]`).

I've marked this `semver: major` since in the (probably unlikely) event there are any early-adopter projects using this CNB that have both a `requirements.txt` and `poetry.lock` then this change will cause them to error (until one of the files is deleted).

Relevant Poetry docs:
- https://python-poetry.org/docs/cli/#install
- https://python-poetry.org/docs/configuration/
- https://python-poetry.org/docs/managing-dependencies/#dependency-groups

Work that will be handled later:
- Support for selecting Python version via `tool.poetry.dependencies.python`: #260
- Build output and error messages polish/CX review (this will be performed when switching the buildpack to the new logging style).
- More detailed user-facing docs: #11

Closes #7.
GUS-W-9607867.
GUS-W-9608286.
GUS-W-9608295.
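The "only one package manager file" rule above can be sketched roughly as follows. This is an illustrative Python approximation, not the buildpack's actual code: the `PackageManager` enum, the `PackageManagerError` exception and `detect_package_manager` are hypothetical names, and raising an error when no file is found is an assumption about behaviour the description does not cover. The Poetry command is the one quoted in the description.

```python
from enum import Enum
from pathlib import Path


class PackageManager(Enum):
    """Supported package managers, keyed by the file that selects them."""
    PIP = "requirements.txt"
    POETRY = "poetry.lock"


class PackageManagerError(Exception):
    """Raised when zero or multiple package manager files are present."""


# The install command described above: sync the venv, main dependency group only.
POETRY_INSTALL_COMMAND = ["poetry", "install", "--sync", "--only", "main"]


def detect_package_manager(app_dir: Path) -> PackageManager:
    """Return the single package manager whose file exists in app_dir."""
    found = [pm for pm in PackageManager if (app_dir / pm.value).is_file()]
    if not found:
        # Assumption: the sketch treats a missing file as an error.
        raise PackageManagerError("No package manager file found.")
    if len(found) > 1:
        names = ", ".join(pm.value for pm in found)
        raise PackageManagerError(f"Multiple package manager files found: {names}")
    return found[0]
```

Failing early on ambiguity, rather than picking one file silently, is what prevents the mixed pip + Pipenv confusion mentioned above.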
When using a `Dockerfile`, the file content contributed by different steps in the build is split into different layers, which are then combined via use of an overlay filesystem. In this model, it's possible for multiple steps of the build to write to the same directory locations - albeit at the cost of changes in earlier layers triggering cache invalidation of later layers.

With CNBs, the file content contributed by different steps in the build (whether that be from separate buildpacks, or steps within the same buildpack) is kept separate via the concept of CNB layers:
https://buildpacks.io/docs/for-buildpack-authors/concepts/layer/
This provides several advantages (finer grained caching; easier multi-language images etc). However, to take full advantage of them, we have to write the build content to separate layer directories.
For Python, this means we cannot simply install everything into the system site-packages directory (which lives inside the Python installation directory).
Until now, the way we've handled this is by using `pip install --user` combined with `PYTHONUSERBASE` (which changes the user site-packages location from its default of the user home directory, to the location of the dependencies layer).

However, this has a number of downsides:

- Some packages have compatibility issues with `--user` installs when using relocated Python, and otherwise require other workarounds (such as setting `PYTHONHOME`). eg: "Incorrect stdlib path for relocated Python installs, resulting in `ModuleNotFoundError: No module named 'encodings'`" (unbit/uwsgi#2525)
- Some other package managers don't support `--user` installs (such as Poetry or uv), meaning when we add support for them, we would have to use a different approach for them - which would then mean app dependency environments are set up differently depending on what package manager an app uses, which doesn't seem ideal.

Given that PEP-405 style virtual environments (venvs) are:

- more widely used than `--user` installs (and therefore the better tested path)

...then it makes more sense to use a venv for the app dependencies instead of a user install.
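For reference, the `PYTHONUSERBASE` mechanism that the previous approach relied on can be observed with a small sketch. The helper name `user_site_for` and the example directory are hypothetical; `site.getusersitepackages()` and the `PYTHONUSERBASE` environment variable are standard Python.

```python
import os
import subprocess
import sys


def user_site_for(user_base: str) -> str:
    """Return the user site-packages path a fresh interpreter reports
    when PYTHONUSERBASE points at user_base."""
    env = dict(os.environ, PYTHONUSERBASE=user_base)
    result = subprocess.run(
        [sys.executable, "-c", "import site; print(site.getusersitepackages())"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Hypothetical layer directory, used only for illustration.
    print(user_site_for("/layers/python/dependencies"))
```

The reported path sits underneath the given base rather than under the user's home directory, which is how `pip install --user` could be redirected into a layer directory.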
Note: We can't use `PYTHONPATH` instead of a user site-packages install, since any directories specified via `PYTHONPATH` are given a higher precedence in Python's `sys.path` than the Python stdlib (unlike system and user site-packages, which are added to `sys.path` after the Python stdlib). This can then cause hard to debug issues if apps use outdated backport libraries (which can often happen unintentionally via broken/suboptimal packages in their transitive dependency tree).

GUS-W-16616226.
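The `sys.path` ordering described in the note about `PYTHONPATH` can be checked with a sketch like the one below. It starts a fresh interpreter with `PYTHONPATH` set and reports where that directory and the stdlib directory appear in `sys.path`. The helper name `sys_path_positions` is hypothetical, and the sketch assumes a conventional CPython install where the stdlib directory is itself an entry on `sys.path`.

```python
import json
import os
import subprocess
import sys

# Code run in the child interpreter: report the sys.path index of the
# PYTHONPATH directory (passed as argv[1]) and of the stdlib directory.
_CHILD = """
import json, os, sys, sysconfig
paths = [os.path.realpath(p) for p in sys.path]
extra = os.path.realpath(sys.argv[1])
stdlib = os.path.realpath(sysconfig.get_path("stdlib"))
print(json.dumps({"pythonpath": paths.index(extra), "stdlib": paths.index(stdlib)}))
"""


def sys_path_positions(extra_dir: str) -> dict:
    """Return sys.path indices of extra_dir (set via PYTHONPATH) and the stdlib."""
    env = dict(os.environ, PYTHONPATH=extra_dir)
    result = subprocess.run(
        [sys.executable, "-c", _CHILD, extra_dir],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

A lower index means higher import precedence, so a module placed in the `PYTHONPATH` directory with the same name as a stdlib module would be imported in its place.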