Skip to content

Commit

Permalink
Install app dependencies into a virtual environment
Browse files Browse the repository at this point in the history
App dependencies are now installed into a virtual environment (aka venv
or virtualenv) instead of into a custom user site-packages location.

This:
1. Avoids user site-packages compatibility issues with some packages
   when using relocated Python (see #253)
2. Improves parity with how dependencies will be installed when using
   Poetry in the future (since Poetry doesn't support `--user`)
3. Unblocks being able to move pip into its own layer (see #254)

This approach is possible since pip 22.3+ supports a new `--python`
/ `PIP_PYTHON` option which can be used to make pip operate against
a different environment to the one in which it is installed. This
allow us to continuing keeping pip in a separate layer to the app
dependencies (currently the Python layer, but in a later PR pip will
be moved to its own layer).

Now that app dependencies are installed into a venv, we no longer need
to make the system site-packages directory read-only to protect against
later buildpacks installing into the wrong location.

This has been split out of the Poetry PR for easier review.

See also:
- https://docs.python.org/3/library/venv.html
- https://pip.pypa.io/en/stable/cli/pip/#cmdoption-python

Closes #253.
GUS-W-16616226.
  • Loading branch information
edmorley committed Aug 30, 2024
1 parent f005cd3 commit 3c0c316
Show file tree
Hide file tree
Showing 8 changed files with 93 additions and 126 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Changed

- App dependencies are now installed into a virtual environment instead of user site-packages. ([#257](https://github.com/heroku/buildpacks-python/pull/257))

## [0.15.0] - 2024-08-07

### Changed
Expand Down
25 changes: 18 additions & 7 deletions src/errors.rs
Original file line number Diff line number Diff line change
Expand Up @@ -171,11 +171,6 @@ fn on_python_layer_error(error: PythonLayerError) {
"locating the pip wheel file bundled inside the Python 'ensurepip' module",
&io_error,
),
PythonLayerError::MakeSitePackagesReadOnly(io_error) => log_io_error(
"Unable to make site-packages directory read-only",
"modifying the permissions on Python's 'site-packages' directory",
&io_error,
),
// This error will change once the Python version is validated against a manifest.
// TODO: (W-12613425) Write the supported Python versions inline, instead of linking out to Dev Center.
// TODO: Decide how to explain to users how stacks, base images and builder images versions relate to each other.
Expand All @@ -196,6 +191,22 @@ fn on_python_layer_error(error: PythonLayerError) {

fn on_pip_dependencies_layer_error(error: PipDependenciesLayerError) {
match error {
PipDependenciesLayerError::CreateVenvCommand(error) => match error {
StreamedCommandError::Io(io_error) => log_io_error(
"Unable to create virtual environment",
"running 'python -m venv' to create a virtual environment",
&io_error,
),
StreamedCommandError::NonZeroExitStatus(exit_status) => log_error(
"Unable to create virtual environment",
formatdoc! {"
The 'python -m venv' command to create a virtual environment did
not exit successfully ({exit_status}).
See the log output above for more information.
"},
),
},
PipDependenciesLayerError::PipInstallCommand(error) => match error {
StreamedCommandError::Io(io_error) => log_io_error(
"Unable to install dependencies using pip",
Expand All @@ -207,8 +218,8 @@ fn on_pip_dependencies_layer_error(error: PipDependenciesLayerError) {
StreamedCommandError::NonZeroExitStatus(exit_status) => log_error(
"Unable to install dependencies using pip",
formatdoc! {"
The 'pip install' command to install the application's dependencies from
'requirements.txt' failed ({exit_status}).
The 'pip install -r requirements.txt' command to install the app's
dependencies failed ({exit_status}).
See the log output above for more information.
"},
Expand Down
123 changes: 44 additions & 79 deletions src/layers/pip_dependencies.rs
Original file line number Diff line number Diff line change
Expand Up @@ -6,20 +6,18 @@ use libcnb::layer::UncachedLayerDefinition;
use libcnb::layer_env::{LayerEnv, ModificationBehavior, Scope};
use libcnb::Env;
use libherokubuildpack::log::log_info;
use std::path::{Path, PathBuf};
use std::path::PathBuf;
use std::process::Command;

/// Creates a layer containing the application's Python dependencies, installed using pip.
//
// To do this we use `pip install --user` so that the dependencies are installed into the user
// `site-packages` directory in this layer (set by `PYTHONUSERBASE`), rather than the system
// `site-packages` subdirectory of the Python installation layer.
//
// Note: We can't instead use pip's `--target` option along with `PYTHONPATH`, since:
// - Directories on `PYTHONPATH` take precedence over the Python stdlib (unlike the system or
// user site-packages directories), which can cause hard to debug stdlib shadowing issues
// if one of the app's transitive dependencies is an outdated stdlib backport package.
// - `--target` has bugs, eg: <https://github.com/pypa/pip/issues/8799>
// We install into a virtual environment since:
// - We can't install into the system site-packages inside the main Python directory since
// we need the app dependencies to be in their own layer.
// - Some packages are broken with `--user` installs when using relocated Python, and
// otherwise require other workarounds. eg: https://github.com/unbit/uwsgi/issues/2525
// - PEP-405 style venvs are very lightweight and are also much more frequently
// used in the wild compared to `--user`, and therefore the better tested path.
//
// This layer is not cached, since:
// - pip is a package installer rather than a project/environment manager, and so does not
Expand All @@ -35,38 +33,59 @@ pub(crate) fn install_dependencies(
env: &mut Env,
) -> Result<PathBuf, libcnb::Error<BuildpackError>> {
let layer = context.uncached_layer(
layer_name!("dependencies"),
// The name of this layer must be alphabetically after that of the `python` layer so that
// this layer's `bin/` directory (and thus `python` symlink) is listed first in `PATH`:
// https://github.com/buildpacks/spec/blob/main/buildpack.md#layer-paths
layer_name!("venv"),
UncachedLayerDefinition {
build: true,
launch: true,
},
)?;

let layer_path = layer.path();
let layer_env = generate_layer_env(&layer_path);

log_info("Creating virtual environment");
utils::run_command_and_stream_output(
Command::new("python")
.args(["-m", "venv", "--without-pip", &layer_path.to_string_lossy()])
.env_clear()
.envs(&*env),
)
.map_err(PipDependenciesLayerError::CreateVenvCommand)?;

let mut layer_env = LayerEnv::new()
// Since pip is installed in a different layer (outside of this venv), we have to explicitly
// tell it to perform operations against this venv instead of the global Python install.
// https://pip.pypa.io/en/stable/cli/pip/#cmdoption-python
.chainable_insert(
Scope::All,
ModificationBehavior::Override,
"PIP_PYTHON",
&layer_path,
)
// For parity with the venv's `bin/activate` script:
// https://docs.python.org/3/library/venv.html#how-venvs-work
.chainable_insert(
Scope::All,
ModificationBehavior::Override,
"VIRTUAL_ENV",
&layer_path,
);
layer.write_env(&layer_env)?;
// Required to pick up the automatic PATH env var. See: https://github.com/heroku/libcnb.rs/issues/842
layer_env = layer.read_env()?;
env.clone_from(&layer_env.apply(Scope::Build, env));

log_info("Running pip install");

log_info("Running 'pip install -r requirements.txt'");
utils::run_command_and_stream_output(
Command::new("pip")
.args([
"install",
"--no-input",
"--progress-bar",
"off",
// Using `--user` rather than `PIP_USER` since the latter affects `pip list` too.
"--user",
"--requirement",
"requirements.txt",
// For VCS dependencies installed in editable mode, the repository clones must be
// kept after installation, since their directories are added to the Python path
// directly (via `.pth` files in `site-packages`). By default pip will store the
// repositories in the current working directory (the app dir), but we want them
// in the dependencies layer instead.
"--src",
&layer_path.join("src").to_string_lossy(),
])
.current_dir(&context.app_dir)
.env_clear()
Expand All @@ -77,35 +96,10 @@ pub(crate) fn install_dependencies(
Ok(layer_path)
}

fn generate_layer_env(layer_path: &Path) -> LayerEnv {
LayerEnv::new()
// We set `PATH` explicitly, since lifecycle will only add the bin directory to `PATH` if it
// exists - and we want to support the scenario of installing a debugging package with CLI at
// run-time, when none of the dependencies installed at build-time had an entrypoint script.
.chainable_insert(
Scope::All,
ModificationBehavior::Prepend,
"PATH",
layer_path.join("bin"),
)
.chainable_insert(Scope::All, ModificationBehavior::Delimiter, "PATH", ":")
// Overrides the default user base directory, used by Python to compute the path of the user
// `site-packages` directory. Setting this:
// - Makes `pip install --user` install the dependencies into the current layer rather
// than the user's home directory (which would be discarded at the end of the build).
// - Allows Python to find the installed packages at import time.
// See: https://docs.python.org/3/using/cmdline.html#envvar-PYTHONUSERBASE
.chainable_insert(
Scope::All,
ModificationBehavior::Override,
"PYTHONUSERBASE",
layer_path,
)
}

/// Errors that can occur when installing the project's dependencies into a layer using pip.
#[derive(Debug)]
pub(crate) enum PipDependenciesLayerError {
CreateVenvCommand(StreamedCommandError),
PipInstallCommand(StreamedCommandError),
}

Expand All @@ -114,32 +108,3 @@ impl From<PipDependenciesLayerError> for libcnb::Error<BuildpackError> {
Self::BuildpackError(BuildpackError::PipDependenciesLayer(error))
}
}

#[cfg(test)]
mod tests {
use super::*;

#[test]
fn pip_dependencies_layer_env() {
let mut base_env = Env::new();
base_env.insert("PATH", "/base");
base_env.insert("PYTHONUSERBASE", "this-should-be-overridden");

let layer_env = generate_layer_env(Path::new("/layer-dir"));

assert_eq!(
utils::environment_as_sorted_vector(&layer_env.apply(Scope::Build, &base_env)),
[
("PATH", "/layer-dir/bin:/base"),
("PYTHONUSERBASE", "/layer-dir"),
]
);
assert_eq!(
utils::environment_as_sorted_vector(&layer_env.apply(Scope::Launch, &base_env)),
[
("PATH", "/layer-dir/bin:/base"),
("PYTHONUSERBASE", "/layer-dir"),
]
);
}
}
13 changes: 0 additions & 13 deletions src/layers/python.rs
Original file line number Diff line number Diff line change
Expand Up @@ -11,8 +11,6 @@ use libcnb::layer_env::{LayerEnv, ModificationBehavior, Scope};
use libcnb::Env;
use libherokubuildpack::log::log_info;
use serde::{Deserialize, Serialize};
use std::fs::Permissions;
use std::os::unix::prelude::PermissionsExt;
use std::path::{Path, PathBuf};
use std::process::Command;
use std::{fs, io};
Expand Down Expand Up @@ -138,16 +136,6 @@ pub(crate) fn install_python_and_packaging_tools(
)
.map_err(PythonLayerError::BootstrapPipCommand)?;

// By default pip installs into the system site-packages directory if it is writeable by the
// current user. Whilst the buildpack's own `pip install` invocations always use `--user` to
// ensure app dependencies are installed into the user site-packages, it's possible other
// buildpacks or custom scripts may forget to do so. By making the system site-packages
// directory read-only, pip will automatically use user installs in such cases:
// https://github.com/pypa/pip/blob/24.1.2/src/pip/_internal/commands/install.py#L662-L720
let site_packages_dir = python_stdlib_dir.join("site-packages");
fs::set_permissions(site_packages_dir, Permissions::from_mode(0o555))
.map_err(PythonLayerError::MakeSitePackagesReadOnly)?;

Ok(())
}

Expand Down Expand Up @@ -369,7 +357,6 @@ pub(crate) enum PythonLayerError {
BootstrapPipCommand(StreamedCommandError),
DownloadUnpackPythonArchive(DownloadUnpackArchiveError),
LocateBundledPip(io::Error),
MakeSitePackagesReadOnly(io::Error),
PythonArchiveNotFound { python_version: PythonVersion },
}

Expand Down
2 changes: 1 addition & 1 deletion tests/fixtures/pip_editable_git_compiled/requirements.txt
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# This requirement uses a VCS URL and `-e` in order to test that:
# - Git from the stack image can be found (ie: the system PATH has been correctly propagated to pip).
# - The editable mode repository clone is saved into the dependencies layer (via the `--src` option).
# - The editable mode repository clone is saved into the dependencies layer.
#
# A C-based package is used instead of a pure Python package, in order to test that the
# Python headers can be found in the `include/pythonX.Y/` directory of the Python layer.
Expand Down
3 changes: 0 additions & 3 deletions tests/fixtures/testing_buildpack/bin/build
Original file line number Diff line number Diff line change
Expand Up @@ -5,8 +5,6 @@
# - Python's sys.path is correct.
# - The correct version of pip was installed.
# - Both the package manager and Python can find the typing-extensions package.
# - The system site-packages directory is protected against running 'pip install'
# without having passed '--user'.
# - The typing-extensions package was installed into a separate dependencies layer.

set -euo pipefail
Expand All @@ -20,5 +18,4 @@ python -c 'import pprint, sys; pprint.pp(sys.path)'
echo
pip --version
pip list
pip install --dry-run typing-extensions
python -c 'import typing_extensions; print(typing_extensions)'
1 change: 1 addition & 0 deletions tests/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,7 @@ fn default_build_config(fixture_path: impl AsRef<Path>) -> BuildConfig {
("PYTHONHOME", "/invalid"),
("PYTHONPATH", "/invalid"),
("PYTHONUSERBASE", "/invalid"),
("VIRTUAL_ENV", "/invalid"),
]);

config
Expand Down
Loading

0 comments on commit 3c0c316

Please sign in to comment.