Enable model custom dependency installation using virtual environment #2910

namannandan · 2024-01-26T23:31:20Z

Description

The current implementation to handle installation of custom dependencies using a requirements.txt file installs packages to a target directory as follows:

serve/frontend/server/src/main/java/org/pytorch/serve/wlm/ModelManager.java

Lines 226 to 234 in a07b7d9

    
    commandParts.add(pythonRuntime);
    commandParts.add("-m");
    commandParts.add("pip");
    commandParts.add("install");
    commandParts.add("-U");
    commandParts.add("-t");
    commandParts.add(dependencyPath.getAbsolutePath());
    commandParts.add("-r");
    commandParts.add(requirementsFilePath.toString());

And includes the target directory in the PYTHONPATH environment variable:

serve/frontend/server/src/main/java/org/pytorch/serve/util/messages/EnvironmentUtils.java

Lines 52 to 58 in a07b7d9

    
    File dependencyPath = new File(modelPath);
    if (Files.isSymbolicLink(dependencyPath.toPath())) {
        pythonPath
                .append(dependencyPath.getParentFile().getAbsolutePath())
                .append(File.pathSeparatorChar);
    }
}

The outcome of this approach is:

The custom dependencies specified in requirements.txt, along with a copy of all their dependencies, are installed to the target directory regardless of which packages are already available in the base Python environment (site-packages).
The most recent supported versions of those dependencies are installed in the target directory even when a supported version is already present in the base Python environment (site-packages).

The above approach can be improved for the following reasons:

Supported dependencies already present in the base Python environment can be reused and do not need to be downloaded and reinstalled.
Containerized environments often ship with specific package versions, for example a torch binary built for the combination of CUDA version and GPU driver version on the target platform. In that case it is desirable to use the package already available in the base Python environment rather than upgrade to the most recent version, which may break compatibility with existing packages.

For example:
In the pytorch inference deep learning container 763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-inference:2.1.0-gpu-py310 from https://github.com/aws/deep-learning-containers/blob/master/available_images.md

$ pip freeze | grep torch
sagemaker-pytorch-inference==2.0.21
torch @ https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.1.0/cuda11.8.0/torch-2.1.0%2Bcu118-cp310-cp310-linux_x86_64.whl#sha256=93960a77d4b72fb6d32036912d8193a3a159b1b38f342c4e2b5ac82d279eff5a
torch-model-archiver @ file:///serve/model-archiver
torch-workflow-archiver @ file:///serve/workflow-archiver
torchaudio @ https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.1.0/cuda11.8.0/torchaudio-2.1.0%2Bcu118-cp310-cp310-linux_x86_64.whl#sha256=56b8ca0bd6b72edb55fbd06114d94e9d3f3c4daf8d456a0dc929d072105d75ee
torchdata @ https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.1.0/cuda11.8.0/torchdata-0.7.0%2B7c7597b-cp310-cp310-linux_x86_64.whl#sha256=f33afa5a8f8f6979fb7f35cd53e61da1da81a6be852985c2bf6cb4c9bb7fed94
torchserve @ file:///serve
torchtext @ https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.1.0/cuda11.8.0/torchtext-0.16.0%2Bcu118-cp310-cp310-linux_x86_64.whl#sha256=2fe28a2da3a194e553eeda3683955f4a0821ba732fe10bcb5895c3293525807f
torchvision @ https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.1.0/cuda11.8.0/torchvision-0.16.0%2Bcu118-cp310-cp310-linux_x86_64.whl#sha256=bf8bfb5351e8d02591bf8833ca6df7621f060d4cb238d6c32511579e53507acb

$ cat test_req.txt 
segment-anything-py==1.0
opencv-python-headless==4.7.0.68
matplotlib==3.6.3

$ mkdir /tmp/deps
$ pip install -U --upgrade-strategy only-if-needed -t /tmp/deps -r test_req.txt

$ ls /tmp/deps
Jinja2-3.1.3.dist-info              idna                                       nvidia_cuda_nvrtc_cu12-12.1.105.dist-info    segment_anything
MarkupSafe-2.1.4.dist-info          idna-3.6.dist-info                         nvidia_cuda_runtime_cu12-12.1.105.dist-info  segment_anything_py-1.0.dist-info
PIL                                 isympy.py                                  nvidia_cudnn_cu12-8.9.2.26.dist-info         share
__pycache__                         jinja2                                     nvidia_cufft_cu12-11.0.2.54.dist-info        six-1.16.0.dist-info
bin                                 kiwisolver                                 nvidia_curand_cu12-10.3.2.106.dist-info      six.py
certifi                             kiwisolver-1.4.5.dist-info                 nvidia_cusolver_cu12-11.4.5.107.dist-info    sympy
certifi-2023.11.17.dist-info        markupsafe                                 nvidia_cusparse_cu12-12.1.0.106.dist-info    sympy-1.12.dist-info
charset_normalizer                  matplotlib                                 nvidia_nccl_cu12-2.18.1.dist-info            torch
charset_normalizer-3.3.2.dist-info  matplotlib-3.6.3-py3.10-nspkg.pth          nvidia_nvjitlink_cu12-12.3.101.dist-info     torch-2.1.2.dist-info
contourpy                           matplotlib-3.6.3.dist-info                 nvidia_nvtx_cu12-12.1.105.dist-info          torchgen
contourpy-1.2.0.dist-info           mpl_toolkits                               opencv_python_headless-4.7.0.68.dist-info    torchvision
cv2                                 mpmath                                     opencv_python_headless.libs                  torchvision-0.16.2.dist-info
cycler                              mpmath-1.3.0.dist-info                     packaging                                    torchvision.libs
cycler-0.12.1.dist-info             networkx                                   packaging-23.2.dist-info                     triton
dateutil                            networkx-3.2.1.dist-info                   pillow-10.2.0.dist-info                      triton-2.1.0.dist-info
filelock                            numpy                                      pillow.libs                                  typing_extensions-4.9.0.dist-info
filelock-3.13.1.dist-info           numpy-1.26.3.dist-info                     pylab.py                                     typing_extensions.py
fontTools                           numpy.libs                                 pyparsing                                    urllib3
fonttools-4.47.2.dist-info          nvfuser                                    pyparsing-3.1.1.dist-info                    urllib3-2.1.0.dist-info
fsspec                              nvidia                                     python_dateutil-2.8.2.dist-info
fsspec-2023.12.2.dist-info          nvidia_cublas_cu12-12.1.3.1.dist-info      requests
functorch                           nvidia_cuda_cupti_cu12-12.1.105.dist-info  requests-2.31.0.dist-info

Although torch-2.1.0+cu118 is already available in the base Python environment and satisfies the torch requirement of segment-anything-py==1.0, torch-2.1.2+cu121 is installed, which may break compatibility with the GPU driver on the host. This is expected behavior: when pip installs packages to a target directory, it installs the requested packages and the latest supported versions of all their dependencies. Further, since the target directory is added to PYTHONPATH, torch-2.1.2+cu121 masks torch-2.1.0+cu118.
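The masking follows from Python resolving an import by walking its search path in order and taking the first match, with PYTHONPATH entries searched before site-packages. Below is a minimal Java sketch of that first-match rule. The class and method names are hypothetical, the directory contents are hard-coded from the example above, and this is a toy model rather than Python's actual import machinery.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Toy model of first-match package lookup along a search path (illustrative only). */
final class SearchPathSketch {
    private SearchPathSketch() {}

    /**
     * Returns the version of pkg found in the first directory on searchPath that provides it.
     * contents maps a directory name to the packages (name -> version) it holds.
     */
    static Optional<String> resolve(
            List<String> searchPath, Map<String, Map<String, String>> contents, String pkg) {
        for (String dir : searchPath) {
            Map<String, String> packages = contents.getOrDefault(dir, Map.of());
            if (packages.containsKey(pkg)) {
                return Optional.of(packages.get(pkg));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> contents = new LinkedHashMap<>();
        contents.put("/tmp/deps", Map.of("torch", "2.1.2+cu121"));
        contents.put("site-packages", Map.of("torch", "2.1.0+cu118"));

        // The target directory comes first because it was added to PYTHONPATH.
        List<String> searchPath = List.of("/tmp/deps", "site-packages");
        System.out.println(resolve(searchPath, contents, "torch").orElse("not found"));
    }
}
```

With the target directory first on the path, the lookup returns the cu121 build. Removing that entry lets the base environment's cu118 build win again.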

On the other hand, when using a virtual environment with access to system site packages to install custom dependencies, we see the following:

# original torch packages
root@b323f9ef66ce:/# pip freeze | grep torch
sagemaker-pytorch-inference==2.0.21
torch @ https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.1.0/cuda11.8.0/torch-2.1.0%2Bcu118-cp310-cp310-linux_x86_64.whl#sha256=93960a77d4b72fb6d32036912d8193a3a159b1b38f342c4e2b5ac82d279eff5a
torch-model-archiver @ file:///serve/model-archiver
torch-workflow-archiver @ file:///serve/workflow-archiver
torchaudio @ https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.1.0/cuda11.8.0/torchaudio-2.1.0%2Bcu118-cp310-cp310-linux_x86_64.whl#sha256=56b8ca0bd6b72edb55fbd06114d94e9d3f3c4daf8d456a0dc929d072105d75ee
torchdata @ https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.1.0/cuda11.8.0/torchdata-0.7.0%2B7c7597b-cp310-cp310-linux_x86_64.whl#sha256=f33afa5a8f8f6979fb7f35cd53e61da1da81a6be852985c2bf6cb4c9bb7fed94
torchserve @ file:///serve
torchtext @ https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.1.0/cuda11.8.0/torchtext-0.16.0%2Bcu118-cp310-cp310-linux_x86_64.whl#sha256=2fe28a2da3a194e553eeda3683955f4a0821ba732fe10bcb5895c3293525807f
torchvision @ https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.1.0/cuda11.8.0/torchvision-0.16.0%2Bcu118-cp310-cp310-linux_x86_64.whl#sha256=bf8bfb5351e8d02591bf8833ca6df7621f060d4cb238d6c32511579e53507acb

# create virtual env
root@b323f9ef66ce:/# python -m venv --system-site-packages ./venvs/test
root@b323f9ef66ce:/# source ./venvs/test/bin/activate

(test) root@b323f9ef66ce:/# cat test_req.txt 
segment-anything-py==1.0
opencv-python-headless==4.7.0.68
matplotlib==3.6.3

(test) root@b323f9ef66ce:/# python -m pip install --upgrade --upgrade-strategy only-if-needed -r test_req.txt
Collecting segment-anything-py==1.0
  Downloading segment_anything_py-1.0-py3-none-any.whl (40 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 40.2/40.2 kB 5.1 MB/s eta 0:00:00
Collecting opencv-python-headless==4.7.0.68
  Downloading opencv_python_headless-4.7.0.68-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (49.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 49.2/49.2 MB 41.7 MB/s eta 0:00:00
Collecting matplotlib==3.6.3
  Downloading matplotlib-3.6.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.8/11.8 MB 103.3 MB/s eta 0:00:00
Requirement already satisfied: torch>=1.7 in /opt/conda/lib/python3.10/site-packages (from segment-anything-py==1.0->-r test_req.txt (line 1)) (2.1.0+cu118)
Requirement already satisfied: torchvision>=0.8 in /opt/conda/lib/python3.10/site-packages (from segment-anything-py==1.0->-r test_req.txt (line 1)) (0.16.0+cu118)
Requirement already satisfied: numpy>=1.17.0 in /opt/conda/lib/python3.10/site-packages (from opencv-python-headless==4.7.0.68->-r test_req.txt (line 2)) (1.24.4)
Requirement already satisfied: kiwisolver>=1.0.1 in /opt/conda/lib/python3.10/site-packages (from matplotlib==3.6.3->-r test_req.txt (line 3)) (1.4.5)
Requirement already satisfied: contourpy>=1.0.1 in /opt/conda/lib/python3.10/site-packages (from matplotlib==3.6.3->-r test_req.txt (line 3)) (1.2.0)
Requirement already satisfied: fonttools>=4.22.0 in /opt/conda/lib/python3.10/site-packages (from matplotlib==3.6.3->-r test_req.txt (line 3)) (4.47.0)
Requirement already satisfied: python-dateutil>=2.7 in /opt/conda/lib/python3.10/site-packages (from matplotlib==3.6.3->-r test_req.txt (line 3)) (2.8.2)
Requirement already satisfied: pyparsing>=2.2.1 in /opt/conda/lib/python3.10/site-packages (from matplotlib==3.6.3->-r test_req.txt (line 3)) (3.1.1)
Requirement already satisfied: cycler>=0.10 in /opt/conda/lib/python3.10/site-packages (from matplotlib==3.6.3->-r test_req.txt (line 3)) (0.12.1)
Requirement already satisfied: packaging>=20.0 in /opt/conda/lib/python3.10/site-packages (from matplotlib==3.6.3->-r test_req.txt (line 3)) (23.2)
Requirement already satisfied: pillow>=6.2.0 in /opt/conda/lib/python3.10/site-packages (from matplotlib==3.6.3->-r test_req.txt (line 3)) (10.2.0)
Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.10/site-packages (from python-dateutil>=2.7->matplotlib==3.6.3->-r test_req.txt (line 3)) (1.16.0)
Requirement already satisfied: filelock in /opt/conda/lib/python3.10/site-packages (from torch>=1.7->segment-anything-py==1.0->-r test_req.txt (line 1)) (3.13.1)
Requirement already satisfied: jinja2 in /opt/conda/lib/python3.10/site-packages (from torch>=1.7->segment-anything-py==1.0->-r test_req.txt (line 1)) (3.1.2)
Requirement already satisfied: sympy in /opt/conda/lib/python3.10/site-packages (from torch>=1.7->segment-anything-py==1.0->-r test_req.txt (line 1)) (1.12)
Requirement already satisfied: networkx in /opt/conda/lib/python3.10/site-packages (from torch>=1.7->segment-anything-py==1.0->-r test_req.txt (line 1)) (3.2.1)
Requirement already satisfied: fsspec in /opt/conda/lib/python3.10/site-packages (from torch>=1.7->segment-anything-py==1.0->-r test_req.txt (line 1)) (2023.12.2)
Requirement already satisfied: typing-extensions in /opt/conda/lib/python3.10/site-packages (from torch>=1.7->segment-anything-py==1.0->-r test_req.txt (line 1)) (4.9.0)
Requirement already satisfied: requests in /opt/conda/lib/python3.10/site-packages (from torchvision>=0.8->segment-anything-py==1.0->-r test_req.txt (line 1)) (2.31.0)
Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/lib/python3.10/site-packages (from jinja2->torch>=1.7->segment-anything-py==1.0->-r test_req.txt (line 1)) (2.1.3)
Requirement already satisfied: charset-normalizer<4,>=2 in /opt/conda/lib/python3.10/site-packages (from requests->torchvision>=0.8->segment-anything-py==1.0->-r test_req.txt (line 1)) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /opt/conda/lib/python3.10/site-packages (from requests->torchvision>=0.8->segment-anything-py==1.0->-r test_req.txt (line 1)) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in /opt/conda/lib/python3.10/site-packages (from requests->torchvision>=0.8->segment-anything-py==1.0->-r test_req.txt (line 1)) (1.26.18)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.10/site-packages (from requests->torchvision>=0.8->segment-anything-py==1.0->-r test_req.txt (line 1)) (2023.11.17)
Requirement already satisfied: mpmath>=0.19 in /opt/conda/lib/python3.10/site-packages (from sympy->torch>=1.7->segment-anything-py==1.0->-r test_req.txt (line 1)) (1.3.0)
Installing collected packages: opencv-python-headless, matplotlib, segment-anything-py
  Attempting uninstall: matplotlib
    Found existing installation: matplotlib 3.8.2
    Not uninstalling matplotlib at /opt/conda/lib/python3.10/site-packages, outside environment /venvs/test
    Can't uninstall 'matplotlib'. No files were found to uninstall.
Successfully installed matplotlib-3.6.3 opencv-python-headless-4.7.0.68 segment-anything-py-1.0

[notice] A new release of pip available: 22.3.1 -> 23.3.2
[notice] To update, run: pip install --upgrade pip

# Same torch packages are visible in venv
(test) root@b323f9ef66ce:/# pip freeze | grep torch
sagemaker-pytorch-inference==2.0.21
torch @ https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.1.0/cuda11.8.0/torch-2.1.0%2Bcu118-cp310-cp310-linux_x86_64.whl#sha256=93960a77d4b72fb6d32036912d8193a3a159b1b38f342c4e2b5ac82d279eff5a
torch-model-archiver @ file:///serve/model-archiver
torch-workflow-archiver @ file:///serve/workflow-archiver
torchaudio @ https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.1.0/cuda11.8.0/torchaudio-2.1.0%2Bcu118-cp310-cp310-linux_x86_64.whl#sha256=56b8ca0bd6b72edb55fbd06114d94e9d3f3c4daf8d456a0dc929d072105d75ee
torchdata @ https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.1.0/cuda11.8.0/torchdata-0.7.0%2B7c7597b-cp310-cp310-linux_x86_64.whl#sha256=f33afa5a8f8f6979fb7f35cd53e61da1da81a6be852985c2bf6cb4c9bb7fed94
torchserve @ file:///serve
torchtext @ https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.1.0/cuda11.8.0/torchtext-0.16.0%2Bcu118-cp310-cp310-linux_x86_64.whl#sha256=2fe28a2da3a194e553eeda3683955f4a0821ba732fe10bcb5895c3293525807f
torchvision @ https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.1.0/cuda11.8.0/torchvision-0.16.0%2Bcu118-cp310-cp310-linux_x86_64.whl#sha256=bf8bfb5351e8d02591bf8833ca6df7621f060d4cb238d6c32511579e53507acb

We can see that the torch binary installed in the base python environment already satisfies the dependency of segment-anything-py==1.0 on torch and is not reinstalled:

Requirement already satisfied: torch>=1.7 in /opt/conda/lib/python3.10/site-packages (from segment-anything-py==1.0->-r test_req.txt (line 1)) (2.1.0+cu118)
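For reference, the manual venv creation step above could be expressed as a command list in Java, in the same style as the existing ModelManager code. This is only a sketch: the class and method names are hypothetical, and the bin/python location assumes a POSIX venv layout.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

/** Builds the venv creation command line (illustrative; not the PR's actual code). */
final class VenvCreateSketch {
    private VenvCreateSketch() {}

    /** Equivalent of: python -m venv --system-site-packages <venvDir> */
    static List<String> createVenvCommand(String python, Path venvDir) {
        return List.of(python, "-m", "venv", "--system-site-packages", venvDir.toString());
    }

    /** Interpreter inside the venv; assumes the POSIX layout (bin/python). */
    static Path venvPython(Path venvDir) {
        return venvDir.resolve("bin").resolve("python");
    }

    public static void main(String[] args) {
        Path venvDir = Paths.get("venvs", "test");
        System.out.println(String.join(" ", createVenvCommand("python", venvDir)));
        System.out.println(venvPython(venvDir));
    }
}
```

The returned list can be handed to `new ProcessBuilder(command)` to run it. Invoking pip through the venv's own interpreter removes the need for the `source .../activate` step used in the shell session.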

User Experience

Existing model archives with a custom requirements.txt should not be affected: dependencies will still be installed in the model-specific target directory (same as the existing behavior). Note: the logging has been improved to show which packages were downloaded and installed, as shown below.
To enable an existing model to use a virtual environment, the only change required is to set useVenv: true in model-config.yaml. Logs for virtual environment creation and dependency installation are shown below.
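The two modes differ mainly in which interpreter runs pip and in whether `-t` is passed. The following is a rough Java sketch of that branch, assuming the flag is available as a boolean. The class name, method name, and parameter layout are hypothetical and are not taken from the PR's code.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Chooses pip arguments based on a useVenv flag (illustrative sketch only). */
final class InstallCommandSketch {
    private InstallCommandSketch() {}

    /**
     * python is the base interpreter when useVenv is false and the venv's interpreter
     * when useVenv is true. targetDir is ignored when useVenv is true.
     */
    static List<String> pipInstallCommand(
            boolean useVenv, String python, Path targetDir, Path requirements) {
        List<String> cmd = new ArrayList<>(List.of(python, "-m", "pip", "install", "-U"));
        if (useVenv) {
            // Reuse dependencies that already satisfy the requirements.
            cmd.add("--upgrade-strategy");
            cmd.add("only-if-needed");
        } else {
            // Existing behavior: install everything into the target directory.
            cmd.add("-t");
            cmd.add(targetDir.toString());
        }
        cmd.add("-r");
        cmd.add(requirements.toString());
        return cmd;
    }

    public static void main(String[] args) {
        Path req = Paths.get("requirements.txt");
        System.out.println(String.join(" ", pipInstallCommand(false, "python", Paths.get("deps"), req)));
        System.out.println(String.join(" ", pipInstallCommand(true, "venv/bin/python", Paths.get("deps"), req)));
    }
}
```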

With useVenv disabled in model-config.yaml:

2024-02-07T18:31:36,984 [INFO ] main org.pytorch.serve.wlm.ModelManager - Installed custom pip packages for model mnist:
Collecting matplotlib (from -r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8bf761e997c948b7b4b3d2e8a3ed840c/requirements.txt (line 1))
  Using cached matplotlib-3.7.4-cp38-cp38-macosx_10_12_x86_64.whl.metadata (5.7 kB)
Collecting contourpy>=1.0.1 (from matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8bf761e997c948b7b4b3d2e8a3ed840c/requirements.txt (line 1))
  Using cached contourpy-1.1.1-cp38-cp38-macosx_10_9_x86_64.whl.metadata (5.9 kB)
Collecting cycler>=0.10 (from matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8bf761e997c948b7b4b3d2e8a3ed840c/requirements.txt (line 1))
  Using cached cycler-0.12.1-py3-none-any.whl.metadata (3.8 kB)
Collecting fonttools>=4.22.0 (from matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8bf761e997c948b7b4b3d2e8a3ed840c/requirements.txt (line 1))
  Using cached fonttools-4.48.1-cp38-cp38-macosx_10_9_x86_64.whl.metadata (158 kB)
Collecting kiwisolver>=1.0.1 (from matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8bf761e997c948b7b4b3d2e8a3ed840c/requirements.txt (line 1))
  Using cached kiwisolver-1.4.5-cp38-cp38-macosx_10_9_x86_64.whl.metadata (6.4 kB)
Collecting numpy<2,>=1.20 (from matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8bf761e997c948b7b4b3d2e8a3ed840c/requirements.txt (line 1))
  Using cached numpy-1.24.4-cp38-cp38-macosx_10_9_x86_64.whl.metadata (5.6 kB)
Collecting packaging>=20.0 (from matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8bf761e997c948b7b4b3d2e8a3ed840c/requirements.txt (line 1))
  Using cached packaging-23.2-py3-none-any.whl.metadata (3.2 kB)
Collecting pillow>=6.2.0 (from matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8bf761e997c948b7b4b3d2e8a3ed840c/requirements.txt (line 1))
  Using cached pillow-10.2.0-cp38-cp38-macosx_10_10_x86_64.whl.metadata (9.7 kB)
Collecting pyparsing>=2.3.1 (from matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8bf761e997c948b7b4b3d2e8a3ed840c/requirements.txt (line 1))
  Using cached pyparsing-3.1.1-py3-none-any.whl.metadata (5.1 kB)
Collecting python-dateutil>=2.7 (from matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8bf761e997c948b7b4b3d2e8a3ed840c/requirements.txt (line 1))
  Using cached python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
Collecting importlib-resources>=3.2.0 (from matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8bf761e997c948b7b4b3d2e8a3ed840c/requirements.txt (line 1))
  Using cached importlib_resources-6.1.1-py3-none-any.whl.metadata (4.1 kB)
Collecting zipp>=3.1.0 (from importlib-resources>=3.2.0->matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8bf761e997c948b7b4b3d2e8a3ed840c/requirements.txt (line 1))
  Using cached zipp-3.17.0-py3-none-any.whl.metadata (3.7 kB)
Collecting six>=1.5 (from python-dateutil>=2.7->matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8bf761e997c948b7b4b3d2e8a3ed840c/requirements.txt (line 1))
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Using cached matplotlib-3.7.4-cp38-cp38-macosx_10_12_x86_64.whl (7.4 MB)
Using cached contourpy-1.1.1-cp38-cp38-macosx_10_9_x86_64.whl (247 kB)
Using cached cycler-0.12.1-py3-none-any.whl (8.3 kB)
Using cached fonttools-4.48.1-cp38-cp38-macosx_10_9_x86_64.whl (2.3 MB)
Using cached importlib_resources-6.1.1-py3-none-any.whl (33 kB)
Using cached kiwisolver-1.4.5-cp38-cp38-macosx_10_9_x86_64.whl (68 kB)
Using cached numpy-1.24.4-cp38-cp38-macosx_10_9_x86_64.whl (19.8 MB)
Using cached packaging-23.2-py3-none-any.whl (53 kB)
Using cached pillow-10.2.0-cp38-cp38-macosx_10_10_x86_64.whl (3.5 MB)
Using cached pyparsing-3.1.1-py3-none-any.whl (103 kB)
Using cached zipp-3.17.0-py3-none-any.whl (7.4 kB)
Installing collected packages: zipp, six, pyparsing, pillow, packaging, numpy, kiwisolver, fonttools, cycler, python-dateutil, importlib-resources, contourpy, matplotlib
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
sagemaker 2.143.0 requires protobuf<4.0,>=3.1, but you have protobuf 4.25.1 which is incompatible.
sagemaker 2.143.0 requires PyYAML==5.4.1, but you have pyyaml 6.0 which is incompatible.
Successfully installed contourpy-1.1.1 cycler-0.12.1 fonttools-4.48.1 importlib-resources-6.1.1 kiwisolver-1.4.5 matplotlib-3.7.4 numpy-1.24.4 packaging-23.2 pillow-10.2.0 pyparsing-3.1.1 python-dateutil-2.8.2 six-1.16.0 zipp-3.17.0

All custom packages and their dependencies are installed to the target directory.

With useVenv enabled in model-config.yaml:

2024-02-07T18:28:55,027 [INFO ] main org.pytorch.serve.wlm.ModelManager - Created virtual environment for model mnist: /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8d0d560e458a4dd0bfb4763588a2416b/venv
2024-02-07T18:28:57,880 [INFO ] main org.pytorch.serve.wlm.ModelManager - Installed custom pip packages for model mnist:
Requirement already satisfied: matplotlib in /Users/namannan/.pyenv/versions/3.8.13/lib/python3.8/site-packages (from -r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8d0d560e458a4dd0bfb4763588a2416b/requirements.txt (line 1)) (3.7.4)
Requirement already satisfied: cycler>=0.10 in /Users/namannan/.pyenv/versions/3.8.13/lib/python3.8/site-packages (from matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8d0d560e458a4dd0bfb4763588a2416b/requirements.txt (line 1)) (0.11.0)
Requirement already satisfied: contourpy>=1.0.1 in /Users/namannan/.pyenv/versions/3.8.13/lib/python3.8/site-packages (from matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8d0d560e458a4dd0bfb4763588a2416b/requirements.txt (line 1)) (1.1.1)
Requirement already satisfied: numpy<2,>=1.20 in /Users/namannan/.pyenv/versions/3.8.13/lib/python3.8/site-packages (from matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8d0d560e458a4dd0bfb4763588a2416b/requirements.txt (line 1)) (1.24.3)
Requirement already satisfied: packaging>=20.0 in /Users/namannan/.pyenv/versions/3.8.13/lib/python3.8/site-packages (from matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8d0d560e458a4dd0bfb4763588a2416b/requirements.txt (line 1)) (23.2)
Requirement already satisfied: pyparsing>=2.3.1 in /Users/namannan/.pyenv/versions/3.8.13/lib/python3.8/site-packages (from matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8d0d560e458a4dd0bfb4763588a2416b/requirements.txt (line 1)) (3.0.9)
Requirement already satisfied: kiwisolver>=1.0.1 in /Users/namannan/.pyenv/versions/3.8.13/lib/python3.8/site-packages (from matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8d0d560e458a4dd0bfb4763588a2416b/requirements.txt (line 1)) (1.4.4)
Requirement already satisfied: pillow>=6.2.0 in /Users/namannan/.pyenv/versions/3.8.13/lib/python3.8/site-packages (from matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8d0d560e458a4dd0bfb4763588a2416b/requirements.txt (line 1)) (10.2.0)
Requirement already satisfied: python-dateutil>=2.7 in /Users/namannan/.pyenv/versions/3.8.13/lib/python3.8/site-packages (from matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8d0d560e458a4dd0bfb4763588a2416b/requirements.txt (line 1)) (2.8.2)
Requirement already satisfied: fonttools>=4.22.0 in /Users/namannan/.pyenv/versions/3.8.13/lib/python3.8/site-packages (from matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8d0d560e458a4dd0bfb4763588a2416b/requirements.txt (line 1)) (4.34.4)
Requirement already satisfied: importlib-resources>=3.2.0 in /Users/namannan/.pyenv/versions/3.8.13/lib/python3.8/site-packages (from matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8d0d560e458a4dd0bfb4763588a2416b/requirements.txt (line 1)) (5.12.0)
Requirement already satisfied: zipp>=3.1.0 in /Users/namannan/.pyenv/versions/3.8.13/lib/python3.8/site-packages (from importlib-resources>=3.2.0->matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8d0d560e458a4dd0bfb4763588a2416b/requirements.txt (line 1)) (3.8.1)
Requirement already satisfied: six>=1.5 in /Users/namannan/.pyenv/versions/3.8.13/lib/python3.8/site-packages (from python-dateutil>=2.7->matplotlib->-r /var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8d0d560e458a4dd0bfb4763588a2416b/requirements.txt (line 1)) (1.16.0)
WARNING: You are using pip version 22.0.4; however, version 24.0 is available.
You should consider upgrading via the '/var/folders/l4/hfy8rtpx0755sys3m8m8c48w0000gs/T/models/8d0d560e458a4dd0bfb4763588a2416b/venv/bin/python -m pip install --upgrade pip' command.

Packages already available in the base python environment are not downloaded and reinstalled.

Type of change

New feature
This change requires a documentation update

Feature/Issue validation/testing

CI
- serve/frontend/server/src/test/java/org/pytorch/serve/ModelServerTest.java
  
  Line 1143 in 4b69459
  
  public void testModelWithCustomPythonDependency()
- serve/frontend/server/src/test/java/org/pytorch/serve/ModelServerTest.java
  
  Line 1153 in 4b69459
  
  public void testModelWithInvalidCustomPythonDependency()
- serve/frontend/server/src/test/java/org/pytorch/serve/WorkflowTest.java
  
  Line 400 in 4b69459
  
  public void testWorkflowWithCustomPythonDependencyModel()
- serve/frontend/server/src/test/java/org/pytorch/serve/WorkflowTest.java
  
  Line 413 in 4b69459
  
  public void testWorkflowWithInvalidCustomPythonDependencyModel()
Manual testing using Sagemaker MME example here: Refactor reqirements handling and model archive creation for TS MME example aws/amazon-sagemaker-examples#4517

msaroufim

Some minor comments on the python command you're constructing, I did not review the new path manipulation utils

msaroufim · 2024-01-30T01:21:31Z

frontend/server/src/main/java/org/pytorch/serve/wlm/ModelManager.java

+        List<String> commandParts = new ArrayList<>();
+        commandParts.add(EnvironmentUtils.getPythonRunTime(model));
+        commandParts.add("-m");
+        commandParts.add("venv");


should you make this name customizable? Part of the appeal of this PR is different workers should be able to have different virtual environments

Currently this PR creates a virtual environment on a per-model basis at model load time. All workers for a given model use the same virtual environment. This replaces installing dependencies on a per-model basis in a target directory and is backwards compatible with the existing behavior, with no change to customer experience. Although the same name, venv, is used for every model, each venv is located within its own model directory, for example /tmp/models/test-model/venv.
Would it be useful to extend this implementation to support separate venv per worker?

No the isolation is fine as is I think
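The per-model layout discussed in this thread can be sketched as a path computation. The helper below is hypothetical and only illustrates why a fixed directory name does not cause collisions between models.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Derives a per-model venv location from the model directory (illustrative only). */
final class ModelVenvPathSketch {
    private ModelVenvPathSketch() {}

    /** The venv name is fixed, but its parent directory is unique per model. */
    static Path venvDirFor(Path modelStoreDir, String modelDirName) {
        return modelStoreDir.resolve(modelDirName).resolve("venv");
    }

    public static void main(String[] args) {
        Path store = Paths.get("/tmp/models");
        System.out.println(venvDirFor(store, "test-model"));
        System.out.println(venvDirFor(store, "other-model"));
    }
}
```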

msaroufim · 2024-01-30T01:23:02Z

frontend/server/src/main/java/org/pytorch/serve/wlm/ModelManager.java

+        commandParts.add(EnvironmentUtils.getPythonRunTime(model));
+        commandParts.add("-m");
+        commandParts.add("venv");
+        commandParts.add("--clear");


why clear? Seems beneficial to allow users to install their dependencies beforehand

It was always quite weird how we pip installed a bunch of stuff on launching a model

Sounds good. I will check whether venvs can safely be made portable. The symlinks in the venv's bin directory, for example venv/bin/python, may need to be updated, since the Python binary path on the host where the venv is created may differ from the path on the host where it is used.

From the official docs: https://docs.python.org/3/library/venv.html
Warning: Because scripts installed in environments should not expect the environment to be activated, their shebang lines contain the absolute paths to their environment’s interpreters. Because of this, environments are inherently non-portable, in the general case. You should always have a simple means of recreating an environment (for example, if you have a requirements file requirements.txt, you can invoke pip install -r requirements.txt using the environment’s pip to install all of the packages needed by the environment).
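One way to see the portability problem is the `home` key that venv records in the environment's pyvenv.cfg, which points at the directory of the interpreter used to create it. The sketch below reads that key and tests whether the directory exists on the current host. The class and method names are hypothetical, and a passing check would not by itself prove that a copied venv works.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;

/** Inspects pyvenv.cfg lines to spot a venv whose base interpreter is missing (sketch). */
final class VenvPortabilitySketch {
    private VenvPortabilitySketch() {}

    /** Extracts the value of the "home" key from the lines of a pyvenv.cfg file. */
    static Optional<String> homeOf(List<String> cfgLines) {
        for (String line : cfgLines) {
            int eq = line.indexOf('=');
            if (eq > 0 && line.substring(0, eq).trim().equals("home")) {
                return Optional.of(line.substring(eq + 1).trim());
            }
        }
        return Optional.empty();
    }

    /** True only if the recorded base interpreter directory exists on this host. */
    static boolean baseInterpreterDirExists(List<String> cfgLines) {
        return homeOf(cfgLines).map(h -> Files.isDirectory(Paths.get(h))).orElse(false);
    }

    public static void main(String[] args) {
        // Sample content; the path is a placeholder for a directory absent on this host.
        List<String> cfg = List.of(
                "home = /no/such/python/bin",
                "include-system-site-packages = true");
        System.out.println(homeOf(cfg).orElse("no home key"));
        System.out.println(baseInterpreterDirExists(cfg));
    }
}
```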

msaroufim · 2024-01-30T01:24:03Z

frontend/server/src/main/java/org/pytorch/serve/wlm/ModelManager.java

+        commandParts.add("-m");
+        commandParts.add("venv");
+        commandParts.add("--clear");
+        commandParts.add("--system-site-packages");


why system site?

This is to make sure that the venv can see packages that are already installed in the base Python environment and does not have to install them again. If a newer version of an existing package is required, or a package that is not yet installed is required, it is installed to the venv's site-packages and takes precedence over the system site-packages. This does not affect the base Python environment. I've added more details with an example in the PR description: #2910 (comment)

msaroufim · 2024-01-30T01:26:20Z

frontend/server/src/main/java/org/pytorch/serve/wlm/ModelManager.java

-            commandParts.add("-t");
-            commandParts.add(dependencyPath.getAbsolutePath());
+            commandParts.add("--upgrade-strategy");
+            commandParts.add("only-if-needed");


I think this is a good change but do you mind just telling me why you needed to make it?

In prior versions of pip (i.e. pip<10.0), when pip install -U -r requirements.txt is used, all the packages listed in requirements.txt and their dependencies are upgraded, since the default --upgrade-strategy was eager. --upgrade-strategy applies to the handling of dependencies of the packages specified in requirements.txt. In pip>=10.0, the default --upgrade-strategy is only-if-needed. This change explicitly sets --upgrade-strategy to only-if-needed irrespective of pip version.

From the pip docs: https://pip.pypa.io/en/stable/user_guide/#only-if-needed-recursive-upgrade
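The difference between the two strategies can be modelled as a small decision function. This is a simplified sketch of the documented semantics, with hypothetical names. It treats "latest" as a single known version and ignores the rest of pip's resolver.

```java
/** Toy model of pip's two upgrade strategies for a dependency (illustrative only). */
final class UpgradeStrategySketch {
    enum Strategy { EAGER, ONLY_IF_NEEDED }

    private UpgradeStrategySketch() {}

    /**
     * Picks the version of a dependency that ends up installed under "pip install -U".
     * installed may be null when the dependency is absent; installedSatisfies says whether
     * the installed version meets the requirement of the package being upgraded.
     */
    static String versionAfterUpgrade(
            Strategy strategy, String installed, boolean installedSatisfies, String latest) {
        if (installed == null || !installedSatisfies) {
            return latest; // either strategy has to install a satisfying version
        }
        // eager upgrades regardless; only-if-needed keeps what already satisfies
        return strategy == Strategy.EAGER ? latest : installed;
    }

    public static void main(String[] args) {
        // torch 2.1.0+cu118 already satisfies torch>=1.7 (values from the PR description).
        System.out.println(versionAfterUpgrade(Strategy.EAGER, "2.1.0+cu118", true, "2.1.2"));
        System.out.println(versionAfterUpgrade(Strategy.ONLY_IF_NEEDED, "2.1.0+cu118", true, "2.1.2"));
    }
}
```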

lxning

It is expensive to create a Python venv for each model, and it potentially increases the model loading latency. Do you know the root cause of the existing pip install reinstalling all of PyTorch? Can we have a lightweight solution?

agunapal · 2024-01-30T17:58:06Z

@namannandan I like the overall idea. A couple of questions

1. Is this breaking the previous behavior? Meaning, we should still support the existing behavior
2. Does this support a single venv for all models if the customer wants this?

namannandan · 2024-01-30T18:53:35Z

It is expensive to create a Python venv for each model, and it potentially increases the model loading latency. Do you know the root cause of the existing pip install reinstalling all of PyTorch? Can we have a lightweight solution?

Agreed, creation of virtual environment adds latency to the model load.
I will measure the latency overhead and add more details on this PR.

The root cause of the existing pip install reinstalling all of pytorch is as follows:
The command used to install dependencies is: python -m pip install -U -t <target-dir> -r requirements.txt.

This command will install all the packages listed in requirements.txt and all their dependencies to the target directory.

For example, suppose requirements.txt lists segment-anything-py==1.0, which requires torch>=1.7. Even if torch-2.1.0 is already installed, pip will ignore it and install the latest supported version, torch-2.1.2, in the target directory.

This is expected behavior of pip when using the -t flag since it will only check the target-dir for packages and not site-packages. Reference: https://pip.pypa.io/en/stable/cli/pip_install/#cmdoption-t

Can we have a lightweight solution?
One potential option is to use the --no-deps flag (Reference: https://pip.pypa.io/en/stable/cli/pip_install/#cmdoption-no-deps) when installing dependencies as follows:
python -m pip install -U --no-deps -t <target-dir> -r requirements.txt
This way, only the packages in requirements.txt will be installed and none of their dependencies will be installed.

Pros:

Avoids re-installation of dependencies

Cons:

Could break backward compatibility, since requirements.txt would then have to list every package along with all of its dependencies.
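
A minimal sketch contrasting the existing target-directory install with the --no-deps variant proposed above. The helper name `build_target_install_command` and the `no_deps` parameter are hypothetical; the flags themselves are the ones quoted in this comment.

```python
from pathlib import Path


def build_target_install_command(
    python_runtime: str,
    target_dir: Path,
    requirements_file: Path,
    no_deps: bool = False,
) -> list[str]:
    """Build a pip argv that installs into a target directory.

    With no_deps=True only the packages named in requirements.txt are
    installed. Hypothetical helper for illustration only.
    """
    command = [python_runtime, "-m", "pip", "install", "-U"]
    if no_deps:
        # Skip dependency resolution entirely; requirements.txt must be complete.
        command.append("--no-deps")
    command += ["-t", str(target_dir), "-r", str(requirements_file)]
    return command


if __name__ == "__main__":
    for flag in (False, True):
        print(" ".join(build_target_install_command(
            "python", Path("target-dir"), Path("requirements.txt"), no_deps=flag)))
```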

@lxning, @msaroufim, @agunapal what are your thoughts on the above approach?

namannandan · 2024-01-30T19:05:36Z

@namannandan I like the overall idea. A couple of questions

1. Is this breaking the previous behavior? Meaning, we should still support the existing behavior

2. Does this support a single venv for all models if the customer wants this.

1. This does not break existing behavior and is backwards compatible with model archives that have already been created.
2. Currently no, a separate venv is created per model that has a requirements.txt file associated with it. This is done in a manner that does not change the existing customer experience of including a requirements.txt file along with a model archive.

namannandan · 2024-01-31T18:42:00Z

Latency and storage analysis

Instance type: g5.2xlarge
OS: Ubuntu 20.04.6 LTS
Python: 3.10.9

Creation of empty virtual environment:

$ time python -m venv --clear --system-site-packages ./test

real    0m2.606s
user    0m2.407s
sys     0m0.180s

$ du -sh test
22M     test

Using the lama model as example from here: https://github.com/aws/amazon-sagemaker-examples/tree/main/inference/torchserve/mme-gpu

Installing custom dependencies to target directory (Torchserve v0.9.0)

$ time curl -X POST "http://127.0.0.1:8081/models?url=lama"
{
  "status": "Model \"lama\" Version: 1.0 registered with 1 initial workers"
}

real    2m28.987s
user    0m0.011s
sys     0m0.000s

$ du -sh /tmp/models/c3f1def4c8bb4cd8bddd2023dffcaa95/lama/
402M    /tmp/models/c3f1def4c8bb4cd8bddd2023dffcaa95/lama/

$ du -sh /tmp/models/c3f1def4c8bb4cd8bddd2023dffcaa95/
6.6G    /tmp/models/c3f1def4c8bb4cd8bddd2023dffcaa95/

Installing custom dependencies using a virtual environment (using the implementation in this PR)

$ time curl -X POST "http://127.0.0.1:8081/models?url=lama"
{
  "status": "Model \"lama\" Version: 1.0 registered with 1 initial workers"
}

real    1m2.334s
user    0m0.000s
sys     0m0.009s

$ du -sh /tmp/models/9cbcb78b2f07441688d9056a2950e314/lama/
402M    /tmp/models/9cbcb78b2f07441688d9056a2950e314/lama/

$ du -sh /tmp/models/9cbcb78b2f07441688d9056a2950e314/
2.1G    /tmp/models/9cbcb78b2f07441688d9056a2950e314/

Summary

Creating a virtual environment adds a latency overhead of around 2.6 seconds and consumes about 22M of disk space.
Although the virtual environment adds latency, in a practical use case of loading a model with custom dependencies it can be faster, since existing packages are reused instead of being downloaded and reinstalled. In the above example, registration took 1m2.334s with a virtual environment compared to 2m28.987s when installing dependencies to a target directory.
Although the virtual environment has a disk space overhead, in the same use case it can save space, since existing packages are reused instead of being replicated. In the above example, the model directory used 2.1G with a virtual environment as opposed to 6.6G when installing dependencies to a target directory.
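
To put the numbers above side by side, here is a small sketch of the arithmetic. It uses only the measurements reported in this comment; the helper name `parse_real_time` is hypothetical.

```python
def parse_real_time(value: str) -> float:
    """Convert a `time` "real" value such as "2m28.987s" into seconds."""
    minutes, seconds = value.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# Model registration latency reported above.
target_dir_seconds = parse_real_time("2m28.987s")
venv_seconds = parse_real_time("1m2.334s")
speedup = target_dir_seconds / venv_seconds

# Total size of the model directory reported above, in GB.
target_dir_gb = 6.6
venv_gb = 2.1
space_saved_fraction = (target_dir_gb - venv_gb) / target_dir_gb

if __name__ == "__main__":
    print(f"latency saved: {target_dir_seconds - venv_seconds:.1f} s")
    print(f"speedup: {speedup:.2f}x")
    print(f"disk space saved: {space_saved_fraction:.0%}")
```

On this single run the venv approach is roughly 2.4 times faster and uses about two thirds less disk space; other models and base environments will give different ratios.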

lxning · 2024-01-31T19:02:45Z

@namannandan thanks for the analysis, let's add this python venv option in model level config (ie. model-config.yaml).

agunapal · 2024-02-06T01:52:19Z

Hi @namannandan, can you please update the PR description with what the user experience will be with/without venv, and show any relevant logs which indicate that it's working as expected and there is no BC issue?

namannandan · 2024-02-08T02:48:07Z

@namannandan thanks for the analysis, let's add this python venv option in model level config (ie. model-config.yaml).

Updated implementation to support useVenv as a model level config.

namannandan · 2024-02-08T02:49:26Z

Hi @namannandan, can you please update the PR description with what the user experience will be with/without venv, and show any relevant logs which indicate that it's working as expected and there is no BC issue?

Included a user experience section in the summary with logs: #2910 (comment)

Also included details about useVenv in Readme: https://github.com/pytorch/serve/pull/2910/files#diff-439e668fd67373383bd1cc408a01f95b5d5e9ac65eb7f9e756bc04ab23e8f257R177

agunapal

LGTM

Review comments have been addressed.

namannandan added 4 commits January 22, 2024 17:11

Explicitly set default dependency upgrade strategy to only-if-needed

50da539

Add support for venv creation per model

254518f

Create virtual env only when requirements need to be installed

999782e

Merge branch 'master' into fix-requirements-upgrade

51b1135

namannandan changed the title ~~Refactor custom package installation using virtual environment~~ Refactor model custom package installation to virtual environment Jan 29, 2024

namannandan changed the title ~~Refactor model custom package installation to virtual environment~~ Refactor model custom package installation to use virtual environment Jan 29, 2024

Format logger and exception messages

e1c0734

namannandan requested review from lxning, mreso, msaroufim and agunapal January 29, 2024 18:44

namannandan marked this pull request as ready for review January 29, 2024 18:44

namannandan changed the title ~~Refactor model custom package installation to use virtual environment~~ Refactor model custom dependency installation to use virtual environment Jan 29, 2024

msaroufim previously requested changes Jan 30, 2024

lxning reviewed Jan 30, 2024

namannandan added 2 commits February 2, 2024 12:24

Enable per model useVenv configuration option

d7e72c8

Merge branch 'master' into fix-requirements-upgrade

406304f

msaroufim self-requested a review February 3, 2024 01:55

namannandan added 4 commits February 4, 2024 22:09

Add integration tests and documentation

5715cf2

Fix integraiton test failures

6c4a279

Revert "Fix integraiton test failures"

5101d6b

This reverts commit 6c4a279.

Update integration test teardown functionality

62a82cf

lxning reviewed Feb 6, 2024

Refactor implementaiton to check for useVenv in Model.java

750f2e0

namannandan force-pushed the fix-requirements-upgrade branch from 92908fd to 750f2e0 Compare February 7, 2024 23:24

namannandan added 4 commits February 7, 2024 15:24

Merge branch 'master' into fix-requirements-upgrade

df6b70d

Fix dependencyPath logic

760fac9

Refactor isUseVenv and integration tests

603f440

Update documentation

4f470f5

agunapal reviewed Feb 8, 2024

Merge branch 'master' into fix-requirements-upgrade

0e2fb2a

namannandan requested review from lxning and agunapal February 8, 2024 19:40

namannandan changed the title ~~Refactor model custom dependency installation to use virtual environment~~ Enable model custom dependency installation to use virtual environment Feb 8, 2024

namannandan changed the title ~~Enable model custom dependency installation to use virtual environment~~ Enable model custom dependency installation using virtual environment Feb 8, 2024

agunapal approved these changes Feb 9, 2024

Merge branch 'master' into fix-requirements-upgrade

c1422af

namannandan enabled auto-merge February 9, 2024 20:05

namannandan added this pull request to the merge queue Feb 9, 2024

Merged via the queue into master with commit ba8f96a Feb 9, 2024
15 checks passed

namannandan deleted the fix-requirements-upgrade branch February 14, 2024 08:47

namannandan mentioned this pull request Feb 14, 2024

Enable venv to inherit site packages from base python environment #2946

Merged

5 tasks

chauhang added this to the v0.10.0 milestone Feb 27, 2024
