This repository contains code examples to be used alongside the workshop "Accelerating Large Models In Production: A Practical Guide", presented at the Toronto Machine Learning Summit 2023 by James Cameron. Slides and notes can be found here
It is meant to serve as a starting point for new projects and should be used only as a reference.
- 4-core CPU (3.0 GHz minimum)
- 8 GB system memory
- Nvidia GPU with CUDA Compute Capability >= 7.0
- 6 GB GPU memory minimum (a quick check is shown after this list)
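A quick way to check these values on a Linux machine (assuming the Nvidia driver, and therefore nvidia-smi, is already installed):

nproc                                                  # CPU core count
free -h                                                # system memory
nvidia-smi --query-gpu=name,memory.total --format=csv  # GPU model and GPU memory
# newer drivers also support: nvidia-smi --query-gpu=compute_cap --format=csv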
We need to install Docker and Docker Compose. Please follow the instructions here (Docker) and here (Docker Compose).
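Once installed, you can confirm both tools are available on your PATH:

docker --version
docker-compose --version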
Next, we need to install the Nvidia Container Toolkit. Please follow the instructions here.
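After installation, the runtime binary referenced in the next step should exist. As a quick sanity check:

which nvidia-container-runtime    # should print /usr/bin/nvidia-container-runtime
nvidia-container-cli info         # should list your driver version and GPU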
Set the nvidia Docker runtime as the default:
sudo tee /etc/docker/daemon.json <<EOF
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
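A malformed daemon.json will stop the Docker daemon from starting, so it is worth validating the file before the restart (any JSON validator works; Python's built-in one is shown here):

python3 -m json.tool /etc/docker/daemon.json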
Restart Docker:
sudo service docker restart
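You can then confirm that the daemon picked up the new default runtime:

docker info | grep -i runtime    # expect a "Default Runtime: nvidia" line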
Then test the installation:
docker run nvcr.io/nvidia/cuda nvidia-smi
You should see output similar to the following:
Tue Mar 29 12:38:23 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.81       Driver Version: 472.39       CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A3000    Off  | 00000000:01:00.0 Off |                  N/A |
| 25%   46C    P8    12W / 100W |    319MiB /  6144MiB |     18%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
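If you prefer not to change the default runtime, or want to verify GPU passthrough independently of the daemon configuration, the same test can be run by requesting the GPU explicitly (Docker 19.03 or newer):

docker run --rm --gpus all nvcr.io/nvidia/cuda nvidia-smi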
To start the services, run the following command:
bash run.sh
Then navigate to http://localhost:8888/lab in your browser.
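run.sh is assumed here to be a thin wrapper around Docker Compose; to follow the service logs or shut everything down when you are finished, the usual Compose commands apply (check run.sh itself for the exact invocation and service names):

docker-compose logs -f    # follow service logs
docker-compose down       # stop and remove the containers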