Implementation of protocols from the paper SIGMA.
Warning: This is an academic proof-of-concept prototype and has not received careful code review. This implementation is NOT ready for production use.
This project requires NVIDIA GPUs, and assumes that GPU drivers and the NVIDIA CUDA Toolkit are already installed. The following has been tested on Ubuntu 20.04 with CUDA 11.7, CMake 3.27.2 and g++-9.
Please note that Sytorch requires CMake version >= 3.17, and the build will fail if this dependency is not met.
The code uses CUTLASS version 2.11 by default, so if you change the CUDA version, please make sure that the CUTLASS version being built is compatible with the new CUDA version.
The last line of setup.sh tries to install matplotlib, which is needed for generating Figure 10. In our experience, the installation fails if the versions of Python and pip do not match. In case the installation fails, please install matplotlib manually before running run_experiment.py.
- Export environment variables
export CUDA_VERSION=11.7
export GPU_ARCH=86
- Set up the environment
sh setup.sh
To build a different version of CUTLASS, optionally pass the CUTLASS branch to setup.sh:
sh setup.sh <CUTLASS branch>
For example, to build the main branch, run
sh setup.sh main
- Make SIGMA
make sigma
- Switch to the experiments/sigma directory
cd experiments/sigma
- Since FSS generates large keys, writing keys to disk and reading keys from disk can take a long time. To ensure that the artifact runs in a reasonable amount of time, we avoid going to disk and instead have the dealer generate keys in CPU memory. These keys are then used by the evaluator. Please make sure that the CPU memory is large enough to support the key size of the model being run. Key sizes can be estimated from Table 9 of the paper.
- Currently, we only support sequence lengths that are powers of 2.
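The two constraints above (enough CPU memory for the keys, and a power-of-2 sequence length) can be checked before launching a run. The following is a minimal sketch and not part of the artifact: the function names are hypothetical, `key_size_gib` is a value you estimate yourself from Table 9 of the paper, and the memory query assumes a Linux host. Whether the executable accepts a sequence length of 1 is not stated here, so the check only tests the mathematical property.

```python
import os
from typing import List, Optional


def is_power_of_two(n: int) -> bool:
    """Return True if n is a positive power of two (1, 2, 4, 8, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def total_cpu_memory_bytes() -> int:
    """Total physical memory reported by POSIX sysconf (available on Linux)."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")


def preflight(seq_len: int, key_size_gib: float,
              total_bytes: Optional[int] = None) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK.

    key_size_gib is an estimate supplied by the caller (for example, read
    off Table 9 of the paper); it is not computed here.
    """
    if total_bytes is None:
        total_bytes = total_cpu_memory_bytes()
    problems = []
    if not is_power_of_two(seq_len):
        problems.append(f"sequence length {seq_len} is not a power of 2")
    if key_size_gib * 2**30 > total_bytes:
        problems.append(
            f"estimated key size {key_size_gib} GiB exceeds "
            f"{total_bytes / 2**30:.1f} GiB of CPU memory"
        )
    return problems
```

For example, `preflight(128, 16.0)` returns an empty list on a machine with more than 16 GiB of RAM. The check compares against total memory, not free memory, so it is only a lower bar.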
Make produces the sigma executable, which is in experiments/sigma. Each party (the server and the client) needs to run this executable. The executable requires the user to specify the model, sequence length, party number (0 for the server/1 for the client), the IP address of the other party, and the number of CPU threads to use for computation.
The syntax is
./sigma <model name> <sequence length> <party=0/1 (server/client)> <peer IP> <CPU threads>
We currently support the following models: bert-tiny, bert-base, bert-large, gpt2, llama-7b, llama-13b.
Example: To run GPT2, the server will run:
./sigma gpt2 128 0 <client IP> 64
The client will run (on a different machine):
./sigma gpt2 128 1 <server IP> 64
Results are stored in the output/P<party number>/models/<model name>-<sequence length>/ folder.
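When scripting runs, it can help to build the command line and the matching results folder from the same arguments. The sketch below is an illustration only: `sigma_command` and `results_dir` are hypothetical helper names, the IP address in the usage note is a placeholder, and the model list is copied from the list above.

```python
from pathlib import Path
from typing import List

# Model names as listed in this README.
SUPPORTED_MODELS = (
    "bert-tiny", "bert-base", "bert-large", "gpt2", "llama-7b", "llama-13b",
)


def sigma_command(model: str, seq_len: int, party: int,
                  peer_ip: str, cpu_threads: int) -> List[str]:
    """Build the argument list for ./sigma in the documented order."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model}")
    if party not in (0, 1):
        raise ValueError("party must be 0 (server) or 1 (client)")
    if seq_len <= 0 or cpu_threads <= 0:
        raise ValueError("seq_len and cpu_threads must be positive")
    return ["./sigma", model, str(seq_len), str(party), peer_ip, str(cpu_threads)]


def results_dir(model: str, seq_len: int, party: int) -> Path:
    """Folder where the run stores its results, relative to the working directory."""
    return Path("output") / f"P{party}" / "models" / f"{model}-{seq_len}"
```

For instance, `sigma_command("gpt2", 128, 0, "10.0.0.2", 64)` yields the server command from the example above with a placeholder client IP. The list can be passed to `subprocess.run(cmd, cwd="experiments/sigma", check=True)`.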
Before the artifact can be run, we need to configure it via config.json.
For the server (P0), config.json looks like:
{
    "P0": {
        "gpu": <The ID of the GPU to use>,
        "peer": <The IP address of the remote peer>,
        "cpu_threads": <The number of CPU threads to use for computation>
    }
}
For the client (P1), config.json looks exactly the same, only the arguments are specified under the key "P1".
A sample config.json file can be found in the experiments/sigma folder.
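As an alternative to editing the file by hand, config.json can be generated with a few lines of Python. This is a sketch under assumptions: the helper names are hypothetical, the GPU ID and thread count are assumed to be integers, and the peer address is assumed to be a string. Check the sample config.json for the exact value types before relying on it.

```python
import json
from typing import Any, Dict


def make_config(party: int, gpu: int, peer: str, cpu_threads: int) -> Dict[str, Any]:
    """Build the config dictionary for one party, keyed by "P0" or "P1"."""
    if party not in (0, 1):
        raise ValueError("party must be 0 (server) or 1 (client)")
    return {
        f"P{party}": {
            "gpu": gpu,                  # assumed to be an integer GPU ID
            "peer": peer,                # assumed to be an IP address string
            "cpu_threads": cpu_threads,  # assumed to be an integer
        }
    }


def write_config(config: Dict[str, Any], path: str = "config.json") -> None:
    """Write the config as indented JSON."""
    with open(path, "w") as f:
        json.dump(config, f, indent=4)
        f.write("\n")
```

For example, on the server, `write_config(make_config(0, gpu=0, peer="10.0.0.3", cpu_threads=64))` writes a file with a single "P0" entry; the address shown is a placeholder for the client's IP.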
Once config.json has been filled, the script run_experiment.py can be used to reproduce the tables and figures in the paper. Here are the relevant options:
usage: python run_experiment.py [-h] [--perf true] [--n_seq true] [--all true] --party 0/1
optional arguments:
  --perf true    Generate Tables 3, 5, 9, and Figure 10.
  --n_seq true   Generate Table 8.
  --all true     Run all the experiments.
Table 7 can be reproduced by throttling the network bandwidth (with tc, for example) and re-running python run_experiment.py --perf true to generate Table 5.
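One possible way to throttle bandwidth with tc is a token bucket filter (tbf) on the outgoing interface. The sketch below only builds the command lines and does not run them. The interface name, rate, burst, and latency values are placeholders chosen for illustration, not the settings used in the paper, and the helper names are hypothetical. Running the resulting commands requires root privileges.

```python
from typing import List


def throttle_command(iface: str, rate: str,
                     burst: str = "32kbit", latency: str = "400ms") -> List[str]:
    """Build a tc command that attaches a tbf qdisc limiting egress to `rate`."""
    return [
        "tc", "qdisc", "add", "dev", iface, "root", "tbf",
        "rate", rate, "burst", burst, "latency", latency,
    ]


def clear_command(iface: str) -> List[str]:
    """Build a tc command that removes the root qdisc, restoring the default."""
    return ["tc", "qdisc", "del", "dev", iface, "root"]
```

For example, `" ".join(throttle_command("eth0", "100mbit"))` gives `tc qdisc add dev eth0 root tbf rate 100mbit burst 32kbit latency 400ms`. Apply the limit on both machines, rerun the experiment, then run the command from `clear_command` to remove the limit.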
Results are stored in output/P<party-number>/Table<table-number>.json or output/P<party-number>/Fig<figure-number>.png.
Log files (which might help with debugging) can be found in the output/P<party number>/models/<model name>-<sequence length>/logs.txt file.
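After a run, a short script can confirm that the expected result files exist and point to the relevant log file if they do not. This is a sketch with hypothetical helper names; the table and figure numbers come from the option descriptions above, and the paths are assumed to be relative to the directory from which run_experiment.py was launched.

```python
from pathlib import Path
from typing import Iterable, List

# Outputs per option, as described in the usage text.
PERF_TABLES = (3, 5, 9)
PERF_FIGURES = (10,)
N_SEQ_TABLES = (8,)


def expected_outputs(party: int, perf: bool = False, n_seq: bool = False,
                     root: str = "output") -> List[Path]:
    """List the result files that the selected options should produce."""
    base = Path(root) / f"P{party}"
    paths: List[Path] = []
    if perf:
        paths += [base / f"Table{t}.json" for t in PERF_TABLES]
        paths += [base / f"Fig{f}.png" for f in PERF_FIGURES]
    if n_seq:
        paths += [base / f"Table{t}.json" for t in N_SEQ_TABLES]
    return paths


def log_file(party: int, model: str, seq_len: int, root: str = "output") -> Path:
    """Path of the per-model log file for one party."""
    return Path(root) / f"P{party}" / "models" / f"{model}-{seq_len}" / "logs.txt"


def missing_outputs(paths: Iterable[Path]) -> List[Path]:
    """Return the subset of paths that do not exist on disk."""
    return [p for p in paths if not p.exists()]
```

For example, `missing_outputs(expected_outputs(0, perf=True))` lists any of the party-0 files for Tables 3, 5, 9 and Figure 10 that were not generated; `log_file(0, "gpt2", 128)` then gives the log to inspect.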
You can cite the paper using the following BibTeX entry:
@inproceedings{sigma,
author = {Kanav Gupta and Neha Jawalkar and Ananta Mukherjee and Nishanth Chandran and Divya Gupta and Ashish Panwar and Rahul Sharma},
year = {2024},
title = {SIGMA: Secure GPT Inference with Function Secret Sharing},
booktitle = {Proc. Priv. Enhancing Technol.}
}