This repository is the official implementation of NITRO-D: Native Integer-only Training of Deep Convolutional Neural Networks.
NITRO-D is a novel training framework for arbitrarily deep integer-only Convolutional Neural Networks (CNNs) that operates entirely in the integer domain for both training and inference. It uses a learning algorithm derived from Local Error Signals (LES) and is the first work in the literature to enable the training of integer-only deep CNNs without introducing a quantization scheme.
NITRO-D introduces both a novel architecture and a novel integer-only learning algorithm that trains this architecture using IntegerSGD, an optimizer designed specifically to operate in an integer-only context. Experimental evaluations demonstrate its effectiveness across several state-of-the-art image recognition datasets, with performance improvements ranging from 2.47% to 5.96% over the state of the art.
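To give a feel for what an optimizer step looks like when it never leaves the integer domain, here is a minimal NumPy sketch. It is not the repository's IntegerSGD: it assumes, purely for illustration, that the learning rate is expressed as an integer divisor (the hypothetical lr_inv parameter) and that the division truncates toward zero. The function name and signature are also invented for this example.

```python
import numpy as np


def integer_sgd_step(weights, grads, lr_inv):
    """Hypothetical integer-only update: w <- w - trunc(grads / lr_inv).

    All arrays are integer typed; no floating-point value is ever created.
    """
    # Divide the magnitude and restore the sign, so that positive and
    # negative gradients are truncated toward zero symmetrically.
    step = np.sign(grads) * (np.abs(grads) // lr_inv)
    return weights - step


if __name__ == "__main__":
    w = np.array([10, -5, 0], dtype=np.int32)
    g = np.array([1024, -700, 100], dtype=np.int32)
    print(integer_sgd_step(w, g, lr_inv=512))
```

Gradients smaller in magnitude than lr_inv produce a zero step in this sketch, which is one of the practical issues an integer-only optimizer has to deal with.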
BibTeX entry:
@misc{pirillo2024nitrodnativeintegeronlytraining,
title={NITRO-D: Native Integer-only Training of Deep Convolutional Neural Networks},
author={Alberto Pirillo and Luca Colombo and Manuel Roveri},
year={2024},
eprint={2407.11698},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2407.11698},
}
NITRO-D was developed and tested using Python 3.11 and CUDA 12.1.
Install the required packages using the following commands:
python -m pip install -U setuptools pip
pip install -r requirements.txt
(optional) Install cuTENSOR and cuDNN for optimal performance:
python -m cupyx.tools.install_library --cuda 12.x --library cutensor
python -m cupyx.tools.install_library --cuda 12.x --library cudnn
(optional) Enable the installed accelerators by setting environment variables. On Linux:
echo 'export CUPY_ACCELERATORS=cutensor,cub' >> ~/.bashrc
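Because the command above only appends to ~/.bashrc, the variable is visible in new shells (or after re-sourcing the file), not in the one that is already open. The following sketch checks what the current Python process actually sees. The helper name configured_accelerators is hypothetical; it only reads the environment and does not import or query CuPy.

```python
import os


def configured_accelerators(env=None):
    """Return the backends listed in CUPY_ACCELERATORS, in order, or [] if unset."""
    env = os.environ if env is None else env
    raw = env.get("CUPY_ACCELERATORS", "")
    # The variable is a comma-separated list, e.g. "cutensor,cub".
    return [name.strip() for name in raw.split(",") if name.strip()]


if __name__ == "__main__":
    print(configured_accelerators())
```

An empty list means the variable is not set in the environment the notebook or script was launched from.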
We provide a separate notebook to reproduce the results for each configuration considered in the paper. These notebooks are called train.ipynb and are located in the results directory.
Each notebook:
- Imports the required libraries
- Defines the experimental setup
- Loads and pre-processes the dataset
- Instantiates and trains the model N_EXPERIMENTS times with different initializations
- Saves the results of each run in a CSV file
- Saves the model of the first run in the model.pkl file
- Computes the average and standard deviation of the train and test accuracy
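The repeat, log, and aggregate part of this workflow can be sketched with the standard library alone. This is an illustration, not the notebooks' code: train_fn is a hypothetical callable that trains one model for a given seed and returns a (train accuracy, test accuracy) pair, the CSV column names are invented, and the sample standard deviation is an assumption (the notebooks may use the population form).

```python
import csv
import statistics

N_EXPERIMENTS = 10  # assumed value; each notebook defines its own


def run_all(train_fn, csv_path, n_runs=N_EXPERIMENTS):
    """Call train_fn(seed) n_runs times, log each run to CSV, return (mean, std) per metric."""
    rows = []
    for seed in range(n_runs):
        # Hypothetical contract: train_fn returns two floats.
        train_acc, test_acc = train_fn(seed)
        rows.append({"run": seed, "train_acc": train_acc, "test_acc": test_acc})

    # One CSV row per run, with a header line.
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["run", "train_acc", "test_acc"])
        writer.writeheader()
        writer.writerows(rows)

    summary = {}
    for key in ("train_acc", "test_acc"):
        values = [row[key] for row in rows]
        # statistics.stdev is the sample standard deviation; it needs n_runs >= 2.
        summary[key] = (statistics.mean(values), statistics.stdev(values))
    return summary
```

Using the run index as the seed is one simple way to obtain different initializations while keeping every run reproducible.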
We also provide two notebooks in the examples directory, nitro_cnn.ipynb and nitro_mlp.ipynb, that detail the training and evaluation of NITRO-D models.
The models are automatically evaluated inside the train.ipynb notebooks. We also provide an additional eval.ipynb notebook in the examples directory, which shows how to load a pre-trained model and evaluate it on a dataset.
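As a rough outline of that load-and-evaluate flow, the sketch below assumes the saved model is a standard pickle file (inferred only from the .pkl extension) and that a forward pass can be wrapped in a callable returning one row of class scores per sample. Both load_model and accuracy are hypothetical helpers, not the repository's API; eval.ipynb remains the reference.

```python
import pickle

import numpy as np


def load_model(path):
    """Unpickle a saved model. Only load files you trust: unpickling can run code."""
    with open(path, "rb") as f:
        return pickle.load(f)


def accuracy(predict_fn, x, y, batch_size=256):
    """Fraction of samples whose highest-scoring class matches the label.

    predict_fn is a hypothetical callable mapping a batch of inputs to an
    array of shape (batch, n_classes).
    """
    correct = 0
    for start in range(0, len(x), batch_size):
        scores = predict_fn(x[start:start + batch_size])
        predicted = np.argmax(scores, axis=1)
        correct += int(np.sum(predicted == y[start:start + batch_size]))
    return correct / len(x)
```

Unpickling a model also requires the classes it was built from to be importable, so this kind of code is normally run from within the repository.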
NITRO-D models were trained and evaluated on three different datasets: MNIST, FashionMNIST, and CIFAR-10. The results are summarized in the table below, which reports the average test accuracy and the standard deviation over 10 runs.
Model name | Dataset | NITRO-D | PocketNN | FP LES | FP BP |
---|---|---|---|---|---|
MLP [784-100-50-10] | MNIST | - | |||
MLP [784-200-100-50-10] | FashionMNIST | - | |||
MLP [1024-3000-3000-3000-10] | CIFAR-10 | - | |||
VGG8B | MNIST | - | |||
VGG8B | FashionMNIST | - | |||
VGG8B | CIFAR-10 | - | |||
VGG11B | CIFAR-10 | - |
NITRO-D is licensed under CC BY-NC-SA 4.0