A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research, and we are continuously improving it. Pull requests that add works (papers, repositories) missing from the repo are welcome.
Paper | Tags | Code | Years |
---|---|---|---|
A Novel In-DRAM Accelerator Architecture for Binary Neural Network | Hardware | -- | 2020 |
An Energy-Efficient and High Throughput in-Memory Computing Bit-Cell With Excellent Robustness Under Process Variations for Binary Neural Network | Hardware | -- | 2020 |
BNN Pruning: Pruning Binary Neural Network Guided by Weight Flipping Frequency | Binarization | Link | 2020 |
Compressing deep neural networks on FPGAs to binary and ternary precision with hls4ml | Hardware | -- | 2020 |
End-to-end Learned Image Compression with Fixed Point Weight Quantization | Low-bit Quantization | -- | 2020 |
Low-bit Quantization Needs Good Distribution | Low-bit Quantization | -- | 2020 |
SIMBA: A Skyrmionic In-Memory Binary Neural Network Accelerator | Hardware | -- | 2020 |
Training Binary Neural Networks with Real-to-Binary Convolutions | Binarization | Link | 2020 |
Training with Quantization Noise for Extreme Model Compression | Low-bit Quantization | Link | 2020 |
Phoenix: A Low-Precision Floating-Point Quantization Oriented Architecture for Convolutional Neural Networks | Low-bit Quantization | -- | 2020 |
Towards Lossless Binary Convolutional Neural Networks Using Piecewise Approximation | Binarization | Not yet | 2020 |
IMAC: In-Memory Multi-Bit Multiplication and ACcumulation in 6T SRAM Array | Hardware | -- | 2020 |
Understanding Learning Dynamics of Binary Neural Networks via Information Bottleneck | Binarization | -- | 2020 |
Training high-performance and large-scale deep neural networks with full 8-bit integers | Low-bit Quantization | -- | 2020 |
MoBiNet: A Mobile Binary Network for Image Classification | Binarization | -- | 2020 |
Controlling information capacity of binary neural network | Binarization | -- | 2020 |
BinaryDuo: Reducing Gradient Mismatch in Binary Activation Network by Coupling Binary Activations | Binarization | Link | 2020 |
Binary Neural Networks: A Survey | Binarization | -- | 2020 |
An Energy-Efficient Bagged Binary Neural Network Accelerator | Hardware, Binarization | -- | 2020 |
Forward and Backward Information Retention for Accurate Binary Neural Networks | Binarization | Link | 2020 |
MeliusNet: Can Binary Neural Networks Achieve MobileNet-level Accuracy? | Binarization | Link | 2020 |
Design of High Robustness BNN Inference Accelerator Based on Binary Memristors | Hardware | -- | 2020 |
RPR: Random Partition Relaxation for Training Binary and Ternary Weight Neural Networks | Binarization, Low-bit Quantization | -- | 2020 |
OrthrusPE: Runtime Reconfigurable Processing Elements for Binary Neural Networks | Hardware | -- | 2020 |
Distillation Guided Residual Learning for Binary Convolutional Neural Networks | Binarization | -- | 2020 |
A Resource-Efficient Inference Accelerator for Binary Convolutional Neural Networks | Hardware | -- | 2020 |
How Does Batch Normalization Help Binary Training? | Binarization | -- | 2020 |
Paper | Tags | Code | Years |
---|---|---|---|
Product Engine for Energy-Efficient Execution of Binary Neural Networks Using Resistive Memories | Hardware, Binarization | -- | 2019 |
A Systematic Study of Binary Neural Networks' Optimisation | Binarization | -- | 2019 |
Accurate and Compact Convolutional Neural Networks with Trained Binarization | Binarization | -- | 2019 |
Balanced Circulant Binary Convolutional Networks | Binarization | -- | 2019 |
Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit? | Binarization | -- | 2019 |
BNN+: Improved Binary Network Training | Binarization | -- | 2019 |
Circulant Binary Convolutional Networks: Enhancing the Performance of 1-bit DCNNs with Circulant Back Propagation | Binarization | -- | 2019 |
daBNN: A Super Fast Inference Framework for Binary Neural Networks on ARM devices | Hardware, Binarization | Link | 2019 |
Deep Binary Reconstruction for Cross-Modal Hashing | Binarization | -- | 2019 |
Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks | Low-bit Quantization | -- | 2019 |
Dual Path Binary Neural Network | Binarization | -- | 2019 |
Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices | Hardware | -- | 2019 |
Fully Quantized Network for Object Detection | Low-bit Quantization | -- | 2019 |
Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine | Hardware | -- | 2019 |
Improved training of binary networks for human pose estimation and image recognition | Binarization | -- | 2019 |
Learning Channel-wise Interactions for Binary Convolutional Neural Networks | Binarization | -- | 2019 |
MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization | Low-bit Quantization | Link | 2019 |
Proxquant: Quantized neural networks via proximal operators | Low-bit Quantization, Binarization | Link | 2019 |
PXNOR: Perturbative Binary Neural Network | Binarization | Link | 2019 |
Quantization Networks | Low-bit Quantization | Link | 2019 |
Recursive Binary Neural Network Training Model for Efficient Usage of On-Chip Memory | Binarization | -- | 2019 |
SeerNet: Predicting Convolutional Neural Network Feature-Map Sparsity through Low-Bit Quantization | Low-bit Quantization | -- | 2019 |
Self-Binarizing Networks | Binarization | -- | 2019 |
Towards Unified INT8 Training for Convolutional Neural Network | Low-bit Quantization | -- | 2019 |
Training Accurate Binary Neural Networks from Scratch | Binarization | Link | 2019 |
Using Neuroevolved Binary Neural Networks to solve reinforcement learning environments | Binarization | Link | 2019 |
Xcel-RAM: Accelerating Binary Neural Networks in High-Throughput SRAM Compute Arrays | Hardware | -- | 2019 |
XNOR-Net++: Improved binary neural networks | Binarization | -- | 2019 |
An Energy-Efficient Reconfigurable Processor for Binary-and Ternary-Weight Neural Networks With Flexible Data Bit Width | Binarization, Low-bit Quantization | -- | 2019 |
Paper | Tags | Code | Years |
---|---|---|---|
Two-Step Quantization for Low-bit Neural Networks | Low-bit Quantization | -- | 2018 |
Extremely Low Bit Neural Network: Squeeze the Last Bit Out with ADMM | Low-bit Quantization | Link | 2018 |
PACT: Parameterized Clipping Activation for Quantized Neural Networks | Low-bit Quantization | -- | 2018 |
A Main/Subsidiary Network Framework for Simplifying Binary Neural Networks | Binarization | -- | 2018 |
A Survey of FPGA-based Accelerators for Convolutional Neural Networks | Hardware | -- | 2018 |
An Energy-Efficient Architecture for Binary Weight Convolutional Neural Networks | Binarization | -- | 2018 |
Analysis and Implementation of Simple Dynamic Binary Neural Networks | Binarization | -- | 2018 |
Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy | Low-bit Quantization | -- | 2018 |
BitFlow: Exploiting Vector Parallelism for Binary Neural Networks on CPU | Binarization | -- | 2018 |
BitStream: Efficient Computing Architecture for Real-Time Low-Power Inference of Binary Neural Networks on CPUs | Binarization, Hardware | -- | 2018 |
Blended Coarse Gradient Descent for Full Quantization of Deep Neural Networks | Low-bit Quantization, Binarization | -- | 2018 |
BRein Memory: A Single-Chip Binary/Ternary Reconfigurable in-Memory Deep Neural Network Accelerator Achieving 1.4 TOPS at 0.6 W | Hardware | -- | 2018 |
FBNA: A Fully Binarized Neural Network Accelerator | Hardware | -- | 2018 |
FINN-R: An End-to-End Deep-Learning Framework for Fast Exploration of Quantized Neural Networks | Hardware | -- | 2018 |
Loss-aware Binarization of Deep Networks | Binarization | -- | 2018 |
ReBNet: Residual Binarized Neural Network | Binarization | Link | 2018 |
Model compression via distillation and quantization | Low-bit Quantization | Link | 2018 |
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference | Low-bit Quantization | -- | 2018 |
Stochastic weights binary neural networks on FPGA | Binarization | -- | 2018 |
Structured Binary Neural Networks for Accurate Image Classification and Semantic Segmentation | Binarization | -- | 2018 |
SYQ: Learning Symmetric Quantization For Efficient Deep Neural Networks | Low-bit Quantization | Link | 2018 |
Towards Fast and Energy-Efficient Binarized Neural Network Inference on FPGA | Binarization, Hardware | -- | 2018 |
Training Binary Weight Networks via Semi-Binary Decomposition | Binarization | -- | 2018 |
Training Competitive Binary Neural Networks from Scratch | Binarization | Link | 2018 |
XNOR Neural Engine: A Hardware Accelerator IP for 21.6-fJ/op Binary Neural Network Inference | Hardware | -- | 2018 |
Paper | Tags | Code | Years |
---|---|---|---|
Ternary Neural Networks with Fine-Grained Quantization | Low-bit Quantization | -- | 2017 |
ShiftCNN: Generalized Low-Precision Architecture for Inference of Convolutional Neural Networks | Low-bit Quantization | Link | 2017 |
Towards Accurate Binary Convolutional Neural Network | Binarization | Link | 2017 |
Deep Learning with Low Precision by Half-wave Gaussian Quantization | Low-bit Quantization | Link | 2017 |
Performance Guaranteed Network Acceleration via High-Order Residual Quantization | Low-bit Quantization | -- | 2017 |
From Hashing to CNNs: Training Binary Weight Networks via Hashing | Binarization | -- | 2017 |
Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights | Low-bit Quantization | Link | 2017 |
Trained Ternary Quantization | Low-bit Quantization | Link | 2017 |
On-chip Memory Based Binarized Convolutional Deep Neural Network Applying Batch Normalization Free Technique on an FPGA | Hardware | -- | 2017 |
FP-BNN: Binarized neural network on FPGA | Hardware | -- | 2017 |
WRPN: Wide Reduced-Precision Networks | Low-bit Quantization | -- | 2017 |
Deep Learning Binary Neural Network on an FPGA | Hardware, Binarization | -- | 2017 |
A GPU-Outperforming FPGA Accelerator Architecture for Binary Convolutional Neural Networks | Hardware, Binarization | -- | 2017 |
Paper | Tags | Code | Years |
---|---|---|---|
Ternary weight networks | Low-bit Quantization | Link | 2016 |
DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients | Low-bit Quantization | Link | 2016 |
XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks | Binarization | Link | 2016 |
Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 | Binarization | Link | 2016 |
BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 | Binarization | Link | 2016 |
Paper | Tags | Code | Years |
---|---|---|---|
Bitwise Neural Networks | Binarization | -- | 2015 |
BinaryConnect: Training Deep Neural Networks with binary weights during propagations | Binarization | Link | 2015 |
Code | From | Description |
---|---|---|
PyTorch-Quant.py | https://github.com/Ewenwan/pytorch-playground/blob/master/utee/quant.py | Different quantization methods implemented in PyTorch. |
ZF-Net | https://support.alpha-data.com/pub/appnotes/cnn/ | An Open Source FPGA CNN Library |
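For readers new to the two main tags used in the paper tables, "Low-bit Quantization" and "Binarization", the following is a minimal sketch of each in plain Python. It is not taken from the `quant.py` file linked above or from any listed paper; the function names (`quantize_uniform`, `dequantize`, `binarize`) are hypothetical, and the scaling choices (max-absolute-value scale for the uniform case, mean-absolute-value scale for the 1-bit case) are common conventions assumed here for illustration only.

```python
from typing import List, Tuple


def quantize_uniform(values: List[float], num_bits: int = 8) -> Tuple[List[int], float]:
    """Symmetric uniform quantization to signed integer codes (illustrative only)."""
    if num_bits < 2:
        raise ValueError("num_bits must be at least 2 for a signed integer grid")
    qmax = 2 ** (num_bits - 1) - 1  # e.g. 127 for 8 bits
    max_abs = max((abs(v) for v in values), default=0.0)
    # Assumption: the scale is chosen so the largest magnitude maps to qmax.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest grid point, then clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Map integer codes back to approximate real values."""
    return [c * scale for c in codes]


def binarize(values: List[float]) -> Tuple[List[int], float]:
    """1-bit case: keep only the sign, plus one mean-absolute-value scale."""
    scale = sum(abs(v) for v in values) / len(values) if values else 0.0
    signs = [1 if v >= 0 else -1 for v in values]
    return signs, scale


weights = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_uniform(weights, num_bits=8)
restored = dequantize(codes, scale)  # each entry is within scale / 2 of the input
signs, alpha = binarize(weights)     # weights are approximated by alpha * sign
```

The uniform quantizer stores one integer per weight plus a single floating-point scale, while the binarized form stores one bit per weight plus a single scale. Many of the listed papers study how to train networks so that accuracy survives this kind of rounding.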
Doc | Description |
---|---|
QuantizationMethods.md | Quantization Methods |
Embedded Deep Learning.md | Run BNN in FPGA |
An Open Source FPGA CNN Library.pdf | Code: ZF-Net, Doc of An Open Source FPGA CNN Library |
Accelerating CNN inference on FPGAs- A Survey.pdf | Accelerating CNN inference on FPGAs: A Survey. |
- https://github.com/Ewenwan/MVision/tree/master/CNN/Deep_Compression/
- https://github.com/Ewenwan/pytorch-playground/blob/master/utee/quant.py
- https://github.com/Ewenwan/MVision/tree/master/CNN/Deep_Compression/quantization
Our team is part of the DIG group of the State Key Laboratory of Software Development Environment (SKLSDE), supervised by Prof. Xianglong Liu. The main research goal of our team is compressing and accelerating models across multiple scenarios.
Ruihao Gong is currently a third-year graduate student at Beihang University under the supervision of Prof. Xianglong Liu. Since 2017, he has worked on building computer vision systems and on model quantization as an intern at Sensetime Research, where he enjoyed working with talented researchers and grew a lot with the help of Fengwei Yu, Wei Wu, and Junjie Yan. Early in the internship, he independently took responsibility for developing the intelligent video analysis system Sensevideo. Later, he began research on model quantization, which can speed up inference and even training of neural networks on edge devices. He is now devoted to further improving the accuracy of extremely low-bit models and the auto-deployment of quantized models.
I am a Ph.D. student (Sep 2019 - ) in the State Key Laboratory of Software Development Environment (SKLSDE) and ShenYuan Honors College at Beihang University, supervised by Prof. Wei Li and Prof. Xianglong Liu. I obtained a B.Eng degree in computer science and engineering from Beihang University. I was a research intern (Jun 2020 - Aug 2020) at the WeiXin Group of Tencent. During my undergraduate study, I interned at the Speech group of Microsoft Research Asia (MSRA), supervised by Dr. Wenping Hu. I'm interested in deep learning, computer vision, and model compression. My research goal is to enable state-of-the-art neural network models to be successfully deployed on resource-limited hardware. This includes compressing and accelerating models on multiple tasks, and flexible and efficient deployment on multiple hardware platforms.
Xiangguo Zhang
Xiangguo Zhang is a second-year graduate student in the School of Computer Science of Beihang University, under the guidance of Prof. Xianglong Liu. He received a bachelor's degree from Shandong University in 2019 and entered Beihang University in the same year. Currently, he is interested in computer vision and post-training quantization.
Yifu Ding
Yifu Ding is a senior student in the School of Computer Science and Engineering at Beihang University. She is in the State Key Laboratory of Software Development Environment (SKLSDE), under the supervision of Prof. Xianglong Liu. Currently, she is interested in computer vision and model quantization. She believes that highly compressed neural network models can be deployed on resource-constrained devices, and that quantization is a promising approach among the available compression methods.
Binary Neural Networks: A Survey [PDF]
H. Qin, R. Gong, X. Liu*, X. Bai, J. Song, N. Sebe
Pattern Recognition (PR), 2020
Forward and Backward Information Retention for Accurate Binary Neural Networks [PDF]
H. Qin, R. Gong, X. Liu*, M. Shen, Z. Wei, F. Yu, J. Song
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020
Boosting Temporal Binary Coding for Large-scale Video Search
Y. Wu, X. Liu*, H. Qin, K. Xia, S. Hu, Y. Ma, M. Wang
IEEE Transactions on Multimedia (TMM), 2020
Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks
Ruihao Gong, Xianglong Liu*, Shenghu Jiang, Tianxiang Li, Peng Hu, Jiazhen Lin, Fengwei Yu, Junjie Yan
IEEE ICCV 2019
Towards Unified INT8 Training for Convolutional Neural Network
Feng Zhu, Ruihao Gong, Fengwei Yu, Xianglong Liu, Yanfei Wang, Zhelong Li, Xiuqi Yang, Junjie Yan
IEEE CVPR 2020
DMS: Differentiable Dimension Search for Binary Neural Networks
Yuhang Li, Ruihao Gong, Fengwei Yu, Xin Dong, Xianglong Liu
ICLR 2020 NAS workshop
Rotation Consistent Margin Loss for Efficient Low-bit Face Recognition
Yudong Wu, Yichao Wu, Ruihao Gong, Yuanhao Lv, Ken Chen, Ding Liang, Xiaolin Hu, Xianglong Liu, Junjie Yan
IEEE CVPR 2020
Balanced Binary Neural Networks with Gated Residual
Mingzhu Shen, Xianglong Liu, Ruihao Gong, Kai Han
ICASSP 2020