What's New

The TF-enhanced calibration scheme has been accelerated with a custom CUDA kernel and now runs significantly faster
Installation instructions are now combined with the rest of the documentation (User Guide and API docs)

PyTorch

Fixed backward pass of the fake-quantize (QcQuantizeWrapper) nodes to handle symmetric mode correctly
Per-channel quantization is now enabled on a per-op-type basis
Added support for recursively excluding modules from a root module in QuantSim
Added support for excluding layers when running model validator and model preparer
Reduced memory usage in AdaRound
Fixed bugs in AdaRound for per-channel quantization
Made ConnectedGraph more robust when identifying custom layers
Added Jupyter notebook-based examples for the following features: AutoQuant
Added support for sparse conv layers in QuantSim (experimental)
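
For readers unfamiliar with the symmetric-mode fix listed above, the following is a minimal, library-independent sketch of symmetric fake quantization and a clipped straight-through gradient. It is not the QcQuantizeWrapper implementation. The function names, the signed 8-bit grid, and the choice of a straight-through estimator are assumptions made for illustration only.

```python
# Illustrative sketch only; names and conventions here are hypothetical
# and are not taken from the library described in these notes.

def _signed_range(bitwidth):
    """Integer range of a signed symmetric grid, e.g. [-128, 127] for 8 bits."""
    return -(2 ** (bitwidth - 1)), 2 ** (bitwidth - 1) - 1


def fake_quantize_symmetric(x, scale, bitwidth=8):
    """Quantize then dequantize a scalar on a symmetric grid (zero offset)."""
    q_min, q_max = _signed_range(bitwidth)
    q = round(x / scale)            # note: Python rounds exact halves to even
    q = max(q_min, min(q_max, q))   # clamp to the representable integers
    return q * scale


def straight_through_grad(x, scale, grad_output, bitwidth=8):
    """Assumed backward rule: pass the gradient through inside the
    representable range, and zero it where the forward pass clipped."""
    q_min, q_max = _signed_range(bitwidth)
    inside = q_min * scale <= x <= q_max * scale
    return grad_output if inside else 0.0
```

With `scale=0.5`, an input of `1.3` lands on the grid point `1.5`, while an input of `100.0` is clipped to the top of the range and receives zero gradient under this rule.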

Keras

Added support for Keras per-channel quantization
Changed interface to CLE to accept a pre-compiled model
Added Jupyter notebook-based examples for the following features: Transformer quantization
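
As background for the per-channel quantization items in these notes, the sketch below contrasts one scale for a whole weight tensor with one scale per output channel. It is a generic illustration in plain Python, not the library's API. The helper names and the symmetric signed 8-bit convention are assumptions.

```python
# Illustrative sketch only; helper names are hypothetical.

def symmetric_scale(values, bitwidth=8):
    """Scale that maps the largest magnitude onto the top of a signed grid."""
    q_max = 2 ** (bitwidth - 1) - 1
    max_abs = max(abs(v) for v in values)
    return max_abs / q_max if max_abs > 0 else 1.0


def per_tensor_scale(weights, bitwidth=8):
    """One scale shared by every channel of the weight tensor."""
    flat = [v for channel in weights for v in channel]
    return symmetric_scale(flat, bitwidth)


def per_channel_scales(weights, bitwidth=8):
    """One scale per output channel (each inner list is one channel)."""
    return [symmetric_scale(channel, bitwidth) for channel in weights]


# Two output channels with very different dynamic ranges.
weights = [[127.0, -64.0], [1.0, -0.5]]
shared = per_tensor_scale(weights)
separate = per_channel_scales(weights)
```

In this example the shared scale is dictated by the large first channel, so the small second channel would use only a sliver of the grid. The per-channel scales give the second channel its own, much finer step.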

TensorFlow

Documentation