version 1.26.0
What's New
Keras
- Added a feature called BN Re-estimation that can improve model accuracy after QAT for INT4 quantization.
- Updated the AutoQuant feature to automatically choose the optimal calibration scheme and to generate an HTML report showing which optimizations were applied.
- Updated the Model Preparer to replace separable convolutions with depthwise and pointwise convolution layers.
- Fixed the BN fold implementation to account for a subsequent multi-input layer.
- Fixed a bug where min/max encoding values were not aligned with scale/offset during QAT.
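The BN fold fix above concerns folding batch-normalization statistics into the preceding layer's weights. As a minimal numerical sketch of the underlying math (illustrative only, not AIMET's implementation), folding BN(x) = gamma * (x - mean) / sqrt(var + eps) + beta into a layer y = w*x + b yields w' = w * gamma / sqrt(var + eps) and b' = (b - mean) * gamma / sqrt(var + eps) + beta:

```python
import math

def fold_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold batchnorm parameters into a preceding scalar weight/bias.

    Illustrative math only; a real implementation operates per-channel
    on conv/dense weight tensors.
    """
    inv_std = gamma / math.sqrt(var + eps)
    return w * inv_std, (b - mean) * inv_std + beta

# Folded layer produces the same output as layer followed by BN.
w, b = 2.0, 1.0
gamma, beta, mean, var, eps = 0.5, 0.1, 1.5, 4.0, 1e-5
w_f, b_f = fold_bn(w, b, gamma, beta, mean, var, eps)

x = 3.0
original = gamma * ((w * x + b) - mean) / math.sqrt(var + eps) + beta
folded = w_f * x + b_f
assert abs(original - folded) < 1e-9
```
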
PyTorch
- Several bug fixes
TensorFlow
- Added a feature called BN Re-estimation that can improve model accuracy after QAT for INT4 quantization.
- Updated the AutoQuant feature to automatically choose the optimal calibration scheme and to generate an HTML report showing which optimizations were applied.
- Fixed a bug where min/max encoding values were not aligned with scale/offset during QAT.
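The min/max alignment fix mentioned above reflects a general property of asymmetric quantization: the stored min/max must describe exactly the same grid as the derived scale/offset. A minimal sketch of that math (assumed formulation for illustration, not AIMET's code):

```python
def compute_encoding(obs_min, obs_max, bitwidth=8):
    """Derive quantization parameters and re-align min/max to the grid.

    Illustrative asymmetric-quantization math: after computing scale and
    offset from the observed range, min/max are snapped back onto the
    quantization grid so both representations stay consistent.
    """
    num_steps = 2 ** bitwidth - 1
    scale = (obs_max - obs_min) / num_steps
    offset = round(obs_min / scale)          # zero-point in integer steps
    aligned_min = offset * scale             # snap min onto the grid
    aligned_max = aligned_min + scale * num_steps
    return scale, offset, aligned_min, aligned_max

scale, offset, qmin, qmax = compute_encoding(-1.02, 1.5)
# After alignment, min/max and scale/offset describe the same grid:
assert abs(qmin - offset * scale) < 1e-12
assert abs((qmax - qmin) - scale * 255) < 1e-12
```
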
Common
- Documentation updates for taking AIMET models to target.
- Converted standalone BatchNorm layers' parameters so that the layers behave as linear/dense layers.
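The standalone BatchNorm conversion works because, at inference time, BN is a pure affine transform. As a hedged sketch of that identity (not AIMET's implementation): BN(x) = gamma*(x - mean)/sqrt(var + eps) + beta equals a*x + c with a = gamma/sqrt(var + eps) and c = beta - mean*a, i.e. a linear/dense layer with weight a and bias c:

```python
import math

def bn_to_linear(gamma, beta, mean, var, eps=1e-5):
    """Collapse inference-time batchnorm into linear weight/bias.

    Illustrative scalar version; a real conversion is per-channel.
    """
    a = gamma / math.sqrt(var + eps)
    return a, beta - mean * a

a, c = bn_to_linear(gamma=1.2, beta=0.3, mean=0.5, var=0.25)
x = 2.0
bn_out = 1.2 * (x - 0.5) / math.sqrt(0.25 + 1e-5) + 0.3
assert abs((a * x + c) - bn_out) < 1e-9
```
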
Experimental
- Added a new Architecture Checker feature to identify and report model architecture constructs that are not ideal for quantized runtimes. Users can use this information to adjust their model architectures accordingly.
Documentation
- Release main page: https://github.com/quic/aimet/releases/tag/1.26.0
- Installation guide: https://quic.github.io/aimet-pages/releases/1.26.0/install/index.html
- User guide: https://quic.github.io/aimet-pages/releases/1.26.0/user_guide/index.html
- API documentation: https://quic.github.io/aimet-pages/releases/1.26.0/api_docs/index.html
- Documentation main page: https://quic.github.io/aimet-pages/index.html