version 1.26.0
What's New
Keras
- Added a feature called BN Re-estimation that can improve model accuracy after QAT for INT4 quantization.
- Updated the AutoQuant feature to automatically choose the optimal calibration scheme and to generate an HTML report showing which optimizations were applied.
- Updated the Model Preparer to replace separable convolutions with depthwise and pointwise convolution layers.
- Fixed the BN fold implementation to account for a subsequent multi-input layer.
- Fixed a bug where min/max encoding values were not aligned with scale/offset during QAT.
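The BN fold fix above concerns folding batch-normalization statistics into the preceding layer's weights. As a minimal numerical sketch of the underlying math (illustrative only, not AIMET's implementation), folding BN(x) = gamma * (x - mean) / sqrt(var + eps) + beta into a layer y = w*x + b yields w' = w * gamma / sqrt(var + eps) and b' = (b - mean) * gamma / sqrt(var + eps) + beta:

```python
import math

def fold_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold batchnorm parameters into a preceding scalar weight/bias.

    Illustrative math only; a real implementation operates per-channel
    on conv/dense weight tensors.
    """
    inv_std = gamma / math.sqrt(var + eps)
    return w * inv_std, (b - mean) * inv_std + beta

# Folded layer produces the same output as layer followed by BN.
w, b = 2.0, 1.0
gamma, beta, mean, var, eps = 0.5, 0.1, 1.5, 4.0, 1e-5
w_f, b_f = fold_bn(w, b, gamma, beta, mean, var, eps)

x = 3.0
original = gamma * ((w * x + b) - mean) / math.sqrt(var + eps) + beta
folded = w_f * x + b_f
assert abs(original - folded) < 1e-9
```
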
PyTorch
- Several bug fixes
TensorFlow
- Added a feature called BN Re-estimation that can improve model accuracy after QAT for INT4 quantization.
- Updated the AutoQuant feature to automatically choose the optimal calibration scheme and to generate an HTML report showing which optimizations were applied.
- Fixed a bug where min/max encoding values were not aligned with scale/offset during QAT.
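The min/max alignment fix mentioned above reflects a general property of asymmetric quantization: the stored min/max must describe exactly the same grid as the derived scale/offset. A minimal sketch of that math (assumed formulation for illustration, not AIMET's code):

```python
def compute_encoding(obs_min, obs_max, bitwidth=8):
    """Derive quantization parameters and re-align min/max to the grid.

    Illustrative asymmetric-quantization math: after computing scale and
    offset from the observed range, min/max are snapped back onto the
    quantization grid so both representations stay consistent.
    """
    num_steps = 2 ** bitwidth - 1
    scale = (obs_max - obs_min) / num_steps
    offset = round(obs_min / scale)          # zero-point in integer steps
    aligned_min = offset * scale             # snap min onto the grid
    aligned_max = aligned_min + scale * num_steps
    return scale, offset, aligned_min, aligned_max

scale, offset, qmin, qmax = compute_encoding(-1.02, 1.5)
# After alignment, min/max and scale/offset describe the same grid:
assert abs(qmin - offset * scale) < 1e-12
assert abs((qmax - qmin) - scale * 255) < 1e-12
```
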
Common
- Documentation updates for taking AIMET models to target.
- Converted standalone BatchNorm layers' parameters so that the layers behave as linear/dense layers.
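The standalone BatchNorm conversion works because, at inference time, BN is a pure affine transform. As a hedged sketch of that identity (not AIMET's implementation): BN(x) = gamma*(x - mean)/sqrt(var + eps) + beta equals a*x + c with a = gamma/sqrt(var + eps) and c = beta - mean*a, i.e. a linear/dense layer with weight a and bias c:

```python
import math

def bn_to_linear(gamma, beta, mean, var, eps=1e-5):
    """Collapse inference-time batchnorm into linear weight/bias.

    Illustrative scalar version; a real conversion is per-channel.
    """
    a = gamma / math.sqrt(var + eps)
    return a, beta - mean * a

a, c = bn_to_linear(gamma=1.2, beta=0.3, mean=0.5, var=0.25)
x = 2.0
bn_out = 1.2 * (x - 0.5) / math.sqrt(0.25 + 1e-5) + 0.3
assert abs((a * x + c) - bn_out) < 1e-9
```
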
Experimental
- Added a new Architecture Checker feature to identify and report model architecture constructs that are not ideal for quantized runtimes. Users can use this information to adjust their model architectures accordingly.
Documentation
- Release main page: https://github.com/quic/aimet/releases/tag/1.26.0
- Installation guide: https://quic.github.io/aimet-pages/releases/1.26.0/install/index.html
- User guide: https://quic.github.io/aimet-pages/releases/1.26.0/user_guide/index.html
- API documentation: https://quic.github.io/aimet-pages/releases/1.26.0/api_docs/index.html
- Documentation main page: https://quic.github.io/aimet-pages/index.html