SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
micronet, a model compression and deployment library. Compression: (1) quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa / "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference") and low-bit (≤2b) ternary and binary (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); (2) pruning: normal, reg…
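For context, a minimal sketch of the fake-quantization step that QAT relies on (illustrative PyTorch, not micronet's actual code): quantized values are used in the forward pass while gradients flow through unchanged via the straight-through estimator.

```python
# Minimal QAT sketch (illustrative, not micronet's code): weights are
# fake-quantized in the forward pass; the backward pass uses the
# straight-through estimator so training stays in float.
import torch


class FakeQuantize(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, num_bits=8):
        qmin, qmax = 0, 2 ** num_bits - 1
        scale = (x.max() - x.min()).clamp(min=1e-8) / (qmax - qmin)
        zero_point = qmin - torch.round(x.min() / scale)
        q = torch.clamp(torch.round(x / scale + zero_point), qmin, qmax)
        return (q - zero_point) * scale  # dequantize: simulate INT8 in float

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None  # straight-through estimator


class QATLinear(torch.nn.Linear):
    def forward(self, x):
        w_q = FakeQuantize.apply(self.weight)  # re-quantize weights each step
        return torch.nn.functional.linear(x, w_q, self.bias)
```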
Neural Network Compression Framework for enhanced OpenVINO™ inference
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
YOLO model compression and multi-dataset training
A model compression and acceleration toolbox based on PyTorch.
0️⃣1️⃣🤗 BitNet-Transformers: Hugging Face Transformers implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch with the Llama(2) architecture
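For reference, a hedged sketch of a BitNet-style 1-bit linear layer based on the paper's description (sign-binarized weights with a per-tensor scale and straight-through gradients); the `BitLinear` class below is illustrative, not the linked repo's code.

```python
# Hedged sketch of a BitNet-style 1-bit linear layer (my reading of the
# paper, not the linked repo's implementation).
import torch


class BitLinear(torch.nn.Linear):
    def forward(self, x):
        w_centered = self.weight - self.weight.mean()
        scale = self.weight.abs().mean()        # per-tensor scale (beta)
        w_bin = torch.sign(w_centered)          # 1-bit weights: {-1, +1}
        # straight-through estimator: forward uses the binarized weights,
        # backward sees the full-precision weights
        w_q = w_centered + (scale * w_bin - w_centered).detach()
        return torch.nn.functional.linear(x, w_q, self.bias)
```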
An automated toolkit for analyzing and modifying the structure of PyTorch models, including a model compression algorithm library with automatic model structure analysis
FrostNet: Towards Quantization-Aware Network Architecture Search
OpenVINO Training Extensions Object Detection
Quantization Aware Training
Train neural networks with joint quantization and pruning on both weights and activations using any PyTorch modules
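As an illustration of what joint pruning and quantization looks like on a single weight tensor (an assumed approach combining magnitude pruning with symmetric fake quantization, not this library's API):

```python
# Illustrative sketch of joint pruning + quantization on one weight tensor
# (assumed approach, not the linked library's API): apply a magnitude-based
# binary mask, then fake-quantize the surviving weights.
import torch


def prune_and_quantize(w: torch.Tensor, sparsity: float = 0.5, num_bits: int = 8):
    # magnitude pruning: zero out the smallest |w| entries
    k = int(sparsity * w.numel())
    if k > 0:
        threshold = w.abs().flatten().kthvalue(k).values
        mask = (w.abs() > threshold).to(w.dtype)
    else:
        mask = torch.ones_like(w)
    w_pruned = w * mask
    # symmetric fake quantization of the remaining weights
    qmax = 2 ** (num_bits - 1) - 1
    scale = w_pruned.abs().max().clamp(min=1e-8) / qmax
    w_q = torch.clamp(torch.round(w_pruned / scale), -qmax, qmax) * scale
    return w_q, mask
```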
Quantization-aware training with spiking neural networks
3rd place solution for NeurIPS 2019 MicroNet challenge
QT-DoG: Quantization-Aware Training for Domain Generalization
A tutorial on model quantization with TensorFlow
Image classification with MindSpore
Code for the ISCAS23 paper "The Hardware Impact of Quantization and Pruning for Weights in Spiking Neural Networks"
BASQ: Branch-wise Activation-clipping Search Quantization for Sub-4-bit Neural Networks, ECCV 2022
An example of quantizing MobileNetV2 trained on the CIFAR-10 dataset with PyTorch FX graph mode quantization
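The standard FX graph mode post-training quantization workflow in PyTorch looks roughly like this (the public torch.ao.quantization API; the linked example targets a CIFAR-10-trained MobileNetV2, so its exact script may differ):

```python
# General FX graph mode post-training quantization workflow in PyTorch
# (standard torch.ao.quantization API; not necessarily the repo's script).
import torch
import torchvision
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

model = torchvision.models.mobilenet_v2(weights=None).eval()
example_inputs = (torch.randn(1, 3, 224, 224),)

# insert observers according to the backend's default INT8 qconfig
qconfig_mapping = get_default_qconfig_mapping("fbgemm")
prepared = prepare_fx(model, qconfig_mapping, example_inputs)

# calibrate with a few batches of representative data
with torch.no_grad():
    for _ in range(8):
        prepared(torch.randn(4, 3, 224, 224))

quantized = convert_fx(prepared)  # produces an INT8 model
```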