DAQ: Density-Aware Post-Training Weight-Only Quantization For LLMs

A weight-only quantization method for LLMs in FP4 format.

Install

1. Clone this repository and navigate to the DAQ folder:

git clone https://github.com/LuoYingSong/DAQ
cd DAQ

2. Install the package:

conda create -n daq python=3.10 -y
conda activate daq
pip install --upgrade pip  # enable PEP 660 support
pip install -e .
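
Optionally, verify the environment before running anything. This is a generic sanity check rather than a documented setup step; it only assumes PyTorch is installed as a dependency:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"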

Usage

To run DAQ (point --model_path at a local copy of the Llama-2-7b-hf weights):

python awq/entry.py --sample 1 --model_path /Llama-2-7b-hf --run_daq --tasks wikitext --w_bit 4 --q_group_size -1 --q_backend fake --dump daq_cache/Llama-2-7b-hf.pt
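
The --dump flag writes the quantization results to a .pt file. As a minimal sketch, assuming the file is an ordinary PyTorch checkpoint (as in the upstream AWQ codebase), you can inspect what was saved:

import torch

# Load the dumped results on CPU; the exact contents depend on the entry script.
results = torch.load("daq_cache/Llama-2-7b-hf.pt", map_location="cpu")
print(type(results))
if isinstance(results, dict):
    print(list(results.keys()))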

To run DAQ+AWQ:

python awq/entry.py --model_path /Llama-2-7b-hf --calibration daq --run_awq --tasks wikitext --w_bit 4 --q_group_size -1 --q_backend fake --dump awq_cache/Llama-2-7b-hf.pt --sample 2 --data_type nf4
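
The --q_backend fake option simulates low-bit storage: weights are snapped to a 4-bit value grid and immediately de-quantized, so accuracy can be evaluated without packed low-bit kernels. The sketch below illustrates the general idea of weight-only fake quantization to an FP4 (E2M1) grid. It is a generic illustration, not DAQ's density-aware algorithm; the E2M1 value grid and the per-row absmax scaling are assumptions for the example:

import torch

# One common FP4 (E2M1) magnitude grid: 0, 0.5, 1, 1.5, 2, 3, 4, 6.
FP4_GRID = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
FP4_LEVELS = torch.cat([-FP4_GRID.flip(0), FP4_GRID])  # 16 signed levels

def fake_quantize_fp4(w: torch.Tensor) -> torch.Tensor:
    """Snap each row of w to the FP4 grid and de-quantize (round-trip)."""
    # Per-output-channel absmax scaling maps each row into [-6, 6].
    scale = (w.abs().amax(dim=1, keepdim=True) / FP4_GRID.max()).clamp(min=1e-8)
    w_scaled = w / scale
    # Nearest-neighbor lookup into the 16 representable FP4 values.
    idx = (w_scaled.unsqueeze(-1) - FP4_LEVELS).abs().argmin(dim=-1)
    return FP4_LEVELS[idx] * scale

w = torch.randn(4, 8)
print((w - fake_quantize_fp4(w)).abs().max())  # worst-case quantization error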

Acknowledgements

We would like to express our gratitude to the AWQ project for their pioneering work in weight quantization for LLMs. Our work builds upon their insights and implementations.
