Layerwise Quantization

We show that quantizing an LLM layer by layer, with each layer at its own bit-width, yields an overall bit-level that can be tuned dynamically and is not restricted to integer values.
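To illustrate how per-layer bit-widths can produce a non-integer average bit-level, here is a minimal sketch. It is not the code from the paper: the function names (`assign_bits`, `average_bits`), the importance scores, and the assumption that all layers have the same number of parameters are hypothetical choices made for this example only.

```python
def assign_bits(importance, high_bits, low_bits, target_avg_bits):
    """Pick a bit-width per layer so the mean stays within a target budget.

    Illustrative assumption: layers are equally sized, so the model-wide
    bit-level is the plain mean of the per-layer bit-widths.
    """
    if not low_bits < high_bits:
        raise ValueError("low_bits must be smaller than high_bits")
    if not low_bits <= target_avg_bits <= high_bits:
        raise ValueError("target must lie between low_bits and high_bits")

    n = len(importance)
    # Largest number of layers that can keep high precision without the
    # average exceeding the target (small epsilon guards float rounding).
    n_high = int((target_avg_bits - low_bits) * n / (high_bits - low_bits) + 1e-9)

    # Rank layer indices from most to least important (scores are hypothetical).
    ranked = sorted(range(n), key=lambda i: importance[i], reverse=True)

    bits = [low_bits] * n
    for i in ranked[:n_high]:
        bits[i] = high_bits
    return bits


def average_bits(bits):
    """Mean bit-width across layers (assumes equally sized layers)."""
    return sum(bits) / len(bits)


# Eight layers with made-up importance scores, mixing 4-bit and 2-bit layers.
scores = [0.9, 0.1, 0.4, 0.8, 0.3, 0.7, 0.2, 0.6]
plan = assign_bits(scores, high_bits=4, low_bits=2, target_avg_bits=3.25)
print(plan)                # [4, 2, 4, 4, 2, 4, 2, 4]
print(average_bits(plan))  # 3.25
```

In this sketch, five layers stay at 4 bits and three drop to 2 bits, giving an average of 3.25 bits even though every individual layer uses an integer bit-width.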

The code will be available here in the near future.

The paper is available at https://arxiv.org/abs/2406.17415 and is under review for EMNLP 2024.

If you use this work, please consider citing it:

@misc{dumitru2024layerwisequantizationpragmaticeffective,
      title={Layer-Wise Quantization: A Pragmatic and Effective Method for Quantizing LLMs Beyond Integer Bit-Levels}, 
      author={Razvan-Gabriel Dumitru and Vikas Yadav and Rishabh Maheshwary and Paul-Ioan Clotan and Sathwik Tejaswi Madhusudhan and Mihai Surdeanu},
      year={2024},
      eprint={2406.17415},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2406.17415}, 
}
