Fix computation of quantization scales for symmetric quantization with GPTQ #2242

NehalBhandari · 2024-06-17T17:13:32Z

Currently, applying symmetric quantization with the GPTQ algorithm through ct.optimize.torch.layerwise_compression.LayerwiseCompressor does not produce the correct quantization scales, which may lead to poor accuracy for the quantized model.

This PR fixes the issue and adds corresponding unit tests.
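For context, here is a minimal, dependency-free sketch of how a scale for symmetric quantization is commonly derived. It is not the code changed in this PR and does not use the coremltools API; the function names `symmetric_scale` and `quantize_dequantize`, the per-tensor granularity, and the 4-bit default are all assumptions made for illustration.

```python
def symmetric_scale(weights, n_bits=4):
    """Per-tensor scale for symmetric quantization (zero point fixed at 0).

    Hypothetical helper for illustration; not part of coremltools.
    """
    # Largest positive value of the signed integer range, e.g. 7 for 4 bits.
    q_max = 2 ** (n_bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor so the scale is never zero.
    return max_abs / q_max if max_abs > 0 else 1.0


def quantize_dequantize(weights, scale, n_bits=4):
    """Round each weight to the signed integer grid, then map it back to float."""
    q_min = -(2 ** (n_bits - 1))
    q_max = 2 ** (n_bits - 1) - 1
    result = []
    for w in weights:
        q = max(q_min, min(q_max, round(w / scale)))  # clamp to [q_min, q_max]
        result.append(q * scale)
    return result
```

With a scale chosen this way, every weight lies inside the representable range, so the round-trip error per weight is at most half the scale. A scale that is too small clips large weights, and one that is too large wastes resolution; either kind of miscomputation can degrade accuracy in the way the description mentions.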

Fix computation of quantization scales for symmetric quantization with GPTQ

83734f5

NehalBhandari requested review from TobyRoseman, aseemw and pulkital June 17, 2024 17:13

pulkital approved these changes Jun 17, 2024

aseemw approved these changes Jun 18, 2024

NehalBhandari merged commit 5313fc7 into main Jun 18, 2024

NehalBhandari deleted the nehalbhandari/gptq-fix branch June 18, 2024 17:26
