Plans for 8da4w quantization #883

Closed
sanchitintel opened this issue Sep 12, 2024 · 6 comments

Comments

@sanchitintel
Contributor

Hi,

From #430, it seems that 8da4w (int8 dynamic activations, int4 weights) is primarily for ExecuTorch and is set to be deprecated. Are there any plans to enable it for CUDA and CPU as well, so that int4 weights could be converted to int8 just before computation?

Thanks!

cc @jerryzh168
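
For readers unfamiliar with the scheme, the following is a minimal plain-Python sketch of what "int8 dynamic activations, int4 weights" could look like numerically. It is not torchao's implementation or any of its kernels. The function names (`quantize_symmetric`, `quantize_weight_int4`, `linear_8da4w`) are hypothetical, and the sketch assumes symmetric quantization, per-group weight scales, and an input width divisible by the group size.

```python
# Illustrative sketch only; names and layout are assumptions, not torchao code.

def quantize_symmetric(values, qmin, qmax):
    """Map floats to integer codes in [qmin, qmax] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(qmin, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def quantize_weight_int4(row, group_size):
    """Quantize one weight row ahead of time: int4 codes, one scale per group."""
    codes, scales = [], []
    for start in range(0, len(row), group_size):
        g_codes, g_scale = quantize_symmetric(row[start:start + group_size], -8, 7)
        codes.extend(g_codes)
        scales.append(g_scale)
    return codes, scales


def linear_8da4w(x, quantized_rows, group_size):
    """Compute y = W x from pre-quantized int4 weights and a float input x."""
    # "Dynamic" activation quantization: the int8 scale is derived from x
    # at call time rather than calibrated in advance.
    x_codes, x_scale = quantize_symmetric(x, -128, 127)
    outputs = []
    for w_codes, w_scales in quantized_rows:
        total = 0.0
        for g, w_scale in enumerate(w_scales):
            lo = g * group_size
            hi = lo + group_size
            # int4 codes fit in int8, so the weights are simply widened;
            # products are accumulated as integers within each group.
            acc = sum(w * a for w, a in zip(w_codes[lo:hi], x_codes[lo:hi]))
            # Rescale the integer accumulator back to floating point.
            total += acc * w_scale * x_scale
        outputs.append(total)
    return outputs


weights = [[0.1, -0.2, 0.3, 0.4], [1.0, 0.5, -0.5, -1.0]]
x = [0.5, -1.0, 0.25, 2.0]

quantized_rows = [quantize_weight_int4(row, group_size=2) for row in weights]
y_quant = linear_8da4w(x, quantized_rows, group_size=2)
y_float = [sum(w * a for w, a in zip(row, x)) for row in weights]
print("quantized:", y_quant)
print("float reference:", y_float)
```

The quantized result should track the float reference only approximately, since int4 weights are coarse. A real kernel would pack two int4 codes per byte and use integer matmul instructions, which this sketch does not attempt.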

@Xia-Weiwen
Collaborator

Hi @jerryzh168, I saw your pointer here:
https://github.com/pytorch/ao/tree/main/torchao/quantization#to-be-deprecated-a8w8-dynamic-quantization
However, we need 8da4w for CPU and XPU, and we would prefer that it not be deprecated. Do you have any concerns about keeping it? Thanks.

cc @jgong5 @leslie-fang-intel

@jerryzh168
Contributor

This was previously used for ExecuTorch, but it seems that people are adding kernels here: #880. We are open to adding kernels for this.

@Xia-Weiwen
Collaborator

> This was previously used for ExecuTorch, but it seems that people are adding kernels here: #880. We are open to adding kernels for this.

Got it. Thanks for the clarification!

@jerryzh168
Contributor

jerryzh168 commented Sep 13, 2024

@Xia-Weiwen that link is talking about the quantizer API. Since we are updating to the quantize_ API, we'll be using something like https://github.com/pytorch/ao/tree/main/torchao/quantization#a8w8-dynamic-quantization but with int8_dynamic_activation_int4_weight() as the second argument, i.e.:

quantize_(model, int8_dynamic_activation_int4_weight())

@Xia-Weiwen
Collaborator

Thanks for the pointers!

@sanchitintel
Contributor Author

Thanks for clarifying, @jerryzh168!

yanbing-j pushed a commit to yanbing-j/ao that referenced this issue Dec 9, 2024