Support `int_scaled_mm` on CPU #121
Conversation
@Xia-Weiwen, @cpuhrsch, can you also add
Hi @jgong5 @yanbing-j Could you please comment about FP8? Thanks.
At present, we are preparing to add CPU support of
Description

`int_scaled_mm` is currently supported on CUDA only. This PR adds support for CPU. The op is implemented via `torch._int_mm`, whose CPU version was recently added to PyTorch by pytorch/pytorch#121792. With this patch, SmoothQuant can use `int_scaled_mm` on CPU with Inductor.

Example code:
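A minimal sketch of what `int_scaled_mm` computes, expressed directly with `torch._int_mm` as described above. The helper name `int_scaled_mm_ref` and the `(M, 1)` shape of `scales` are illustrative assumptions, not the exact torchao signature:

```python
import torch

def int_scaled_mm_ref(a_int8: torch.Tensor, b_int8: torch.Tensor,
                      scales: torch.Tensor) -> torch.Tensor:
    # torch._int_mm performs an int8 x int8 -> int32 matmul;
    # the per-row scales are then applied in floating point.
    out_int32 = torch._int_mm(a_int8, b_int8)
    return out_int32.to(scales.dtype) * scales

# Example inputs: a is (M, K) int8, b is (K, N) int8, scales is (M, 1) fp32.
M, K, N = 32, 64, 32
a = torch.randint(-128, 127, (M, K), dtype=torch.int8)
b = torch.randint(-128, 127, (K, N), dtype=torch.int8)
scales = torch.rand(M, 1, dtype=torch.float32)
print(int_scaled_mm_ref(a, b, scales).shape)  # torch.Size([32, 32])
```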
Run with `TORCHAO_AUTOTUNER_ENABLE=1` and the following is found in the generated code:

Test plan
python test/kernel/test_autotuner.py -k test_int_scaled_mm
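As a sketch (assuming the `TORCHAO_AUTOTUNER_ENABLE` environment variable applies to the test process as well), the autotuned path can be exercised together with the new test:

```bash
# Enable the torchao autotuner and run the int_scaled_mm test from the test plan.
TORCHAO_AUTOTUNER_ENABLE=1 python test/kernel/test_autotuner.py -k test_int_scaled_mm
```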