[WIP] Int4Tensor refactor to implements pattern #458
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/458. Note: links to docs will display an error until the docs builds have completed. (This comment was automatically generated by Dr. CI and updates every 15 minutes.)
@melvinebenezer thanks for working on this, can you take a look at:

    return fn
    return decorator

    def _dynamically_quantize_per_channel_int4(x, quant_min, quant_max, target_dtype):
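The "implements pattern" named in the PR title can be sketched in plain Python: a decorator registers a handler per op in a dispatch table, and a single entry point routes each op to its handler, which is roughly how subclasses like NF4Tensor dispatch aten ops. The names below (`OP_TABLE`, `implements`, `dispatch`) are illustrative, not torchao's actual API.

```python
# Hypothetical sketch of the "implements" registration pattern for a
# tensor subclass. Names are illustrative, not torchao's real API.

OP_TABLE = {}  # maps op name -> handler function

def implements(op_names):
    """Register one handler function for one or more op names."""
    def decorator(fn):
        for name in op_names:
            OP_TABLE[name] = fn
        return fn       # same closing lines as the quoted fragment:
    return decorator    # `return fn` then `return decorator`

@implements(["aten.detach", "aten.clone"])
def passthrough(op, tensor):
    # For these ops the subclass can just return the data unchanged.
    return tensor

def dispatch(op, tensor):
    """Route an op to its registered handler, if any."""
    if op in OP_TABLE:
        return OP_TABLE[op](op, tensor)
    raise NotImplementedError(f"no implementation registered for {op}")
```

A real subclass would hook `dispatch` into `__torch_dispatch__`; the point here is only the decorator-driven registration table.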
also I feel this can probably be covered by AffineQuantizedTensor:

    Affine quantized tensor subclass. Affine quantization means we quantize the floating point tensor with an affine transformation:
dequantize_per_channel is calling our unified quant_primitive ops (ao/torchao/quantization/utils.py, line 260 in c2f9b84):

    def dequantize_per_channel(int_repr, scales, zero_points, out_dtype=torch.float32):
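The math behind that primitive is the affine transformation mentioned above: per channel c, `x_float = (x_int - zero_point[c]) * scale[c]`. Here is a plain-Python illustration with list inputs; it is a sketch of the formula, not torchao's actual implementation.

```python
# Sketch of per-channel affine dequantization: each channel (row) has its
# own scale and zero point, and x_float = (x_int - zero_point) * scale.

def dequantize_per_channel(int_repr, scales, zero_points):
    """int_repr: rows indexed by channel; one scale/zero_point per channel."""
    return [
        [(q - zp) * s for q in row]
        for row, s, zp in zip(int_repr, scales, zero_points)
    ]

# Example: two channels holding int4-range values (0..15).
deq = dequantize_per_channel(
    [[0, 7, 15], [3, 8, 12]],   # quantized values, one row per channel
    [0.5, 0.25],                # per-channel scales
    [8, 8],                     # per-channel zero points
)
# deq[0] == [-4.0, -0.5, 3.5]
```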
Working on this.
Todo
@melvinebenezer we have landed the Uintx tensor subclass to affine quantized tensor recently: https://github.com/pytorch/ao/blob/main/torchao/dtypes/uintx/Uintx.py. We now have a UintxTensor here: ao/torchao/dtypes/uintx/Uintx.py (line 38 in ba2d3b1). UintxTensor might be helpful (and add some tests for the new ops).
@jerryzh168 Sure, will make the changes.
@jerryzh168 Shall I close this draft and work on a new one?
Yeah, opening a new one makes more sense I think. You can add tests in https://github.com/pytorch/ao/blob/main/test/dtypes/test_uintx.py. I'm not sure if it's possible to migrate all the implementations over; I'd suggest starting with a few tests like slicing tests, e.g. copy-paste the test at line 119 in e15e509 and do: sliced_weight = l.weight[1:2]. Although I'm not sure which ops make sense or are useful right now, so feel free to just add a few popular ops for now. I'm also planning to put up a test suite for tensor subclasses this week.
Another addition that might be helpful is to support distributed inference for uintx tensors. I'm starting a PR here: #785 and will likely have something ready soon, so that you can follow it to extend.
@jerryzh168 Yes, makes total sense. Will continue the conversation in Discord and the new PR.
Refactoring UInt4Tensor to have the implements pattern, similar to nf4tensor and UInt2Tensor.

ToDo
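One reason sub-byte subclasses like UInt4Tensor/UintxTensor need op-by-op implementations (including the slicing test suggested above) is that two 4-bit values share each storage byte, so element boundaries fall inside bytes. A hedged plain-Python sketch of that nibble packing, with illustrative helper names that are not torchao's API:

```python
# Illustrative uint4 nibble packing: two values in [0, 15] per byte,
# low nibble first. Not torchao's actual layout, just the idea.

def pack_uint4(values):
    """Pack an even-length list of ints in [0, 15] into bytes."""
    assert len(values) % 2 == 0
    return bytes(
        (values[i] & 0xF) | ((values[i + 1] & 0xF) << 4)
        for i in range(0, len(values), 2)
    )

def unpack_uint4(packed):
    """Recover the flat list of 4-bit values from packed bytes."""
    out = []
    for b in packed:
        out.append(b & 0xF)         # low nibble
        out.append((b >> 4) & 0xF)  # high nibble
    return out

vals = [1, 15, 0, 9]
packed = pack_uint4(vals)
assert unpack_uint4(packed) == vals
# A slice like the suggested l.weight[1:2] cannot index the raw bytes
# directly; it has to unpack (or do nibble arithmetic) first:
assert unpack_uint4(packed)[1:3] == [15, 0]
```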