
[WIP] Int4Tensor refactor to implements pattern #458

Closed
wants to merge 6 commits

Conversation

melvinebenezer
Contributor

@melvinebenezer melvinebenezer commented Jun 28, 2024

Refactoring UInt4Tensor to use the implements pattern, similar to nf4tensor and UInt2Tensor.

ToDo

  • Create implements for UInt4Tensor and PerChannelSymmetricWeight
  • Test Cases
  • Move uint4i to uint4.py
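The "implements pattern" referred to above registers per-op overrides in a dispatch table that the tensor subclass consults from `__torch_dispatch__`. A minimal, torch-free sketch of the registry mechanics (all names here are illustrative, not torchao's actual API):

```python
# Torch-free sketch of the "implements" dispatch pattern used by tensor
# subclasses such as nf4tensor. All names below are illustrative.

UINT4_OP_TABLE = {}  # maps an op key -> its registered override function

def implements(ops):
    """Decorator factory: register `fn` as the implementation of each op in `ops`."""
    def decorator(fn):
        for op in ops:
            UINT4_OP_TABLE[op] = fn
        return fn
    return decorator

class UInt4TensorLike:
    """Stand-in for the subclass; real code dispatches from __torch_dispatch__."""
    def __init__(self, data):
        self.data = data

    @classmethod
    def dispatch(cls, op, *args, **kwargs):
        if op not in UINT4_OP_TABLE:
            raise NotImplementedError(f"no implementation registered for {op!r}")
        return UINT4_OP_TABLE[op](*args, **kwargs)

@implements(["detach", "clone"])
def uint4_copy_like(tensor):
    # one override can serve several ops, since the decorator takes a list
    return UInt4TensorLike(list(tensor.data))
```

Unregistered ops fall through to `NotImplementedError`, which is what makes the table an explicit, auditable list of supported operations.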


pytorch-bot bot commented Jun 28, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/458

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the "CLA Signed" label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) — Jun 28, 2024
@jerryzh168
Contributor

jerryzh168 commented Jun 28, 2024

@melvinebenezer thanks for working on this! Can you take a look at the Developer API section in #391? It has the structure that a new dtype tensor subclass can use. The current UInt4Tensor was created before this doc was available, so now we can indeed work on improving it to align with our plan.

Review thread on this diff hunk:

```python
        return fn
    return decorator

def _dynamically_quantize_per_channel_int4(x, quant_min, quant_max, target_dtype):
```
Contributor

also I feel this can probably be covered by AffineQuantizedTensor:

> Affine quantized tensor subclass. Affine quantization means we quantize the floating point tensor with an affine transformation.

and the previous dequantize_per_channel is calling our unified quant_primitive ops:

```python
def dequantize_per_channel(int_repr, scales, zero_points, out_dtype=torch.float32):
```
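For context, per-channel affine dequantization recovers floats as `(int_repr - zero_point) * scale` with one scale and zero point per channel. A torch-free, list-based sketch of that primitive (the real op works on tensors and also takes an `out_dtype`):

```python
def dequantize_per_channel(int_repr, scales, zero_points):
    """List-based sketch: int_repr holds one row of quantized ints per channel;
    scales and zero_points hold one value per channel.
    float_val = (quantized_val - zero_point) * scale
    """
    return [
        [(q - zp) * scale for q in row]
        for row, scale, zp in zip(int_repr, scales, zero_points)
    ]
```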

Contributor Author

Working on this

@melvinebenezer
Contributor Author

Todo

  • Use AffineQuantizedTensor
  • Convert test cases to pytest to maintain uniformity

@jerryzh168
Contributor

jerryzh168 commented Aug 30, 2024

@melvinebenezer we have recently landed the Uintx tensor subclass into affine quantized tensor: https://github.com/pytorch/ao/blob/main/torchao/dtypes/uintx/Uintx.py

we now have a UintxTensor here:

```python
class UintxTensor(torch.Tensor):
```

so the uint4 tensor can probably be deprecated for now. I think moving the relevant operator implementations over to UintxTensor might be helpful (and add some tests for the new ops).
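At its core, a uintx tensor bit-packs sub-byte values; for uint4, two values share each byte. A hedged, torch-free sketch of the idea (not UintxTensor's actual layout or API):

```python
def pack_uint4(values):
    """Pack an even-length list of 4-bit values (0..15) into bytes,
    low nibble first. Illustrative only, not UintxTensor's real layout."""
    assert len(values) % 2 == 0 and all(0 <= v < 16 for v in values)
    return bytes(values[i] | (values[i + 1] << 4) for i in range(0, len(values), 2))

def unpack_uint4(packed):
    """Inverse of pack_uint4: recover the original 4-bit values."""
    out = []
    for b in packed:
        out.extend((b & 0x0F, b >> 4))
    return out
```

Round-tripping `unpack_uint4(pack_uint4(v))` is the kind of invariant the new tests could pin down.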

@melvinebenezer
Contributor Author

@jerryzh168 Sure, will make the changes

@melvinebenezer
Contributor Author

@jerryzh168 shall I close this draft and work on a new one?

@jerryzh168
Contributor

jerryzh168 commented Sep 3, 2024

> @jerryzh168 shall I close this draft and work on a new one?

yeah, opening a new one makes more sense I think. You can add tests in https://github.com/pytorch/ao/blob/main/test/dtypes/test_uintx.py

I'm not sure if it's possible to migrate all the implementations over; I'd suggest starting with a few tests, like slicing tests etc.

e.g. copy-paste the test:

```python
uintx_weight_only(dtype)(l)
```

and do: `sliced_weight = l.weight[1:2]`. I'm not sure which ops make sense or are useful right now, though, so feel free to just add a few popular ops for now. I'm also planning to put up a test suite for tensor subclasses this week.
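One property a slicing test can pin down: slicing rows of the packed weight and then unpacking should equal unpacking everything and then slicing. A torch-free sketch of that check (the real test would slice `l.weight[1:2]` on the quantized module; the helpers here are hypothetical stand-ins for a packed layout):

```python
def pack_row(row):
    """Pack one row of uint4 values (0..15, even length) into bytes, low nibble first."""
    assert len(row) % 2 == 0
    return bytes(row[i] | (row[i + 1] << 4) for i in range(0, len(row), 2))

def unpack_row(row_bytes):
    """Recover the uint4 values of one packed row."""
    out = []
    for b in row_bytes:
        out.extend((b & 0x0F, b >> 4))
    return out

def check_row_slicing(rows, start, stop):
    """Slice-then-unpack must match unpack-then-slice, and both match the source rows."""
    packed = [pack_row(r) for r in rows]
    sliced_then_unpacked = [unpack_row(r) for r in packed[start:stop]]
    unpacked_then_sliced = [unpack_row(r) for r in packed][start:stop]
    return sliced_then_unpacked == unpacked_then_sliced == rows[start:stop]
```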

another addition that might be helpful is supporting distributed inference for uintx tensors. I'm starting a PR here: #785 and will likely have something ready soon, so that you can follow it to extend UintxAQTLayout to support distributed inference.

@melvinebenezer
Contributor Author

@jerryzh168 Yes, makes total sense. Will continue the conversation in Discord and the new PR.

@melvinebenezer melvinebenezer mentioned this pull request Oct 7, 2024
4 tasks
Labels: CLA Signed
3 participants