
[quantize] All dimensions should be divisible by 32 for now #279

Closed

LMSPaul opened this issue Jan 10, 2024 · 1 comment

Comments

LMSPaul commented Jan 10, 2024

MLX doesn't seem to work when quantizing any model. I ran this command:

```
python convert.py --hf-path openchat/openchat-3.5-0106 -q
```

And I got this error:

```
Traceback (most recent call last):
  File "/Users/personal/Downloads/mlx-examples-main/lora/convert.py", line 89, in <module>
    weights, config = quantize(weights, config, args)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/personal/Downloads/mlx-examples-main/lora/convert.py", line 21, in quantize
    nn.QuantizedLinear.quantize_module(model, args.q_group_size, args.q_bits)
  File "/Users/personal/Downloads/mlx-examples-main/lora/myenv/lib/python3.11/site-packages/mlx/nn/layers/quantized.py", line 124, in quantize_module
    leaves = tree_map(_quantize_if_linear, leaves, is_leaf=Module.is_module)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/personal/Downloads/mlx-examples-main/lora/myenv/lib/python3.11/site-packages/mlx/utils.py", line 49, in tree_map
    return {
           ^
  File "/Users/personal/Downloads/mlx-examples-main/lora/myenv/lib/python3.11/site-packages/mlx/utils.py", line 50, in <dictcomp>
    k: tree_map(fn, child, *(r[k] for r in rest), is_leaf=is_leaf)
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/personal/Downloads/mlx-examples-main/lora/myenv/lib/python3.11/site-packages/mlx/utils.py", line 41, in tree_map
    return fn(tree, *rest)
           ^^^^^^^^^^^^^^^
  File "/Users/personal/Downloads/mlx-examples-main/lora/myenv/lib/python3.11/site-packages/mlx/nn/layers/quantized.py", line 119, in _quantize_if_linear
    return cls.from_linear(m, group_size, bits)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/personal/Downloads/mlx-examples-main/lora/myenv/lib/python3.11/site-packages/mlx/nn/layers/quantized.py", line 100, in from_linear
    ql = cls(input_dims, output_dims, False, group_size, bits)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/personal/Downloads/mlx-examples-main/lora/myenv/lib/python3.11/site-packages/mlx/nn/layers/quantized.py", line 58, in __init__
    self.weight, self.scales, self.biases = mx.quantize(weight, group_size, bits)
                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: [quantize] All dimensions should be divisible by 32 for now
```
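The same error can be reproduced in isolation with `mx.quantize` on a 2-D array that has a dimension which is not a multiple of 32 (a minimal sketch; the shape here is arbitrary, not taken from the model):

```python
import mlx.core as mx

# 4096 is a multiple of 32, but 100 is not, so this should raise
# "ValueError: [quantize] All dimensions should be divisible by 32 for now"
w = mx.random.normal((100, 4096))
mx.quantize(w, group_size=32, bits=4)
```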

awni (Member) commented Jan 10, 2024

Quantization works with many models, but not all. Specifically, if any weight matrix in the model has a dimension that is not divisible by 32, quantization will fail. Fixing this is on our roadmap. I am closing this as a duplicate of ml-explore/mlx#328.
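One way to check whether a particular checkpoint is affected is to scan its weight shapes before quantizing. This is a hedged sketch, not part of convert.py; it assumes the converted weights were saved to a local `weights.npz` file:

```python
# Hedged sketch: report weight matrices whose dimensions are not all
# multiples of 32, i.e. the ones mx.quantize currently rejects.
# "weights.npz" is an assumed path, not something convert.py guarantees.
import numpy as np

weights = np.load("weights.npz")
for name, w in weights.items():
    if w.ndim == 2 and any(d % 32 != 0 for d in w.shape):
        print(f"{name}: shape {tuple(w.shape)} not divisible by 32")
```

For openchat-3.5-0106 the offending matrices are most likely the embedding and output projections, since that model's vocabulary size does not appear to be a multiple of 32.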
