[Bug] Cannot build --quantization q3f16_1 model (#1005)
Comments
I'm not exactly sure what happened, but it seems your codebase is a bit outdated. Would you like to use the latest TVM Unity? https://llm.mlc.ai/docs/install/tvm.html#option-1-prebuilt-package
@junrushao I have already reinstalled TVM Unity via that command. FYI, it works well when I compile the model with q4f16_0.
cc @vinx13 can you help look into this one?
@vinx13 Is there any update on this? Thanks a lot 🙇♂️
The regression is caused by apache/tvm#15665. I'm still working on a fix; as a workaround, you can also revert it for now.
@vinx13 For the workaround, should I build TVM from source at a commit before apache/tvm#15665? The same error occurs after I install TVM from the reverted build and run the model compile script.
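For reference, a minimal sketch of the revert-and-rebuild workaround discussed above, assuming a source build of TVM Unity. The merge commit of apache/tvm#15665 is not stated in the thread, so `<MERGE_COMMIT>` below is a placeholder you would need to look up yourself:

```shell
# Sketch only: <MERGE_COMMIT> is a placeholder for the merge commit of
# apache/tvm#15665; it is not given in this thread.
git clone --recursive https://github.com/apache/tvm.git tvm-unity
cd tvm-unity
git checkout unity
git revert --no-edit <MERGE_COMMIT>

# Standard CMake source build, per https://llm.mlc.ai/docs/install/tvm.html
mkdir build && cd build
cp ../cmake/config.cmake .
cmake .. && make -j"$(nproc)"
```

After the build finishes, point `PYTHONPATH` (or a `pip install -e`) at the rebuilt TVM before rerunning the `mlc_llm.build` command.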
I applied the fix on the head of the unity branch.
I cannot see the fix commit 🤔
@vinx13 could you also cherry-pick this fix to the mlc branch?
I sent a fix in apache/tvm#15881; it will be cherry-picked once merged.
@vinx13 I have installed TVM based on your fix commit dc53a6c29, but the same problem still occurs when I try to compile q3f16. Have I done anything wrong?
I ran
It works on my side now! Thank you a lot @vinx13 🔥
🐛 Bug
I met an error while building q3f16_1 vicuna-7b-v1.5; the same error also occurs when I try to build q3f16_1 for other models.
To Reproduce
python3 -m mlc_llm.build --hf-path lmsys/vicuna-7b-v1.5 --target iphone --max-seq-len 4096 --quantization q3f16_1
Error Message