
ImportError: cannot import name '_grouped_size_compiled_for_decode_kernels' from 'flashinfer.decode' #549

Open
Hutlustc opened this issue Oct 23, 2024 · 4 comments

Comments

@Hutlustc

I built flashinfer from source; the version is 0.1.6:

pip install -e .
Obtaining file:///home/hutl/api/flashinfer/python
  Preparing metadata (setup.py) ... done
Installing collected packages: flashinfer
  DEPRECATION: Legacy editable install of flashinfer==0.1.6 from file:///home/hutl/api/flashinfer/python (setup.py develop) is deprecated. pip 25.0 will enforce this behaviour change. A possible replacement is to add a pyproject.toml or enable --use-pep517, and use setuptools >= 64. If the resulting installation is not behaving as expected, try using --config-settings editable_mode=compat. Please consult the setuptools documentation for more information. Discussion can be found at https://github.com/pypa/pip/issues/11457
  Running setup.py develop for flashinfer
Successfully installed flashinfer-0.1.6

When I ran the sglang demo, it gave me this error:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/hutl/api/sglang/python/sglang/launch_server.py", line 6, in <module>
    from sglang.srt.server import launch_server
  File "/home/hutl/api/sglang/python/sglang/srt/server.py", line 49, in <module>
    from sglang.srt.managers.data_parallel_controller import (
  File "/home/hutl/api/sglang/python/sglang/srt/managers/data_parallel_controller.py", line 29, in <module>
    from sglang.srt.managers.scheduler import run_scheduler_process
  File "/home/hutl/api/sglang/python/sglang/srt/managers/scheduler.py", line 61, in <module>
    from sglang.srt.managers.tp_worker import TpModelWorker
  File "/home/hutl/api/sglang/python/sglang/srt/managers/tp_worker.py", line 27, in <module>
    from sglang.srt.model_executor.model_runner import ModelRunner
  File "/home/hutl/api/sglang/python/sglang/srt/model_executor/model_runner.py", line 44, in <module>
    from sglang.srt.layers.attention.flashinfer_backend import FlashInferAttnBackend
  File "/home/hutl/api/sglang/python/sglang/srt/layers/attention/flashinfer_backend.py", line 33, in <module>
    from flashinfer.decode import _grouped_size_compiled_for_decode_kernels
ImportError: cannot import name '_grouped_size_compiled_for_decode_kernels' from 'flashinfer.decode' (/home/hutl/api/flashinfer/python/flashinfer/decode.py)

I tried reconfiguring the environment, but it didn't work.

Are there any other possible solutions?

@yzh119
Collaborator

yzh119 commented Oct 23, 2024

That function was removed recently on mainline (with the new JIT feature, all group sizes can be compiled just-in-time). I can add it back for backward compatibility, but it's better not to rely on this function in sglang.

We can use a heuristic to decide whether to use tensor cores. @hnyls2002 @merrymercy WDYT?
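One such heuristic, sketched here (the function name and threshold are assumptions, not a committed policy): decode is bandwidth-bound for small GQA group sizes, so tensor cores only pay off once enough query heads share each KV head.

```python
def should_use_tensor_cores(num_qo_heads: int, num_kv_heads: int) -> bool:
    """Decide whether the decode path should use tensor-core kernels.

    Hypothetical heuristic: with MQA/GQA, a large group size gives the
    tensor cores enough work per KV-cache load; plain MHA (group size 1)
    stays on the CUDA-core decode kernels.
    """
    gqa_group_size = num_qo_heads // num_kv_heads
    return gqa_group_size >= 4  # threshold is illustrative
```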

@yzh119
Collaborator

yzh119 commented Oct 23, 2024

Or you can install our pre-built wheels from PyPI, or install from the v0.1.6 source at https://github.com/flashinfer-ai/flashinfer/tree/v0.1.6
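Concretely, the two options might look like this (the wheel index URL and the CUDA/torch versions below are assumptions; match them to your environment per the flashinfer install docs):

```shell
# Option 1: pre-built wheel (pick the index matching your CUDA/torch):
pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/

# Option 2: build the pinned v0.1.6 tag from source:
git clone --recursive -b v0.1.6 https://github.com/flashinfer-ai/flashinfer.git
cd flashinfer/python
pip install -e .
```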

@Hutlustc
Author

> Or you can install our pre-built wheels from PyPI, or install from the v0.1.6 source at https://github.com/flashinfer-ai/flashinfer/tree/v0.1.6

Thank you very much, I will try this method and hope it works.

@merrymercy

merrymercy commented Oct 26, 2024

@yzh119 Can you provide a utility function inside flashinfer to decide whether to use tensor cores?
Or can you make this decision automatically inside flashinfer when I pass use_tensor_cores="auto"?
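Inside flashinfer, such an "auto" mode could resolve to a concrete choice roughly like this sketch (the function name and the fallback rule are assumptions, not the shipped API):

```python
def resolve_use_tensor_cores(use_tensor_cores, num_qo_heads, num_kv_heads):
    """Resolve a user-facing use_tensor_cores setting to a concrete bool.

    Hypothetical dispatch: explicit booleans pass through unchanged;
    "auto" falls back to a GQA group-size heuristic (illustrative
    threshold, not flashinfer's actual policy).
    """
    if isinstance(use_tensor_cores, bool):
        return use_tensor_cores
    if use_tensor_cores == "auto":
        return num_qo_heads // num_kv_heads >= 4
    raise ValueError(f"unsupported use_tensor_cores value: {use_tensor_cores!r}")
```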
