
Issue with GPTQ #274

Open
d0lphin opened this issue Jul 5, 2023 · 1 comment
Comments


d0lphin commented Jul 5, 2023

I have the following problem:

```shell
model=Honkware/openchat_8192-GPTQ

text-generation-launcher --model-id $model --num-shard 1 --quantize gptq --port 8080
```

```
Traceback (most recent call last):
  File "/home/abalogh/anaconda3/envs/text-generation-inference/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
             ^^^^^
  File "/home/abalogh/work/work/contrib/text-generation-inference/server/text_generation_server/cli.py", line 78, in serve
    server.serve(
  File "/home/abalogh/work/work/contrib/text-generation-inference/server/text_generation_server/server.py", line 166, in serve
    asyncio.run(
  File "/home/abalogh/anaconda3/envs/text-generation-inference/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/home/abalogh/anaconda3/envs/text-generation-inference/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/abalogh/anaconda3/envs/text-generation-inference/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/home/abalogh/work/work/contrib/text-generation-inference/server/text_generation_server/server.py", line 133, in serve_inner
    model = get_model(
            ^^^^^^^^^^
  File "/home/abalogh/work/work/contrib/text-generation-inference/server/text_generation_server/models/__init__.py", line 215, in get_model
    return FlashLlama(
           ^^^^^^^^^^^
  File "/home/abalogh/work/work/contrib/text-generation-inference/server/text_generation_server/models/flash_llama.py", line 65, in __init__
    model = FlashLlamaForCausalLM(config, weights)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/abalogh/work/work/contrib/text-generation-inference/server/text_generation_server/models/custom_modeling/flash_llama_modeling.py", line 366, in __init__
    self.model = FlashLlamaModel(config, weights)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/abalogh/work/work/contrib/text-generation-inference/server/text_generation_server/models/custom_modeling/flash_llama_modeling.py", line 305, in __init__
    [
  File "/home/abalogh/work/work/contrib/text-generation-inference/server/text_generation_server/models/custom_modeling/flash_llama_modeling.py", line 306, in <listcomp>
    FlashLlamaLayer(
  File "/home/abalogh/work/work/contrib/text-generation-inference/server/text_generation_server/models/custom_modeling/flash_llama_modeling.py", line 241, in __init__
    self.self_attn = FlashLlamaAttention(
                     ^^^^^^^^^^^^^^^^^^^^
  File "/home/abalogh/work/work/contrib/text-generation-inference/server/text_generation_server/models/custom_modeling/flash_llama_modeling.py", line 116, in __init__
    self.query_key_value = TensorParallelColumnLinear.load_multi(
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/abalogh/work/work/contrib/text-generation-inference/server/text_generation_server/utils/layers.py", line 238, in load_multi
    weight = weights.get_multi_weights_col(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/abalogh/work/work/contrib/text-generation-inference/server/text_generation_server/utils/weights.py", line 123, in get_multi_weights_col
    bits = self.get_tensor("gptq_bits").item()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/abalogh/work/work/contrib/text-generation-inference/server/text_generation_server/utils/weights.py", line 62, in get_tensor
    filename, tensor_name = self.get_filename(tensor_name)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/abalogh/work/work/contrib/text-generation-inference/server/text_generation_server/utils/weights.py", line 49, in get_filename
    raise RuntimeError(f"weight {tensor_name} does not exist")
RuntimeError: weight gptq_bits does not exist
```
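For context: the traceback shows the loader calling `get_tensor("gptq_bits")`, i.e. it expects the checkpoint to contain a tensor literally named `gptq_bits`, and the lookup over the downloaded shards fails. A quick way to check whether a local checkpoint actually declares that tensor is to read the safetensors header (an 8-byte little-endian length prefix followed by a UTF-8 JSON map of tensor names). This is a stdlib-only diagnostic sketch, not TGI code; `has_tensor`, `find_tensor`, and the local directory path are illustrative names:

```python
import json
import struct
from pathlib import Path


def has_tensor(safetensors_path, tensor_name):
    """Return True if the .safetensors file declares a tensor with this name.

    Reads only the header: 8 bytes of little-endian header length,
    then that many bytes of UTF-8 JSON mapping tensor names to metadata.
    """
    with open(safetensors_path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return tensor_name in header


def find_tensor(model_dir, tensor_name):
    """List the shard filenames in a local model directory that declare the tensor."""
    return [
        p.name
        for p in sorted(Path(model_dir).glob("*.safetensors"))
        if has_tensor(p, tensor_name)
    ]
```

If `find_tensor("<local snapshot dir>", "gptq_bits")` comes back empty for this model, the `RuntimeError` above follows directly. Many GPTQ checkpoints (e.g. AutoGPTQ exports) appear to record the bit width in a `quantize_config.json` file rather than as a `gptq_bits` tensor, which would explain why this checkpoint trips the lookup.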

@GemsFord

same here
