The same input produces very different outputs with different data types. The bf16_fp16 output matches the torch output.
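BF16_FP16 datatype:
python demo.py -t /mnt/data/LLM_Models/Qwen-14B-Chat/ -m /mnt/data/LLM_Models/Qwen-14B-Chat/cpu/ --do_sample False --output_len 512 --dtype bf16_fp16
[INFO] xfastertransformer is not installed in pip, using source code.
[INFO] SINGLE_INSTANCE MODE.
大车行走速度在正常情况下,速度的设定分为10%、20%、50%和80%,具体数值取决于实际情况和需求。
(English translation of the output: "Under normal conditions, the travel speed of the large vehicle is set at 10%, 20%, 50%, or 80%; the exact value depends on the actual situation and requirements.")

FP16 datatype:
python demo.py -t /mnt/data/LLM_Models/Qwen-14B-Chat/ -m /mnt/data/LLM_Models/Qwen-14B-Chat/cpu/ --do_sample False --output_len 512 --dtype fp16
[INFO] xfastertransformer is not installed in pip, using source code.
[INFO] SINGLE_INSTANCE MODE.
大,
(English translation of the output: "Large,")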
git log:
commit 1d1c483 (HEAD -> main, origin/main, origin/HEAD)
Author: marvinYu <weifei.yu@intel.com>
Date: Tue Feb 6 13:13:36 2024 +0800

    [ci] Add workflow permission. (#218)
With dtype bf16, the output is the same as with fp16. It looks like generate() stopped after the first token.
python demo2.py -t /mnt/data/LLM_Models/Qwen-14B-Chat/ -m /mnt/data/LLM_Models/Qwen-14B-Chat/cpu/ --do_sample False --output_len 512 --dtype bf16
[INFO] xfastertransformer is not installed in pip, using source code.
[INFO] SINGLE_INSTANCE MODE.
大,
(English translation of the output: "Large,")
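To pin down where the runs diverge, it may help to compare the outputs position by position. The following is only a minimal sketch, not part of xFasterTransformer or of demo.py: `first_divergence` is a hypothetical helper, and the two strings are just the leading characters of the outputs quoted above. It works equally on lists of token IDs, if those are available.

```python
from itertools import zip_longest

# Sentinel used to pad the shorter sequence, so that a length mismatch
# is reported as a divergence at the first missing position.
_MISSING = object()


def first_divergence(reference, candidate):
    """Return the index of the first position where the two sequences
    differ, or None if they are identical.

    Hypothetical helper for illustration; accepts strings or lists of
    token IDs.
    """
    pairs = zip_longest(reference, candidate, fillvalue=_MISSING)
    for index, (ref_item, cand_item) in enumerate(pairs):
        if ref_item != cand_item:
            return index
    return None


# Leading characters of the outputs quoted in this issue.
bf16_fp16_output = "大车行走速度"
fp16_output = "大,"

divergence_index = first_divergence(bf16_fp16_output, fp16_output)
print("first differing position:", divergence_index)
```

For the outputs above, the first character agrees and the second does not, which is consistent with the observation that generation goes wrong right after the first token.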