Llama3.1 support removed? #2121
Comments
Tagging @kaiyux
Running into the same issue, bumped it to
I've ignored the pip warnings and I am able to create engines again.
Hi @dhruvmullick, thanks for reporting this issue. We also mentioned that the transformers version needs to be upgraded to run Llama 3.1. Maybe that's not very clear at the moment; we'll try to improve the quality of the docs. In short, TRT-LLM didn't remove Llama 3.1 support, but it requires upgrading transformers to 4.43+ to run.
@nv-guomingz do you plan to support it again without needing to bump the transformers version?
@nv-guomingz thank you for the note!
Yes, we're testing transformers 4.44.0 functionality in internal CI, and we'll update the transformers version if it passes testing.
System Info
A100
Who can help?
@byshiue
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
Reproduction
Build from source: https://github.com/triton-inference-server/tensorrtllm_backend
Install the latest transformers version supported by tensorrt-llm
Expected behavior
transformers 4.43.1 should be supported by tensorrt_llm, hence supporting Llama 3.1.
This is as per https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct, which says the transformers version should be > 4.43.
actual behavior
Llama 3.1 is not supported because transformers must satisfy >=4.38.2,<=4.42.4 for this version of tensorrt_llm.
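The conflict can be demonstrated outside of pip with the `packaging` library; a small sketch, where the pin string is taken from the constraint above and the 4.43.1 figure from the model card:

```python
from packaging.specifiers import SpecifierSet
from packaging.version import Version

# Pin declared by this tensorrt_llm release (from the resolver constraint above)
pin = SpecifierSet(">=4.38.2,<=4.42.4")

# Minimum transformers version the Llama 3.1 model card asks for
llama31_needs = Version("4.43.1")

print(llama31_needs in pin)  # False: the pin excludes the version Llama 3.1 needs
```

Any transformers release new enough for Llama 3.1 therefore falls outside the allowed range, which is why pip refuses the combination.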
additional notes
As per #2008, transformers 4.43.1 was supported, allowing use of Llama 3.1.
However, in the latest TensorRT-LLM, this transformers version is no longer supported (#2094).
Why was support for Llama 3.1 removed? Was there a bug?