-
Notifications
You must be signed in to change notification settings - Fork 229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft: feat: Support DBRX model in Llama #462
base: master
Are you sure you want to change the base?
Draft: feat: Support DBRX model in Llama #462
Conversation
f852b16
to
27b9d62
Compare
+ "Generation speed is significantly faster than LLaMA2-70B, while at the same time " | ||
+ "beating other open source models, such as, LLaMA2-70B, Mixtral, and Grok-1 on " | ||
+ "language understanding, programming, math, and logic.", | ||
PromptTemplate.LLAMA, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it uses ChatML prompt template - PromptTemplate.CHAT_ML
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx, done
Since the change was recent, we need to update the llama.cpp submodule as well |
27b9d62
to
3bc8480
Compare
Done |
I'll try running the model locally soon and see if any other changes are necessary |
Great! But in this PR I have to implement downloading all 10 files first I guess... 😅 |
4fb52b2
to
7aa08d9
Compare
7aa08d9
to
05cdeed
Compare
05cdeed
to
c87c1b1
Compare
@phymbert I can download https://huggingface.co/phymbert/dbrx-16x12b-instruct-iq3_xxs-gguf without login in the browser, but inside the plugin I get 403 Forbidden, is this to be expected with the |
Dbrx is a gated model, so I believe you have to pass a read token. There is an issue open on llama.cpp to support this. |
7683b53
to
6479604
Compare
e57fa37
to
c417cca
Compare
ea9d9ee
to
b4dfde3
Compare
The new Open Source model DBRX sounds amazing, is this enough and correct to integrate it into Llama?
ggerganov/llama.cpp#6515
https://huggingface.co/collections/phymbert/dbrx-16x12b-instruct-gguf-6619a7a4b7c50831dd33c7c8
https://www.databricks.com/blog/announcing-dbrx-new-standard-efficient-open-source-customizable-llms
https://github.com/databricks/dbrx
https://huggingface.co/collections/databricks/
llama.cpp seems to support splitted/sharded files, but I would need to download all of them first I suppose... 😅