support gemma2 in pytorch engine #1924
Conversation
@@ -1127,11 +1127,13 @@ def __init__(self,
    eoh='<end_of_turn>\n',
    assistant='<start_of_turn>model\n',
    eoa='<end_of_turn>\n',
    stop_words=['<end_of_turn>'],
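The diff above adds Gemma2's turn markers and stop words to the chat template. As a minimal sketch of how such a template fits together (the class and method names here are hypothetical, not lmdeploy's actual API), a Gemma-style template wraps each user message in turn markers and exposes `stop_words` so the engine knows when to halt generation:

```python
# Hypothetical sketch of a Gemma-style chat template: the names below are
# illustrative, not the actual lmdeploy implementation.
class GemmaChatTemplate:
    def __init__(self,
                 user='<start_of_turn>user\n',
                 eoh='<end_of_turn>\n',
                 assistant='<start_of_turn>model\n',
                 eoa='<end_of_turn>\n',
                 stop_words=None):
        self.user = user
        self.eoh = eoh
        self.assistant = assistant
        self.eoa = eoa
        # generation halts when any of these strings is produced
        self.stop_words = stop_words or ['<end_of_turn>']

    def build_prompt(self, message: str) -> str:
        """Wrap a single user message in Gemma's turn markers,
        leaving the prompt open at the model turn."""
        return f'{self.user}{message}{self.eoh}{self.assistant}'


tpl = GemmaChatTemplate()
print(tpl.build_prompt('Hello'))
```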
Does Gemma use the stop_words too?
Yes, I believe so. Generating the EOS token should mean stop.
Gemma2 requires transformers_version "4.42.0.dev0".
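Since loading Gemma2 with an older transformers release fails, a simple guard can verify the installed version before loading. This is an illustrative sketch (the helper names are hypothetical, and "4.42.0" is taken as the minimum from the comment above):

```python
# Hypothetical version guard: check that the installed transformers version
# is new enough for Gemma2 (>= 4.42.0, per the discussion above).
def parse_version(v: str) -> tuple:
    """Parse 'X.Y.Z' into an int tuple, ignoring any '.devN' or 'rcN' suffix."""
    core = v.split('.dev')[0].split('rc')[0]
    return tuple(int(p) for p in core.split('.') if p.isdigit())


def transformers_supports_gemma2(installed: str, required: str = '4.42.0') -> bool:
    """Return True if the installed version meets the minimum requirement."""
    return parse_version(installed) >= parse_version(required)


print(transformers_supports_gemma2('4.42.3'))
print(transformers_supports_gemma2('4.41.0'))
```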
Got a failure when chatting with
@zhulinJulia24 may put
We have not tested other models with transformers 4.42.3.
@zhulinJulia24 Could you help run a full test with
Fixed.
Hi @grimoire, does the implementation support an 8k context?
@zhyncs Soft-capping is not supported yet.
OK. Is there a plan to support it, and when is it expected?
Supporting soft-capping requires updating the attention kernel. Adding new features to the kernel is not difficult, but for the sake of stability I will not treat support for new features as the highest priority.
Gemma and Gemma2 code share a lot in common.