Granite and GraniteMoE models. #1818

Closed
janimo wants to merge 4 commits

Conversation

@janimo (Contributor) commented Oct 27, 2024

ROUGE scores are good, but the logit values are very off.

@merrymercy (Contributor)

Let us know when it is ready to be merged.

@janimo (Contributor, Author) commented Nov 8, 2024

I still need to debug why the logits in the ONLY_RUN test go so far over the thresholds (e.g. 60 instead of << 1 as in the other models) even though the ROUGE scores match. You can merge if you wish and I can continue debugging, or I'll update the PR when I find the cause. I am not sure whether implementation details alone should cause such large differences.
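
(For illustration, a minimal sketch of the kind of check being discussed here: comparing a generation against a reference with ROUGE-L while also tracking the maximum absolute logit difference between two implementations. The function, names, and thresholds below are assumptions for the example, not the actual SGLang test code.)

```python
# Illustrative sketch only: not the actual SGLang accuracy test, just an
# example of comparing ROUGE-L similarity and raw logit values between two
# implementations (e.g. SGLang vs. HuggingFace transformers).
import torch
from rouge_score import rouge_scorer  # pip install rouge-score


def compare_outputs(reference_text, candidate_text,
                    reference_logits, candidate_logits,
                    rouge_threshold=0.8, logit_threshold=1.0):
    # Text-level agreement: ROUGE-L F-measure between the two generations.
    scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
    rouge_l = scorer.score(reference_text, candidate_text)["rougeL"].fmeasure

    # Numeric agreement: worst-case absolute difference between the logits.
    max_logit_diff = (reference_logits - candidate_logits).abs().max().item()

    # The situation described in this PR: the text matches, the logits do not.
    return {
        "rouge_l": rouge_l,
        "rouge_ok": rouge_l >= rouge_threshold,
        "max_logit_diff": max_logit_diff,
        "logits_ok": max_logit_diff <= logit_threshold,
    }
```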

@frreiss (Contributor) commented Dec 11, 2024

@janimo sorry, I just noticed that this PR exists. I submitted a similar PR yesterday with Granite integration code that came out of our work on a NeurIPS booth demo. My PR is at #1818, and it appears to have been merged already.

@merrymercy would you mind also crediting @janimo for Granite support in the release notes of the next version?

@janimo (Contributor, Author) commented Dec 11, 2024

@frreiss no problem. Apparently (I haven't tested it, since I've had no GPU access lately) my overlooking the need for a Granite-specific chat template could have been the cause of the large logit value differences. Possibly the MoE model would pass now too.
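
(Illustrative sketch of the suspected issue: prompting with and without the model's chat template produces different token sequences, which would explain large logit discrepancies even when the generated text ends up similar. The model ID and messages below are placeholders, and this assumes the Granite checkpoint ships a chat template in its tokenizer config.)

```python
# Sketch, assuming the Granite checkpoint provides a chat template in its
# tokenizer config (model ID and messages are placeholders).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ibm-granite/granite-3.0-8b-instruct")

messages = [{"role": "user", "content": "Write a haiku about rivers."}]

# Prompt with the Granite-specific chat template applied...
templated = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# ...versus the raw text the comparison may have been run with. Feeding the
# two implementations different token sequences would naturally produce very
# different logits even when the generated text looks similar.
raw = messages[0]["content"]

print(templated)
print(raw)
```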

@merrymercy (Contributor)

Sure. I will do that and close this PR for now.

@merrymercy closed this Dec 17, 2024