
[NVIDIA] Use the fast accumulation for FP8 matmul #35

Merged
1 commit merged into google:main on Nov 15, 2023

Conversation

@kaixih (Contributor) commented on Nov 2, 2023

As highlighted in this issue, this PR enables fast accumulation for the forward-pass (fprop) FP8 matmul, aligning with the changes Flax made in this PR. Note that this PR reuses the functions introduced in that Flax change, so please merge this pull request only after that one lands.

cc @wenscarl @nluehr
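
For context, here is a minimal sketch (not the code in this PR) of how an FP8 matmul can be steered toward the fast-accumulation path in JAX: the FP8 operands go through lax.dot_general with DEFAULT precision, which leaves XLA free to lower the forward-pass matmul to a fast-accumulation FP8 kernel on supported GPUs. The function name, shapes, and dtype choices below are hypothetical.

```python
# Illustrative sketch only: FP8 matmul with DEFAULT precision so XLA may pick
# the fast-accumulation path for the forward pass. Names and shapes are made up.
import jax
import jax.numpy as jnp
from jax import lax

def fp8_matmul_fast_accum(lhs, rhs):
  # Cast inputs to FP8 (e4m3) and request a wider output type. DEFAULT
  # precision (as opposed to HIGHEST) permits the fast-accumulation kernel.
  lhs_fp8 = lhs.astype(jnp.float8_e4m3fn)
  rhs_fp8 = rhs.astype(jnp.float8_e4m3fn)
  return lax.dot_general(
      lhs_fp8, rhs_fp8,
      dimension_numbers=(((1,), (0,)), ((), ())),  # plain 2D matmul
      precision=lax.Precision.DEFAULT,
      preferred_element_type=jnp.bfloat16,
  )

x = jnp.ones((128, 256), dtype=jnp.bfloat16)
w = jnp.ones((256, 512), dtype=jnp.bfloat16)
y = jax.jit(fp8_matmul_fast_accum)(x, w)
print(y.shape, y.dtype)  # (128, 512) bfloat16
```

On a Hopper-class GPU this can lower to the cuBLASLt FP8 matmul with fast accumulation; on other backends XLA falls back to an upcast/emulated path, so the numerics still go through but without the speedup.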

@zhangqiaorjc self-assigned this on Nov 8, 2023
@kaixih (Contributor, Author) commented on Nov 13, 2023

@zhangqiaorjc It seems the PR has been stuck in the "pull ready" status for a while. Can you take a look?

@kaixih (Contributor, Author) commented on Nov 15, 2023

Gentle ping @zhangqiaorjc

@copybara-service (bot) merged commit e7c8561 into google:main on Nov 15, 2023
3 checks passed