CK BF16 Gemm #2617

Closed

@jwfromm jwfromm commented May 21, 2024

Summary: Implementation of BF16 GEMM using the latest features from CK. Performance is comparable with hipBLAS but often a little worse. Detailed benchmarking can be found [here](https://docs.google.com/spreadsheets/d/10b9mRM6xCi1Iv-mRGkPjk37DfYQEDx3zU-EmV9t7f0s/edit?usp=sharing). For Llama shapes, there likely isn't much benefit to this kernel. However, it may be useful for less common shapes that hipBLAS struggles with, and in a few of the benchmarked cases the CK kernel is slightly faster.

Reviewed By: jianyuh

Differential Revision: D57292145

@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D57292145

netlify bot commented May 21, 2024

Deploy Preview for pytorch-fbgemm-docs ready!

Name Link
🔨 Latest commit 7afbfad
🔍 Latest deploy log https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/664f6b360edf6b000874209a
😎 Deploy Preview https://deploy-preview-2617--pytorch-fbgemm-docs.netlify.app

@jwfromm jwfromm force-pushed the export-D57292145 branch from aa46654 to 782e12d Compare May 21, 2024 21:40
jwfromm added a commit to jwfromm/FBGEMM that referenced this pull request May 21, 2024
Summary:

Implementation of BF16 GEMM using the latest features from CK. Performance is comparable with hipBLAS but often a little worse. Detailed benchmarking can be found [here](https://docs.google.com/spreadsheets/d/10b9mRM6xCi1Iv-mRGkPjk37DfYQEDx3zU-EmV9t7f0s/edit?usp=sharing). For Llama shapes, there likely isn't much benefit to this kernel. However, it may be useful for less common shapes that hipBLAS struggles with, and in a few of the benchmarked cases the CK kernel is slightly faster.

Reviewed By: jianyuh

Differential Revision: D57292145
@jwfromm jwfromm force-pushed the export-D57292145 branch from 782e12d to ae0625d Compare May 21, 2024 21:41
@jwfromm jwfromm force-pushed the export-D57292145 branch from ae0625d to 7afbfad Compare May 23, 2024 16:13
@facebook-github-bot

This pull request has been merged in 7930859.
