Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Matrix multiplication support #285

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

ivarflakstad
Copy link

@ivarflakstad ivarflakstad commented Oct 18, 2023

Edit: outdated, see https://github.com/ivarflakstad/metal-rs for current state.

Leaving open for visibility.

Hey! Gave this a go.

Output from running cargo run --release --example mps-matrix-multiplication --features=mps

Correctness:
[=================================================] ✅

Performance:
Generating input matrices: (f32 4096x4096 and f16 4096x4096)
Running with transpose left: false, transpose right: false, alpha: 1, beta: 0
[=================================================] ✅
Avg GFLOPS: 2714.017354808345
Total time: 2.534734497s
Avg time: 50.694689ms

Running with transpose left: true, transpose right: false, alpha: 1, beta: 0
[=================================================] ✅
Avg GFLOPS: 2544.2273256244875
Total time: 2.702094294s
Avg time: 54.041885ms

Running with transpose left: false, transpose right: true, alpha: 1, beta: 0
[=================================================] ✅
Avg GFLOPS: 2696.778482443825
Total time: 2.550065042s
Avg time: 51.0013ms

Running with transpose left: false, transpose right: false, alpha: 0.5, beta: 0
[=================================================] ✅
Avg GFLOPS: 2668.0019304379844
Total time: 2.577410417s
Avg time: 51.548208ms

Running with transpose left: false, transpose right: false, alpha: 1, beta: 0.5
[=================================================] ✅
Avg GFLOPS: 2667.1376704607605
Total time: 2.578379166s
Avg time: 51.567583ms

@ivarflakstad ivarflakstad marked this pull request as ready for review October 25, 2023 12:47
@ivarflakstad ivarflakstad force-pushed the mps-matrix-multiplication-kernel branch from 6ece864 to aed9b29 Compare October 26, 2023 12:39
@Narsil
Copy link

Narsil commented Nov 1, 2023

@cwfitzgerald (Sorry for the ping if you're not the correct person).

We're iterating quite a bit on this here -> https://github.com/ivarflakstad/metal-rs
We would love some feedback if upstreaming here makes sense or not.

@cwfitzgerald
Copy link
Member

I'm not really involved in metal-rs development outside of administration and releases, so not really sure what fits and what doesn't :)

@ivarflakstad
Copy link
Author

Fair enough 😊
Maybe you know who does?

@EdorianDark
Copy link

@kvark Do you know who might be involved with metal-rs?

@Narsil
Copy link

Narsil commented Nov 20, 2023

@Wumpf Maybe ?

After having matured a bit this and implemented the Matrix in https://github.com/huggingface/candle/ I feel like this PR should be trimmed down to only get the MPSMatrix support and remove the rest of the sugar (being more aligned with the rest of the crate).

Ofc having maintainer advice would be helpful.
Also in order to be able to publish on our side, we're likely going to fork metal-rs for the time being to have access to those objects.

We might return to metal-rs once we get rid of MPSMatrix dependency (if we can using metal-flash-attention, which seems faster, but currently isn't as widely portable as MPSMatrix since it doesn't work on first gen m1).
Either that or when this PR gets included.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants