[SYCL] Fix DMMV dequantization #9279

Merged
merged 4 commits into from
Sep 4, 2024

Collaborator

@OuadiElfarouki OuadiElfarouki commented Sep 2, 2024

The MUL_MAT cases in test-backend-ops currently fail on Intel GPUs for Q4_1, Q5_0, Q5_1 and Q8_0 due to a small edge case in the dequantize_mul_mat_vec kernel (specifically when ncols <= GGML_SYCL_DMMV_X).

This is a minor fix that stops the kernel from accessing out-of-bounds (extra) quant elements in the reduction step.

All unit tests pass with this fix.
Performance on Intel GPUs is almost unaffected.
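
To illustrate the kind of edge case described above, here is a minimal CPU-side sketch in plain C++, not the actual patch and not SYCL code. It assumes a simplified Q8_0-like block layout and a loop shape in which each emulated thread covers a fixed number of columns per iteration; `BlockQ8`, `dmmv_row`, `QK`, `WARP_SIZE` and `DMMV_X` are hypothetical stand-ins chosen for illustration (`DMMV_X` plays the role of GGML_SYCL_DMMV_X). When `ncols` is smaller than the per-iteration stride, some threads would index past the end of the row unless a guard is in place.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration only, not the real ggml definitions.
constexpr int QK        = 32;  // values per quantized block
constexpr int WARP_SIZE = 32;  // emulated threads cooperating on one row
constexpr int DMMV_X    = 32;  // plays the role of GGML_SYCL_DMMV_X

struct BlockQ8 {
    float  d;        // per-block scale
    int8_t qs[QK];   // quantized values
};

// Emulates one row of a dequantize-mul-mat-vec: every "thread" accumulates a
// partial dot product, then the partial sums are reduced into one value.
float dmmv_row(const std::vector<BlockQ8>& row, const std::vector<float>& y, int ncols) {
    constexpr int iter_stride   = 2 * DMMV_X;               // columns covered per iteration
    constexpr int vals_per_iter = iter_stride / WARP_SIZE;  // columns per thread per iteration

    float partial[WARP_SIZE] = {};
    for (int tid = 0; tid < WARP_SIZE; ++tid) {
        for (int i = 0; i < ncols; i += iter_stride) {
            for (int j = 0; j < vals_per_iter; ++j) {
                const int col = i + vals_per_iter * tid + j;
                // Guard: when ncols < iter_stride, threads with a high tid
                // would otherwise read quant elements past the end of the row.
                if (col >= ncols) {
                    break;
                }
                const BlockQ8& b = row[col / QK];
                partial[tid] += b.d * b.qs[col % QK] * y[col];  // dequantize and multiply
            }
        }
    }

    // Reduction step over the per-thread partial sums.
    float sum = 0.0f;
    for (float p : partial) {
        sum += p;
    }
    return sum;
}
```

With `ncols == DMMV_X` (32 here), the stride is 64, so without the guard the upper half of the emulated threads would read columns 32 to 63 of a 32-column row. A per-element check like this is the simplest form of the guard; restructuring the loop bounds so the check is not evaluated per element is another option when the check is costly.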

@github-actions github-actions bot added the SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language label Sep 2, 2024
@joeatodd
Collaborator

joeatodd commented Sep 3, 2024

I think given the perf implications of this bounds checking, we should dig a little deeper.

@OuadiElfarouki
Collaborator Author

@joeatodd Agreed.

@OuadiElfarouki
Collaborator Author

Updated the fix; performance is now preserved.

Collaborator

@joeatodd joeatodd left a comment

LGTM

Review thread on ggml/src/ggml-sycl/dmmv.cpp (outdated, resolved)
@OuadiElfarouki OuadiElfarouki merged commit 5910ea9 into ggerganov:master Sep 4, 2024
52 checks passed
dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this pull request Oct 29, 2024
Fixed dmmv dequant for ncols== GGML_SYCL_DMMV_X
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
Fixed dmmv dequant for ncols== GGML_SYCL_DMMV_X
3 participants