ggml : fix row condition for i8mm kernels #10561

Merged
merged 1 commit into from
Nov 28, 2024


ggerganov
Owner

@ggerganov ggerganov commented Nov 28, 2024

fix #10487 (comment)

In multi-threaded matrix multiplications, the total number of rows can be divisible by 2 (which is required by the I8MM Arm kernels) while the number of rows processed by an individual thread is odd. This can cause either out-of-bounds writes or incorrect results for ne[2]*ne[3] > 1.
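
To illustrate the condition described above, here is a minimal sketch in C. It is not the ggml code: the helper names `thread_row_range` and `rows_ok_for_two_row_kernel` are hypothetical, and the ceil-division split of rows across threads is an assumption made only for the example. The point is that the evenness check has to be applied to the rows a thread actually processes, not to the total row count.

```c
#include <stdbool.h>

/* Hypothetical helper (assumption, not ggml API): compute the half-open
 * row range [*start, *end) handled by thread `ith` out of `nth` threads,
 * using a simple ceil-division split of `nrows`. */
static void thread_row_range(int nrows, int ith, int nth, int *start, int *end) {
    int per_thread = (nrows + nth - 1) / nth; /* ceil(nrows / nth) */
    int s = per_thread * ith;
    if (s > nrows) {
        s = nrows; /* threads past the end get an empty range */
    }
    int e = s + per_thread;
    if (e > nrows) {
        e = nrows; /* clamp the last range to the matrix size */
    }
    *start = s;
    *end = e;
}

/* Hypothetical check: a kernel that consumes rows in pairs is only safe
 * when the number of rows in this thread's own range is even. */
static bool rows_ok_for_two_row_kernel(int start, int end) {
    return ((end - start) % 2) == 0;
}
```

With 6 rows split over 2 threads under this assumed scheme, each thread gets 3 rows: the total passes a divisible-by-2 check, but each per-thread range fails it, which is the case a paired-row kernel must not be used for.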

make -j && ./bin/test-backend-ops -o MUL_MAT -b CPU

@github-actions github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label Nov 28, 2024
@ggerganov
Owner Author

cc @chaxu01

@ggerganov ggerganov merged commit 76b27d2 into master Nov 28, 2024
57 checks passed
@ggerganov ggerganov deleted the gg/cpu-q4_0-i8mm-fix branch November 28, 2024 12:56
Labels
ggml changes relating to the ggml tensor library for machine learning
2 participants