ggml : fix row condition for i8mm kernels #10561

ggerganov · 2024-11-28T10:25:59Z

In multi-threaded matrix multiplications, the total number of rows can be divisible by 2 (as the I8MM Arm kernels require) while the number of rows processed by an individual thread is odd. This can cause either out-of-bounds writes or incorrect results when ne[2]*ne[3] > 1.
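To illustrate the condition described above, here is a minimal sketch in C. It is not the actual ggml code: the helper names (`thread_row_range`, `can_use_two_row_kernel_total`, `can_use_two_row_kernel_range`) and the simple ceil-division split are assumptions made for illustration only. It shows how a check on the total row count can pass while a single thread's slice is still odd.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical helper (not a ggml function): computes the half-open row
// range [*first, *last) assigned to thread `ith` out of `nth` threads,
// using a plain ceil-division split of `nr` rows.
static void thread_row_range(int64_t nr, int ith, int nth,
                             int64_t *first, int64_t *last) {
    assert(nth > 0 && ith >= 0 && ith < nth);
    const int64_t per_thread = (nr + nth - 1) / nth;  // rows per thread, rounded up
    const int64_t start = per_thread * ith;
    *first = start < nr ? start : nr;
    *last  = *first + per_thread < nr ? *first + per_thread : nr;
}

// Insufficient condition: looks only at the total number of rows.
static bool can_use_two_row_kernel_total(int64_t nr) {
    return nr % 2 == 0;
}

// Stricter condition: the slice handled by this thread must itself
// contain an even number of rows before a two-rows-at-a-time kernel
// is safe to use on it.
static bool can_use_two_row_kernel_range(int64_t first, int64_t last) {
    return (last - first) % 2 == 0;
}
```

With 6 rows and 2 threads, each thread receives 3 rows under this split: the total-rows check passes, but the per-thread check rejects the two-row path, so a one-row-at-a-time fallback would be needed for that slice.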

make -j && ./bin/test-backend-ops -o MUL_MAT -b CPU

ggml-ci

ggerganov · 2024-11-28T10:27:57Z

cc @chaxu01

ggml : fix row condition for i8mm kernels

2e752c4

ggml-ci

github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) label Nov 28, 2024

ggerganov requested a review from slaren November 28, 2024 10:26

ggerganov mentioned this pull request Nov 28, 2024

ggml : fix I8MM Q4_1 scaling factor conversion #10562

Merged

slaren approved these changes Nov 28, 2024

ggerganov merged commit 76b27d2 into master Nov 28, 2024
57 checks passed

ggerganov deleted the gg/cpu-q4_0-i8mm-fix branch November 28, 2024 12:56
