Commit
arm64: lse: Prefetch operands to speed up atomic operations
On a Kryo 485 CPU (semi-custom Cortex-A76 derivative) in a Snapdragon 855
(SM8150) SoC, switching from traditional LL/SC atomics to LSE causes LKDTM's
ATOMIC_TIMING test to regress by 2x:

LL/SC ATOMIC_TIMING:    34.14s 34.08s
LSE ATOMIC_TIMING:      70.84s 71.06s

Prefetching the target operands fixes the regression and makes LSE perform
better than LL/SC as expected:

LSE+prfm ATOMIC_TIMING: 21.36s 21.21s

"dd if=/dev/zero of=/dev/null count=10000000" also runs faster:
    LL/SC:  3.3 3.2 3.3 s
    LSE:    3.1 3.2 3.2 s
    LSE+p:  2.3 2.3 2.3 s

Commit 0ea366f applied the same change to LL/SC atomics, but it was never
ported to LSE.

Signed-off-by: Danny Lin <danny@kdrag0n.dev>
[Kazuki: Port to v5.4]
Signed-off-by: Kazuki Hashimoto <kazukih@tuta.io>
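
The idea described above, issuing a store prefetch on the target operand before the atomic instruction, can be sketched in portable userspace C. This is only an illustration, not the kernel patch: the helper name prefetched_fetch_add is hypothetical, and it relies on the GCC/Clang builtins __builtin_prefetch and __atomic_fetch_add rather than the kernel's inline assembly. On arm64 a compiler would typically lower the prefetch to a PRFM instruction, and the atomic to an LSE instruction when LSE is enabled (for example with -march=armv8.1-a), but the exact code generated is an assumption that depends on the toolchain and flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): prefetch the operand for
 * writing, then perform a relaxed atomic add and return the old value.
 */
static inline int64_t prefetched_fetch_add(int64_t *ptr, int64_t delta)
{
	assert(ptr != NULL);

	/* rw = 1: prepare the line for a store; locality = 0: streaming hint. */
	__builtin_prefetch(ptr, 1, 0);

	/* Atomic read-modify-write; returns the value before the add. */
	return __atomic_fetch_add(ptr, delta, __ATOMIC_RELAXED);
}
```

The prefetch is only a hint, so the helper returns the same result with or without it. Whether it helps is a property of the CPU, which is why the timings above are reported for one specific core.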