Investigate performance regression of ChaCha20 implementation #244

Closed
brycx opened this issue Oct 31, 2021 · 6 comments
Labels: bug (Something isn't working), investigation (Investigation task)

Comments

brycx (Member) commented Oct 31, 2021

The following changes in performance of ChaCha20 and related functions have been observed:

rustc 1.41.0 (5e1a79984 2020-01-27), orion 0.15.0:

  • ChaCha20Poly1305: Input 128KiB, Throughput 419.22 MiB/s
  • XChaCha20Poly1305: Input 128KiB, Throughput 419.53 MiB/s

rustc 1.56.0 (09c42c458 2021-10-18), orion 0.16.1:

  • ChaCha20Poly1305: Input 128KiB, Throughput 275.66 MiB/s
  • XChaCha20Poly1305: Input 128KiB, Throughput 275.19 MiB/s

The change in performance was only observed on an Intel(R) Core(TM) i7-4790K machine; no regression was found on the Raspberry Pi 2 Model B V1.1 that is also used for benchmarking. When running cargo bench with orion v0.15.0 and rustc 1.56.0 (09c42c458 2021-10-18), the same regression appears.
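
For reference, a minimal criterion-based sketch of roughly how such a throughput figure can be measured. This is not necessarily the bench harness used in the repository, and the key/nonce values are placeholders:

```rust
// Hypothetical bench sketch (criterion); not the repository's actual harness.
use criterion::{criterion_group, criterion_main, Criterion, Throughput};
use orion::hazardous::aead::chacha20poly1305::{seal, Nonce, SecretKey};

fn bench_seal_128kib(c: &mut Criterion) {
    let key = SecretKey::from_slice(&[0u8; 32]).unwrap(); // placeholder key
    let nonce = Nonce::from_slice(&[0u8; 12]).unwrap(); // placeholder nonce
    let input = vec![0u8; 128 * 1024]; // 128 KiB input, as in the numbers above
    let mut out = vec![0u8; input.len() + 16]; // ciphertext + 16-byte Poly1305 tag

    let mut group = c.benchmark_group("ChaCha20Poly1305");
    // Report results as throughput (MiB/s) instead of raw time per iteration.
    group.throughput(Throughput::Bytes(input.len() as u64));
    group.bench_function("seal 128KiB", |b| {
        b.iter(|| seal(&key, &nonce, &input, None, &mut out).unwrap())
    });
    group.finish();
}

criterion_group!(benches, bench_seal_128kib);
criterion_main!(benches);
```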

Inspecting git log, it doesn't seem like any major changes were introduced in the ChaCha20 implementation between these versions. This, together with the fact that the regression persists across different versions of Orion, suggests it is most likely caused by changes outside of this crate.

Further things to investigate:

  • Whether the regression persists across rustc versions 1.41.0 to 1.56.0 (i.e. a regression due to changes in rustc, perhaps LLVM upgrades)
  • Whether any changes in dependencies could have caused this
@brycx brycx added bug Something isn't working investigation Investigation task labels Oct 31, 2021
brycx (Member, Author) commented Nov 1, 2021

The BLAKE2b implementation uses the U64x4 type whereas ChaCha20 uses the U32x4 type. BLAKE2b has shown no performance regressions.

The assembly generated for U32x4 differs from that of U64x4. It matches U64x4 in size up to Rust 1.47.0, grows somewhat through 1.49.0 (inclusive), and then doubles in size starting with 1.50.0.

https://godbolt.org/z/s6r8hM6sv

Assembly bloat present in 1.58.0 as well.
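
For context, a minimal stand-in for the kind of lane type under discussion; this is a simplified, hypothetical reduction of the code behind the Godbolt link, not orion's exact internal type. ChaCha20's quarter-round is built from exactly these lane-wise add/xor/rotate operations, so their codegen largely determines throughput:

```rust
// Hypothetical, simplified stand-in for an orion-style U32x4 lane type.
// The question is whether rustc (via LLVM) vectorizes these lane-wise
// operations as well as it does for the analogous U64x4 used by BLAKE2b.
use core::ops::BitXor;

#[derive(Clone, Copy)]
pub struct U32x4(u32, u32, u32, u32);

impl U32x4 {
    // Lane-wise wrapping addition, one of the three quarter-round ops.
    pub fn wrapping_add(self, rhs: Self) -> Self {
        U32x4(
            self.0.wrapping_add(rhs.0),
            self.1.wrapping_add(rhs.1),
            self.2.wrapping_add(rhs.2),
            self.3.wrapping_add(rhs.3),
        )
    }

    // Lane-wise left rotation by a constant amount.
    pub fn rotate_left(self, n: u32) -> Self {
        U32x4(
            self.0.rotate_left(n),
            self.1.rotate_left(n),
            self.2.rotate_left(n),
            self.3.rotate_left(n),
        )
    }
}

impl BitXor for U32x4 {
    type Output = Self;

    // Lane-wise XOR, the remaining quarter-round op.
    fn bitxor(self, rhs: Self) -> Self {
        U32x4(self.0 ^ rhs.0, self.1 ^ rhs.1, self.2 ^ rhs.2, self.3 ^ rhs.3)
    }
}
```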

@brycx brycx added this to the 0.17.2 milestone Feb 2, 2022
@brycx brycx self-assigned this Feb 2, 2022
brycx (Member, Author) commented Feb 2, 2022

Reverting to an implementation based on 0.14.3 (https://github.com/orion-rs/orion/tree/chacha20-perf) brings throughput back to ~380 MiB/s.

@brycx brycx modified the milestones: 0.17.2, 0.17.3 Aug 16, 2022
brycx (Member, Author) commented Aug 19, 2022

Checking Godbolt, we are again getting the previous sizes of generated assembly for this unit (rustc 1.63.0): https://godbolt.org/z/fobdWGrM1.

Throughput should be measured again to verify that performance has been restored to acceptable levels. If that's the case, we can close this issue.

vlmutolo (Contributor) commented

Do we know if there's a rust-lang issue filed for this regression yet? If not, the Rust team may want to be aware of it. I'm not sure whether these kinds of regressions end up in a test suite somewhere.

brycx (Member, Author) commented Aug 21, 2022

I didn't see any when I looked, though it's been a while since I last checked. I'm not sure what specifically to report, but if you're interested in doing so, feel free.

brycx (Member, Author) commented Aug 21, 2022

I'm not able to reproduce the exact benchmarking setup originally used, but I'm now seeing 400-420 MiB/s throughput again, albeit with some noise on my current setup. I'd say this is back to near-original levels, so we can close this.

If it reappears or others report degraded performance, we can always reopen this issue.

@brycx brycx closed this as completed Aug 21, 2022