Use NonZeroU64 to optimize encoded_len_varint #1192

mzabaluev
Contributor

Give the compiler all the leverage to optimize encoded_len_varint:

  • Construct a NonZeroU64 to count leading zeros, as that can be faster on many platforms;
  • Use ilog2 instead of a handwritten expression to compute the base 2 logarithm, as the core library developers and the compiler are probably in the best position to fine-tune it for all supported platforms.
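
For readers following along, here is a minimal sketch of the idea described in the two points above. It is not copied from the merged commit: the function names are made up for illustration, the constants are one way to map the bit length to a byte count (a varint carries 7 payload bits per byte), and the sketch uses the checked `NonZeroU64::new` constructor, which may differ from what the actual change does. A naive loop is included only as a reference to compare against.

```rust
use std::num::NonZeroU64;

/// Hypothetical sketch: number of bytes needed to encode `value` as a varint.
///
/// `value | 1` is never zero, so the `NonZeroU64` constructor always succeeds
/// and the zero case is excluded before the logarithm is taken. Setting the
/// lowest bit does not change the result, because 0 and 1 both need one byte.
pub fn encoded_len_varint_sketch(value: u64) -> usize {
    let nonzero = NonZeroU64::new(value | 1).expect("value | 1 is never zero");
    // Index of the highest set bit, in the range 0..=63.
    let log2 = nonzero.ilog2();
    // Equivalent to ceil((log2 + 1) / 7) over that range, without a division by 7.
    ((log2 * 9 + (64 + 9)) / 64) as usize
}

/// Reference implementation for comparison: count 7-bit groups one at a time.
pub fn encoded_len_varint_naive(mut value: u64) -> usize {
    let mut len = 1;
    while value >= 0x80 {
        value >>= 7;
        len += 1;
    }
    len
}
```

Both functions should agree on every input; the boundaries worth checking are the values just below and at each power of 2^7, plus 0 and `u64::MAX`.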

With the varint benchmarks, I see slight improvements (1-3%) or no reproducible performance changes on 11th gen Intel Core and Mac M3.

The leading zeros count may perform better on many architectures
when the zero case is excluded.
Also use ilog2 as shorthand for the leading zeros trick because
it makes clearer what we mean to compute, and should ideally be
well optimized by the compiler.
Collaborator

@caspermeijn caspermeijn left a comment

Thank you for your contribution

@caspermeijn caspermeijn added this pull request to the merge queue Nov 25, 2024
Merged via the queue into tokio-rs:master with commit 5ae30c5 Nov 25, 2024
16 checks passed