[mono] unify and vectorize implementation of decode_value metadata API #100048

Closed
kg wants to merge 9 commits

kg commented Mar 20, 2024

We have multiple copies of the same decode_value function spread across the mono runtime. This PR unifies them all into a single implementation under metadata/, and then vectorizes it using clang vector builtins.
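
For readers unfamiliar with the function, here is a minimal scalar sketch of this kind of decoder, assuming the ECMA-335 compressed unsigned integer layout (1-, 2-, and 4-byte forms). It is not the code from this PR: the name `sketch_decode_value` and its signature are hypothetical, and the PR's reference to 5-byte sequences suggests the unified version may handle a longer form that is not modeled here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical scalar decoder for an ECMA-335 style compressed unsigned
 * integer. Illustrative only; names and signature are assumptions.
 * Stores the number of bytes consumed in *consumed when it is non-NULL.
 */
static uint32_t
sketch_decode_value (const uint8_t *ptr, size_t *consumed)
{
    uint8_t b = ptr [0];
    uint32_t value;
    size_t len;

    if ((b & 0x80) == 0) {
        /* 0xxxxxxx: 7-bit value stored in one byte */
        value = b;
        len = 1;
    } else if ((b & 0x40) == 0) {
        /* 10xxxxxx: 14-bit value stored in two bytes, big-endian */
        value = ((uint32_t)(b & 0x3f) << 8) | ptr [1];
        len = 2;
    } else {
        /* 11xxxxxx: low 29 bits of four bytes, big-endian */
        value = ((uint32_t)(b & 0x1f) << 24) |
                ((uint32_t)ptr [1] << 16) |
                ((uint32_t)ptr [2] << 8) |
                ptr [3];
        len = 4;
    }

    if (consumed)
        *consumed = len;
    return value;
}
```

The branch on the top bits of the first byte is what a vectorized version would have to replace with a wide load followed by masking and shifting.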

A basic benchmark on x64 using clang -O3 showed a 13% time reduction, and the generated wasm for this is pretty efficient, so I'm hoping it will be a small startup time win for any target where we enable it. It's hard to actually measure in practice locally though...

I verified that the vectorized implementation works correctly by comparing its output against the scalar version for all possible 5-byte sequences, so this should be a safe switch-over as long as there aren't endianness issues.
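
The exhaustive comparison described above can be sketched roughly as follows. This is an assumption-laden illustration, not the harness used for the PR: `ref_decode`, `alt_decode`, and `count_mismatches` are hypothetical names, the candidate is a table-driven scalar stand-in rather than the real vectorized code, and only the 1-, 2-, and 4-byte forms are covered.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*decode_fn) (const uint8_t *ptr, size_t *consumed);

/* Reference: plain branchy decoder (assumed 1-, 2-, and 4-byte forms). */
static uint32_t
ref_decode (const uint8_t *p, size_t *consumed)
{
    if ((p [0] & 0x80) == 0) { *consumed = 1; return p [0]; }
    if ((p [0] & 0x40) == 0) {
        *consumed = 2;
        return ((uint32_t)(p [0] & 0x3f) << 8) | p [1];
    }
    *consumed = 4;
    return ((uint32_t)(p [0] & 0x1f) << 24) | ((uint32_t)p [1] << 16) |
           ((uint32_t)p [2] << 8) | p [3];
}

/* Candidate: always loads 4 bytes, then shifts and masks via lookup tables. */
static uint32_t
alt_decode (const uint8_t *p, size_t *consumed)
{
    /* Both tables are indexed by the top two bits of the first byte. */
    static const uint8_t lengths [4] = { 1, 1, 2, 4 };
    static const uint32_t masks [4] = { 0x7f, 0x7f, 0x3fff, 0x1fffffff };
    uint32_t word = ((uint32_t)p [0] << 24) | ((uint32_t)p [1] << 16) |
                    ((uint32_t)p [2] << 8) | p [3];
    unsigned kind = p [0] >> 6;
    size_t len = lengths [kind];

    *consumed = len;
    return (word >> (8 * (4 - len))) & masks [kind];
}

/*
 * Runs both decoders over every combination of the first `varying` bytes
 * (1..4) of a 4-byte buffer; remaining bytes are set to `fill`.
 * Returns how many inputs produced a different value or length.
 */
static uint64_t
count_mismatches (decode_fn expected, decode_fn actual,
                  unsigned varying, uint8_t fill)
{
    if (varying < 1) varying = 1;
    if (varying > 4) varying = 4;
    uint64_t total = 1ull << (8 * varying);
    uint64_t mismatches = 0;

    for (uint64_t i = 0; i < total; i++) {
        uint8_t buf [4] = { fill, fill, fill, fill };
        for (unsigned k = 0; k < varying; k++)
            buf [k] = (uint8_t)(i >> (8 * (varying - 1 - k)));

        size_t n1 = 0, n2 = 0;
        uint32_t v1 = expected (buf, &n1);
        uint32_t v2 = actual (buf, &n2);
        if (v1 != v2 || n1 != n2)
            mismatches++;
    }
    return mismatches;
}
```

Calling `count_mismatches (ref_decode, alt_decode, 4, 0)` covers all 2^32 four-byte inputs; smaller `varying` values give a quicker smoke test.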

Pending stuff to fix for this PR:

  • Enable SIMD for metadata on wasm the correct way
  • Make certain that it's actually safe to swap all the other copies of the algorithm over to this one (at least one appears to have contained a bug)
  • Is it safe to read past the end of the data on other targets? POSIX rounds mmap up to page-sized units, and the extra bytes past the end of the file are zeroes, so it might be?
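
One conservative way to sidestep the overrun question is to avoid the wide read near the end of the buffer altogether. The sketch below is only an illustration of that idea under stated assumptions, not something this PR does: `sketch_load_guarded`, `SKETCH_LOAD_WIDTH`, and the explicit `end` pointer are all hypothetical, and the caller must guarantee `ptr <= end`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_LOAD_WIDTH 4

/*
 * Loads up to SKETCH_LOAD_WIDTH bytes starting at ptr as a big-endian word
 * without touching memory at or past `end`. Bytes that are not available
 * are treated as zero, mirroring the zero-filled tail of a mapped page.
 */
static uint32_t
sketch_load_guarded (const uint8_t *ptr, const uint8_t *end)
{
    uint8_t tmp [SKETCH_LOAD_WIDTH] = { 0 };
    size_t avail = (size_t)(end - ptr);

    if (avail > SKETCH_LOAD_WIDTH)
        avail = SKETCH_LOAD_WIDTH;
    /* Copy only the bytes that really exist; the rest of tmp stays zero. */
    memcpy (tmp, ptr, avail);

    return ((uint32_t)tmp [0] << 24) | ((uint32_t)tmp [1] << 16) |
           ((uint32_t)tmp [2] << 8) | tmp [3];
}
```

A decoder could take a wide fast path whenever `end - ptr >= SKETCH_LOAD_WIDTH` and fall back to a guarded load like this only for the last few bytes, at the cost of one extra comparison per call.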

kg added the NO-MERGE (The PR is not ready for merge yet; see discussion for detailed reasons) and runtime-mono (specific to the Mono runtime) labels Mar 20, 2024

kg commented Mar 20, 2024

cc @radekdoulik re our earlier conversation about simd in other parts of the runtime

Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it.

kg reopened this Oct 10, 2024

kg commented Oct 10, 2024

@lewing are there plans to enable SIMD in the wasm runtime for .NET 10? That would be needed to land this PR, and it would also make simdhash much faster.


lewing commented Oct 10, 2024

Do you have numbers for how much faster? If the gain is large enough, there are a few options we could explore.


kg commented Oct 10, 2024

> Do you have numbers for how much faster? If the gain is large enough, there are a few options we could explore.

IIRC, when I tested it, this PR was roughly a 5-10% improvement to decode_value. For simdhash, it makes lookups something like 10-40% faster depending on a bunch of factors, with the caveat that hashtable ops are only 10-15% of the time spent in a startup profile, so we'd be looking at something like a 1-4% actual savings.
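
The estimate above is an Amdahl's-law style calculation: a local speedup only pays off in proportion to the share of total time it touches. The sketch below reproduces that arithmetic under one assumption that the comment does not spell out, namely that "N% faster" means a throughput factor of 1 + N/100. The function name `sketch_overall_saving` is hypothetical.

```c
/*
 * fraction: share of startup time spent in the optimized code (0..1).
 * speedup:  throughput factor of the optimized code, e.g. 1.40 for
 *           "40% faster" (this interpretation is an assumption).
 * Returns the fraction of total startup time saved.
 */
static double
sketch_overall_saving (double fraction, double speedup)
{
    /* The optimized part shrinks from `fraction` to `fraction / speedup`. */
    return fraction * (1.0 - 1.0 / speedup);
}
```

Under that reading, `sketch_overall_saving (0.10, 1.10)` is about 0.009 and `sketch_overall_saving (0.15, 1.40)` is about 0.043, that is roughly 0.9% to 4.3% of startup time, in line with the 1-4% figure quoted.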

Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it.

Labels: area-VM-meta-mono, runtime-mono (specific to the Mono runtime)