(Question) Performance on SIMD #391
Another one in Haskell: https://github.com/chadbrewbaker/endoscope/blob/master/src/Endoscope.hs (janky, and it shells out to z3py for some things, but it is generic over all semigroups). My advice is to keep things 32-bit in the generic case for now, so you can bit-blast the full 4 GB range of a reference implementation and a candidate implementation to prove correctness. As I understand Mojo, you might be able to pick up some of Chris Lattner's MLIR tricks for free if you can target his MLIR. A lot of the Mojo tricks are just about making the runtime ergonomic for benchmarking, such as tuning buffer sizes for a new architecture.
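To make the "reference vs. candidate" advice concrete, here is a minimal sketch of an exhaustive equivalence check. Everything in it is an assumption for illustration: the popcount example and the names `reference_popcount`, `candidate_popcount`, and `first_mismatch` are hypothetical and do not come from endoscope or from this project. It sweeps only 16 bits so that it finishes quickly in Python; covering the full 32-bit range would more realistically be done in a compiled language or with an SMT solver.

```python
from typing import Callable, Optional


def reference_popcount(x: int) -> int:
    """Obviously correct reference: count set bits one at a time."""
    count = 0
    while x:
        count += x & 1
        x >>= 1
    return count


def candidate_popcount(x: int) -> int:
    """Candidate: branch-free bit trick for 32-bit inputs (illustrative choice)."""
    x = x - ((x >> 1) & 0x55555555)
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333)
    x = (x + (x >> 4)) & 0x0F0F0F0F
    # Python ints are unbounded, so mask the product back to 32 bits.
    return ((x * 0x01010101) & 0xFFFFFFFF) >> 24


def first_mismatch(
    reference: Callable[[int], int],
    candidate: Callable[[int], int],
    bits: int,
) -> Optional[int]:
    """Try every input of the given width; return the first disagreement, if any."""
    for x in range(1 << bits):
        if reference(x) != candidate(x):
            return x
    return None


if __name__ == "__main__":
    # 16 bits keeps the sweep fast in Python; the idea is the same for 32 bits.
    print(first_mismatch(reference_popcount, candidate_popcount, 16))
```

The point of keeping the input width small is that the check is a proof over that width rather than a sample: if `first_mismatch` returns `None`, the two implementations agree on every input of that size.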
For many problems you have to use cache-oblivious algorithms and succinct data structures. The compiler can only get you so far.
As for sequential SIMD operations, I haven't thought about them yet; maybe Victor has some ideas. It will most likely be done at some point, though.
@developedby It's great to know that you aim to improve single-thread performance! I saw the benchmark on the website, and the performance is only good with many GPU cores.
How do you plan to optimize serial processing on SIMD?
Another thing: how does your project get around this law?