[FR] 90% speed up by refactoring and optimizing some code #385

genivia-inc · 2024-04-24T20:00:45Z

ugrep can run faster if the search logic is refactored: break up the large code block in advance() into separate functions that are selected up front, e.g. by a switch or function pointer, so that conditionals are skipped in the hot path. Breaking up this large function also helps the compiler a lot, because it optimizes small functions better than one large function body that it has to analyze as a whole.
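To illustrate the idea, here is a minimal sketch of selecting a specialized routine once through a function pointer instead of re-testing conditions inside one large function. This is not ugrep's actual code: `Searcher`, `advance_one`, `advance_two`, `make_searcher` and `count_candidates` are hypothetical names made up for this example, and the specializations only look for the first one or two bytes of a pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical matcher state; the names are illustrative, not ugrep's.
struct Searcher {
  // Specialized routine, chosen once before the search loop starts.
  const char *(*advance)(const Searcher&, const char *buf, const char *end);
  char needle[2]; // first one or two bytes of the pattern
};

// Specialization 1: only the first pattern byte is used.
static const char *advance_one(const Searcher& s, const char *buf, const char *end)
{
  return static_cast<const char*>(std::memchr(buf, s.needle[0], end - buf));
}

// Specialization 2: the first two pattern bytes are used.
static const char *advance_two(const Searcher& s, const char *buf, const char *end)
{
  while (buf + 1 < end) {
    // Search for the first byte, leaving room for the second byte.
    buf = static_cast<const char*>(std::memchr(buf, s.needle[0], end - buf - 1));
    if (buf == nullptr)
      return nullptr;
    if (buf[1] == s.needle[1])
      return buf;
    ++buf;
  }
  return nullptr;
}

// Pick the specialization from the pattern's characteristics, once.
Searcher make_searcher(const char *pattern, std::size_t len)
{
  assert(len >= 1);
  Searcher s{};
  s.needle[0] = pattern[0];
  if (len >= 2) {
    s.needle[1] = pattern[1];
    s.advance = advance_two;
  } else {
    s.advance = advance_one;
  }
  return s;
}

// The hot loop has no pattern-dependent branches, only an indirect call.
std::size_t count_candidates(const Searcher& s, const char *buf, std::size_t len)
{
  const char *end = buf + len;
  std::size_t n = 0;
  for (const char *p = s.advance(s, buf, end); p != nullptr; p = s.advance(s, p + 1, end))
    ++n;
  return n;
}
```

Each specialization is a small function that the compiler can analyze on its own, which is the point made above. Whether a function pointer or a switch is the better dispatch mechanism is something to measure rather than assume.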

A bit of experimentation shows significant speed improvements are attainable on ARM64 NEON at least. So it is worth the effort to refactor this code that is not fully optimized by the compiler.

Even adding a dummy printf() statement makes the code run faster (!) despite the overhead of IO. So yeah, compiler optimizations aren't kicking in as much as I want them to at the moment. On a more serious note, this is not new to me. I taught several years of graduate-level high-performance computing. I will follow (my own) advice more closely in the next release cycles. It's just work, not difficult to do.

With these optimizations and omitting line counting when possible, such as for option -c, when searching a 13GB file we can go from

$ time ugrep -c rol en.txt
1171415
        4.54 real         2.86 user         1.40 sys

to a much lower timing

$ time ugrep -c rol en.txt
1171415
        2.40 real         0.83 user         1.39 sys

which runs about 90% faster on AArch64/NEON (4.54 s / 2.40 s ≈ 1.89× in real time). Other search options will see anywhere from a 20% to a 100% speedup on AArch64/NEON. Because the compiler's register allocation, instruction scheduling and alias analysis are improved, I expect these changes will also speed up searching with SSE2/AVX2. A quick test confirms this: the same runs on Intel macOS give a 15% speedup, and a 90% speedup when searching for the word "the".

Now I have to find time to work on this. Stay tuned!


genivia-inc · 2024-04-29T19:11:05Z

OK, implemented and mostly tested over the weekend. Still some work to do. The executable is not larger, but faster. This update will be a lot faster on ARM devices that support NEON and AArch64.

updated SIMD algorithms
improved selection and specialization based on pattern characteristics
faster line counting; NEON/AArch64 in particular is now super fast thanks to new vector code that I came up with, including a fast alternative to vaddvq_s8 for horizontal vector addition on NEON
fixed an obscure pattern match bug that I found today while testing with a large generative test set I wrote some time ago to hit ugrep hard (that's how I found a bug in rg, which I mention in one of my articles)
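For readers unfamiliar with vectorized line counting, here is a portable sketch of the general idea: compare many bytes against '\n' at once, then sum the per-byte hits with a horizontal addition. This is not the NEON code described above, which is not shown in this thread. It is a plain C++ word-at-a-time (SWAR) analogue that handles 8 bytes per step, and `count_newlines` is a hypothetical name chosen for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Count '\n' bytes in buf[0..len), 8 bytes at a time.
std::size_t count_newlines(const char *buf, std::size_t len)
{
  const std::uint64_t ones = 0x0101010101010101ULL;
  const std::uint64_t low7 = 0x7F7F7F7F7F7F7F7FULL;
  std::size_t n = 0;
  std::size_t i = 0;
  for (; i + 8 <= len; i += 8) {
    std::uint64_t w;
    std::memcpy(&w, buf + i, 8);           // safe unaligned load
    std::uint64_t x = w ^ (ones * '\n');   // bytes equal to '\n' become 0x00
    // Exact zero-byte test: 0x80 in every byte of x that is zero, else 0x00.
    std::uint64_t zero = ~(((x & low7) + low7) | x | low7);
    // "Horizontal add": the multiply sums the eight 0/1 bytes into the top byte.
    n += static_cast<std::size_t>(((zero >> 7) * ones) >> 56);
  }
  for (; i < len; ++i)                     // remaining tail bytes
    n += (buf[i] == '\n');
  return n;
}
```

The multiply in the loop plays the role that a horizontal vector addition plays in SIMD code: it reduces the per-byte match flags to a single count. A real NEON or AVX2 version would process 16 or 32 bytes per step and would typically accumulate several compare results before reducing them.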

All should be ready by next week to release 6.0.

genivia-inc · 2024-05-06T20:16:49Z

The ugrep 6.0 benchmarks are already posted: https://github.com/Genivia/ugrep-benchmarks

This shows that ugrep is one of the fastest grep tools, if not the fastest. Please note that no grep can (or should) claim to always be the fastest, because the different algorithms involved each have their pros and cons.

Ugrep 6.0 will be released soon!

genivia-inc added the enhancement New feature or request label Apr 24, 2024

genivia-inc changed the title ~~[FR] Speeding search up by refactoring some code~~ [FR] Speed up search by refactoring some code Apr 25, 2024

genivia-inc changed the title ~~[FR] Speed up search by refactoring some code~~ [FR] 90% speed up by refactoring and optimizing some code Apr 25, 2024

genivia-inc closed this as completed May 6, 2024
