Hi!
Since the README mentions a lot of performance-oriented things, I decided to test one compiler optimization, Profile-Guided Optimization (PGO), on genson-rs. I have already tested PGO on various projects with positive results (all benchmarks are collected here: https://github.com/zamazan4ik/awesome-pgo), so here are the benchmark results for genson-rs.
Test environment
Fedora 39
Linux kernel 6.8.7
AMD Ryzen 9 5900X
48 GiB RAM
SSD Samsung 980 Pro 2 TiB
Compiler - Rustc 1.78
Project version: the latest main branch at the time of testing, commit 67afe6d3ad8d10affb65b251694ca7b52b978769
Disabled Turbo boost
Benchmark
For benchmarking, I use the benchmarks built into the project. For PGO, I use the cargo-pgo tool. I got the Release results with the taskset -c 0 cargo bench command. The PGO training phase is done with taskset -c 0 cargo pgo bench, and the PGO optimization phase with taskset -c 0 cargo pgo optimize bench.
All measurements are done on the same machine, with the same background "noise" (as much as I can guarantee). taskset -c 0 pins each run to a single CPU core to reduce OS scheduler "noise".
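The three benchmark runs described above can be collected into one small script. This is only a sketch: the `run` helper and the `RUN` variable are my own additions for illustration (they are not part of cargo-pgo), and the script assumes cargo-pgo is already installed. By default it only prints the commands so they can be reviewed first; set `RUN=1` to actually execute them from the project root.

```shell
# Hypothetical helper: print the command, and execute it only when RUN=1.
run() {
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  fi
}

# 1. Baseline: regular Release benchmarks, pinned to CPU core 0.
run taskset -c 0 cargo bench

# 2. PGO training: run the benchmarks with an instrumented build
#    to collect profiles.
run taskset -c 0 cargo pgo bench

# 3. PGO optimization: rebuild with the collected profiles and
#    run the benchmarks again.
run taskset -c 0 cargo pgo optimize bench
```

Comparing the output of steps 1 and 3 gives the Release vs. PGO numbers.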
Results
According to the results, PGO measurably improves the tool's performance, at least in the benchmarks above.
Further steps
I can suggest the following action points:
Perform more PGO benchmarks with other test files. If they show improvements, add a note to the documentation (the README file?) about the possible performance gains from PGO.
Optimize prebuilt binaries with PGO (if there are any). As a training set, you can gather multiple real-life files, train PGO on them, and deliver PGO-optimized binaries to users.
Consider enabling Link-Time Optimization (LTO) for the tool. It can help improve performance and reduce the binary size.
Testing Post-Link Optimization techniques (like LLVM BOLT) would be interesting too (Clang and Rustc already use BOLT in addition to PGO), but I recommend starting with regular PGO.
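As a rough sketch of how the prebuilt-binary and LTO points above could look in practice with cargo-pgo: the `training-data` directory, the binary name `genson-rs`, the target triple in the path, and the way the tool is invoked on a file are all assumptions on my side, so adjust them to the real project layout. The `run` helper and the `RUN` variable are also hypothetical. The script only prints the commands unless `RUN=1` is set.

```shell
# Hypothetical helper: print the command, and execute it only when RUN=1.
run() {
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  fi
}

# Assumed locations: a directory with real-life JSON files and the
# instrumented binary produced by cargo-pgo for this target triple.
TRAIN_DIR="${TRAIN_DIR:-training-data}"
BIN="${BIN:-target/x86_64-unknown-linux-gnu/release/genson-rs}"

# 1. Build an instrumented binary.
run cargo pgo build

# 2. Run the instrumented binary on every training file to collect profiles.
for f in "$TRAIN_DIR"/*.json; do
  [ -e "$f" ] || continue   # skip when the glob matches nothing
  run "$BIN" "$f"
done

# 3. Rebuild the binary using the collected profiles.
run cargo pgo optimize

# Separately: a Release build with fat LTO, enabled through Cargo's
# profile environment variable instead of editing Cargo.toml.
run env CARGO_PROFILE_RELEASE_LTO=fat cargo build --release
```

Setting `CARGO_PROFILE_RELEASE_LTO=fat` should be equivalent to putting `lto = "fat"` into the `[profile.release]` section of Cargo.toml, which is probably the better choice if LTO is enabled permanently.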
I would be happy to answer your questions about PGO.
P.S. I created an Issue since Discussions are disabled for the repo. Since this is not an issue but an improvement idea, Discussions would probably be a better place for such things.