Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Optimize position search in error path
Translating index into a line/column pair takes considerable time. Notably, the JSON benchmark modified to run on malformed data spends around 50% of the CPU time generating the error object. While it is generally assumed that the cold path is quite slow, such a drastic pessimization may be unexpected, especially when a faster implementation exists. Using vectorized routines provided by the memchr crate increases performance of the failure path by 2x on average. Old implementation: DOM STRUCT data/canada.json 122 MB/s 168 MB/s data/citm_catalog.json 135 MB/s 195 MB/s data/twitter.json 142 MB/s 226 MB/s New implementation: DOM STRUCT data/canada.json 216 MB/s 376 MB/s data/citm_catalog.json 238 MB/s 736 MB/s data/twitter.json 210 MB/s 492 MB/s In comparison, the performance of the happy path is: DOM STRUCT data/canada.json 283 MB/s 416 MB/s data/citm_catalog.json 429 MB/s 864 MB/s data/twitter.json 275 MB/s 541 MB/s While this introduces a new dependency, memchr is much faster to compile than serde, so compile time does not increase significantly. Additionally, memchr provides a more efficient SWAR-based implementation of both the memchr and count routines even without std, providing benefits for embedded uses as well.
- Loading branch information