Reduce external sequence producer API overhead by 25% #3471
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Adds a cctxParam to disable repcode search during external sequence parsing (only in explicit block delim mode for now, since that's where we currently care about parsing speed). For external matchfinders which don't explicitly search for repcode matches, this sacrifices less than 1% compression ratio on silesia.tar.
In general, the compression ratio trade-off is matchfinder- and data-dependent. Users should benchmark against their own data to determine if the trade-off is worth it. I have enabled by default below compression level 10, because the speed improvement is so great that I imagine few practical use-cases would gain enough ratio from disabling to justify disabling.
In the future, we might be able to use SIMD to run repcode search much faster, and change the trade-off such that enabling at low levels makes sense.
Overall external matchfinder API overhead (non-external-matchfinder compression CPU) is currently about 50% inside
ZSTD_copySequencesToSeqStoreExplicitBlockDelim()
, so this PR reduces overall overhead by about 25% (see perf numbers below).Before:
After:
cc @daweiq @GarenJian-Intel