Reduce external sequence producer API overhead by 25% #3471

Merged 2 commits on Feb 2, 2023

@embg (Contributor) commented Jan 31, 2023

Adds a cctxParam to disable repcode search during external sequence parsing (only in explicit block delim mode for now, since that's where we currently care about parsing speed). For external matchfinders which don't explicitly search for repcode matches, this sacrifices less than 1% compression ratio on silesia.tar.
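To illustrate what gets skipped, here is a minimal sketch of repcode search over a list of externally produced sequences. It is a simplified model, not zstd's implementation: `DemoSeq`, `demo_findRepcode`, `demo_updateReps`, and `demo_countRepcodes` are hypothetical names, and the special handling zstd applies when a sequence has zero literals is deliberately omitted.

```c
#include <stddef.h>

/* Hypothetical, simplified sequence record (not zstd's ZSTD_Sequence). */
typedef struct {
    unsigned offset;      /* match distance */
    unsigned litLength;   /* literals preceding the match (unused here) */
    unsigned matchLength; /* length of the match (unused here) */
} DemoSeq;

#define DEMO_NUM_REPS 3

/* Returns 1..3 if `offset` equals one of the three most recent offsets,
 * or 0 if it is not a repeat offset. */
static unsigned demo_findRepcode(const unsigned rep[DEMO_NUM_REPS], unsigned offset)
{
    for (unsigned i = 0; i < DEMO_NUM_REPS; i++) {
        if (rep[i] == offset) return i + 1;
    }
    return 0;
}

/* Move-to-front update of the offset history.
 * repIdx is the value returned by demo_findRepcode (0 means "new offset"). */
static void demo_updateReps(unsigned rep[DEMO_NUM_REPS], unsigned offset, unsigned repIdx)
{
    unsigned last = repIdx ? repIdx - 1 : DEMO_NUM_REPS - 1;
    for (unsigned i = last; i > 0; i--) rep[i] = rep[i - 1];
    rep[0] = offset;
}

/* Counts how many sequences could be coded as repcodes.
 * searchReps == 0 models the fast path: every offset is treated as new,
 * so the per-sequence comparisons are skipped entirely. */
static size_t demo_countRepcodes(const DemoSeq* seqs, size_t n, int searchReps)
{
    unsigned rep[DEMO_NUM_REPS] = { 1, 4, 8 }; /* initial history, as in the zstd format */
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned idx = searchReps ? demo_findRepcode(rep, seqs[i].offset) : 0;
        if (idx) hits++;
        demo_updateReps(rep, seqs[i].offset, idx);
    }
    return hits;
}
```

With the search on, each sequence costs up to three comparisons plus a branch. With it off, repeated offsets are emitted as ordinary offsets, which is where the small ratio loss comes from.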

In general, the compression ratio trade-off is matchfinder- and data-dependent. Users should benchmark against their own data to determine whether the trade-off is worth it. I have made skipping repcode search the default below compression level 10, because the speed improvement is so great that I imagine few practical use cases would gain enough ratio from keeping repcode search on to justify its cost.

In the future, we might be able to use SIMD to run repcode search much faster, which would change the trade-off such that enabling repcode search at low levels makes sense.

About 50% of the overall external matchfinder API overhead (that is, compression CPU time spent outside the external matchfinder itself) is currently spent inside ZSTD_copySequencesToSeqStoreExplicitBlockDelim(), so this PR reduces overall overhead by about 25% (see perf numbers below).
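The relationship between the two percentages can be written out as a rough back-of-the-envelope estimate. The symbols are illustrative notation, not taken from the profiler output: $f$ is the fraction of API overhead spent in the parsing function and $r$ is the fraction of that function's time removed by skipping repcode search.

```latex
\[
\text{overall reduction} \approx f \cdot r,
\qquad
f \approx 0.5,\quad f \cdot r \approx 0.25
\;\Longrightarrow\;
r \approx \frac{0.25}{0.5} = 0.5
\]
```

In other words, the stated figures imply that the parsing function's own cost is roughly halved. This is inferred from the two percentages rather than measured separately.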

Before:
[Image: perf screenshot, 2023-01-31 4:11:35 PM]

After:
[Image: perf screenshot, 2023-01-31 4:12:56 PM]

cc @daweiq @GarenJian-Intel

@embg merged commit 31e41b3 into facebook:dev on Feb 2, 2023
@embg changed the title from "Reduce external matchfinder API overhead by 25%" to "Reduce external sequence producer API overhead by 25%" on Feb 8, 2023