Fix lowLimit underflow in overflow correction #1957

Merged: 2 commits, Jan 18, 2020

@terrelln (Contributor) commented Jan 17, 2020

After PR #1624 we no longer update lowLimit on every block. That means lowLimit is only updated when the round buffer overlaps in single-threaded mode, or when we start a new job in multithreaded mode. After that change (and possibly before it too), lowLimit can underflow during overflow correction. If lowLimit underflows, then all matches are deemed out of bounds for the remainder of compression, so the compression ratio plummets.
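To illustrate the failure mode, here is a minimal sketch. It assumes indices are 32-bit unsigned values and that a match is accepted only when its index is at least lowLimit. The helper names `apply_correction_naive` and `match_in_bounds` are hypothetical and are not zstd functions.

```c
#include <stdint.h>

/* Hypothetical helper: subtracts the overflow correction with no guard.
 * If correction > lowLimit, the unsigned subtraction wraps around. */
static uint32_t apply_correction_naive(uint32_t lowLimit, uint32_t correction) {
    return lowLimit - correction;
}

/* Hypothetical bounds check: a match is usable only at or above lowLimit. */
static int match_in_bounds(uint32_t matchIndex, uint32_t lowLimit) {
    return matchIndex >= lowLimit;
}
```

With a lowLimit of 100 and a correction of 1000, the result wraps to a value close to the 32-bit maximum, so ordinary match indices fail the bounds check.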

This PR fixes the problem by ensuring lowLimit never underflows. Where the subtraction would underflow, we set lowLimit and dictLimit to 1 instead, and ensure that we aren't invalidating any of the window.
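The clamping idea can be sketched as follows. This is a simplified illustration, not the code from this PR: the struct `window_sketch_t` and the functions `correct_limit` and `correct_window` are hypothetical names, and the sketch assumes the correction has already been chosen so that no index still inside the window drops below 1.

```c
#include <stdint.h>

/* Hypothetical, simplified window state; the real struct has more fields. */
typedef struct {
    uint32_t lowLimit;
    uint32_t dictLimit;
} window_sketch_t;

/* Subtract the correction, clamping to 1 instead of wrapping around. */
static uint32_t correct_limit(uint32_t limit, uint32_t correction) {
    return (limit <= correction) ? 1u : limit - correction;
}

/* Apply the same clamped correction to both limits. */
static void correct_window(window_sketch_t* w, uint32_t correction) {
    w->lowLimit  = correct_limit(w->lowLimit, correction);
    w->dictLimit = correct_limit(w->dictLimit, correction);
}
```

A limit that is at or below the correction becomes 1, while a larger limit is reduced by the correction as usual.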

I've modified two tests in playTests.sh to trigger overflow correction. Currently they don't trigger it because, after PR #1658, we clear the context instead of performing overflow correction if we are starting within 16 MB of the correction point. Setting a larger window log ensures a larger job size, which doesn't fall within 16 MB of the correction point.

enwik10 now compresses as expected:

> ./zstd enwik10 --ultra -22 -cv | zstd -tq
enwik10              : 20.80%   (10000000000 => 2079998491 bytes, /*stdout*\)

4 participants