zstd: Add Best compression mode #304

Merged: 5 commits, Dec 19, 2020

klauspost (Owner) commented Dec 17, 2020

Add a slow compression mode with the best compression ratio available.

Allows a nice size improvement over "better" mode, but at a significant CPU penalty.

Typical speed is currently around 25 MB/s, though highly compressible input such as the JSON sample below reaches about 58 MB/s.

Some speed examples:

| File | Compressor | Level | Input size (bytes) | Output size (bytes) | Time (ms) | Speed (MB/s) |
|---|---|---|---|---|---|---|
| github-june-2days-2019.json | zskp | 1 | 6273951764 | 699045015 | 11634 | 514.29 |
| github-june-2days-2019.json | zskp | 2 | 6273951764 | 617881763 | 17756 | 336.96 |
| github-june-2days-2019.json | zskp | 3 | 6273951764 | 537511906 | 35785 | 167.20 |
| github-june-2days-2019.json | zskp | 4 | 6273951764 | 512796117 | 103513 | 57.80 |
| silesia.tar | zskp | 1 | 211947520 | 73118028 | 713 | 283.09 |
| silesia.tar | zskp | 2 | 211947520 | 67504318 | 1049 | 192.50 |
| silesia.tar | zskp | 3 | 211947520 | 65102964 | 2466 | 81.93 |
| silesia.tar | zskp | 4 | 211947520 | 61381950 | 8115 | 24.91 |
| enwik8 | zskp | 1 | 100000000 | 39176328 | 408 | 233.17 |
| enwik8 | zskp | 2 | 100000000 | 36036946 | 623 | 152.83 |
| enwik8 | zskp | 3 | 100000000 | 33583681 | 1551 | 61.47 |
| enwik8 | zskp | 4 | 100000000 | 31529660 | 3817 | 24.98 |
| TS40.txt | zskp | 1 | 400000000 | 156408033 | 1857 | 205.42 |
| TS40.txt | zskp | 2 | 400000000 | 144331263 | 2713 | 140.61 |
| TS40.txt | zskp | 3 | 400000000 | 135435550 | 5997 | 63.61 |
| TS40.txt | zskp | 4 | 400000000 | 127940230 | 15748 | 24.22 |

So far this is a fairly naive implementation, but it offers decent improvements.
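To put a number on the trade-off, the level 3 and level 4 rows from the figures above can be compared directly. This is a minimal sketch that assumes level 3 corresponds to "better" mode and level 4 to the new mode; the `row` type and the helper names are made up for illustration and are not part of the package.

```go
package main

import "fmt"

// row pairs the level 3 and level 4 results for one file, copied from the
// benchmark figures in this PR description. The type is hypothetical.
type row struct {
	file    string
	out3    int64 // output size in bytes at level 3
	out4    int64 // output size in bytes at level 4
	millis3 int64 // compression time in ms at level 3
	millis4 int64 // compression time in ms at level 4
}

var rows = []row{
	{"github-june-2days-2019.json", 537511906, 512796117, 35785, 103513},
	{"silesia.tar", 65102964, 61381950, 2466, 8115},
	{"enwik8", 33583681, 31529660, 1551, 3817},
	{"TS40.txt", 135435550, 127940230, 5997, 15748},
}

// sizeSavingPct returns how much smaller the level 4 output is than the
// level 3 output, as a percentage of the level 3 output.
func sizeSavingPct(r row) float64 {
	return 100 * float64(r.out3-r.out4) / float64(r.out3)
}

// timeFactor returns how many times longer level 4 took than level 3.
func timeFactor(r row) float64 {
	return float64(r.millis4) / float64(r.millis3)
}

func main() {
	for _, r := range rows {
		fmt.Printf("%-28s %.2f%% smaller, %.2fx the time\n",
			r.file, sizeSavingPct(r), timeFactor(r))
	}
}
```

On these four files, level 4 output comes out roughly 4.6% to 6.1% smaller than level 3, for roughly 2.5x to 3.3x the compression time.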

Longer term, chaining should probably replace some of the alternative scans, since some of them contribute very little.
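For readers unfamiliar with the term, here is a minimal sketch of a generic hash-chain match finder. It is not the code in this PR and not a proposal for the final design. The constants, type, and function names are all hypothetical and only illustrate the general technique: each hash bucket points at the most recent position, and each position links back to the previous one with the same hash.

```go
package main

import "fmt"

const (
	hashBits = 12 // hypothetical table size: 4096 buckets
	minMatch = 4  // shortest match worth reporting
	maxChain = 16 // hypothetical cap on candidates tried per position
)

// hash4 hashes the 4 bytes starting at src[i] down to hashBits bits.
func hash4(src []byte, i int) uint32 {
	v := uint32(src[i]) | uint32(src[i+1])<<8 | uint32(src[i+2])<<16 | uint32(src[i+3])<<24
	return (v * 2654435761) >> (32 - hashBits)
}

// chainFinder stores the newest position per bucket (head) and, for each
// position, the previous position that hashed to the same bucket (prev).
type chainFinder struct {
	head [1 << hashBits]int32
	prev []int32
}

func newChainFinder(n int) *chainFinder {
	c := &chainFinder{prev: make([]int32, n)}
	for i := range c.head {
		c.head[i] = -1 // -1 marks an empty bucket / end of chain
	}
	return c
}

// insert links position i into the chain for its hash bucket.
func (c *chainFinder) insert(src []byte, i int) {
	if i+minMatch > len(src) {
		return
	}
	h := hash4(src, i)
	c.prev[i] = c.head[h]
	c.head[h] = int32(i)
}

// matchLen counts equal bytes at src[a:] and src[b:], with a < b.
func matchLen(src []byte, a, b int) int {
	n := 0
	for b+n < len(src) && src[a+n] == src[b+n] {
		n++
	}
	return n
}

// longestMatch walks the chain for position i, newest candidate first, and
// returns the longest match of at least minMatch bytes (0, 0 if none).
func (c *chainFinder) longestMatch(src []byte, i int) (offset, length int) {
	if i+minMatch > len(src) {
		return 0, 0
	}
	cand := c.head[hash4(src, i)]
	for depth := 0; cand >= 0 && depth < maxChain; depth++ {
		if l := matchLen(src, int(cand), i); l >= minMatch && l > length {
			offset, length = i-int(cand), l
		}
		cand = c.prev[cand]
	}
	return offset, length
}

func main() {
	src := []byte("abcdXabcdeYabcdeZ")
	c := newChainFinder(len(src))
	for i := 0; i < 11; i++ {
		c.insert(src, i)
	}
	off, l := c.longestMatch(src, 11)
	fmt.Printf("offset=%d length=%d\n", off, l) // prints "offset=6 length=5"
}
```

A single chain walk visits every earlier position that shares a hash, so it can stand in for several separate table lookups. The search depth cap (`maxChain` here) is the knob that trades speed for ratio.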

Add slow compression mode, but with the best compression ratio available

Allows a nice size improvement over "better" mode, but at a significant CPU penalty.

Typical speed is currently around 20MB/s

Some speed examples:
```
github-june-2days-2019.json	zskp	1	6273951764	699045015	11634	514.29
github-june-2days-2019.json	zskp	2	6273951764	617881763	17756	336.96
github-june-2days-2019.json	zskp	3	6273951764	537511906	35785	167.20
github-june-2days-2019.json	zskp	4	6273951764	518319822	102880	58.16

silesia.tar	zskp	1	211947520	73118028	713	283.09
silesia.tar	zskp	2	211947520	67504318	1049	192.50
silesia.tar	zskp	3	211947520	65102964	2466	81.93
silesia.tar	zskp	4	211947520	62790088	8498	23.78

enwik8	zskp	1	100000000	39176328	408	233.17
enwik8	zskp	2	100000000	36036946	623	152.83
enwik8	zskp	3	100000000	33583681	1551	61.47
enwik8	zskp	4	100000000	31601631	4421	21.57

TS40.txt	zskp	1	400000000	156408033	1857	205.42
TS40.txt	zskp	2	400000000	144331263	2713	140.61
TS40.txt	zskp	3	400000000	135435550	5997	63.61
TS40.txt	zskp	4	400000000	127882512	18920	20.16
```
@klauspost klauspost merged commit 156c8d0 into master Dec 19, 2020
@klauspost klauspost deleted the zstd-add-best-compression branch December 19, 2020 14:32