zstd: Remove offset from bitReader #854

Merged Aug 18, 2023

@greatroar (Contributor) commented Aug 18, 2023

We can reslice the input instead of keeping a separate offset, which gets rid of some bounds checks. This PR also includes some other micro-optimizations to the bit-reading code. Combined results:

                                                     │   zstd/old   │              zstd/new               │
                                                     │     B/s      │     B/s       vs base               │
Decoder_DecoderSmall/kppkn.gtb.zst/buffered-8          427.6Mi ± 0%   428.2Mi ± 0%  +0.13% (p=0.019 n=10)
Decoder_DecoderSmall/kppkn.gtb.zst/unbuffered-8        511.6Mi ± 3%   516.9Mi ± 3%       ~ (p=0.280 n=10)
Decoder_DecoderSmall/geo.protodata.zst/buffered-8      1.110Gi ± 0%   1.110Gi ± 0%       ~ (p=0.165 n=10)
Decoder_DecoderSmall/geo.protodata.zst/unbuffered-8    824.7Mi ± 2%   827.3Mi ± 2%       ~ (p=0.481 n=10)
Decoder_DecoderSmall/plrabn12.txt.zst/buffered-8       330.4Mi ± 0%   330.3Mi ± 1%       ~ (p=0.645 n=10)
Decoder_DecoderSmall/plrabn12.txt.zst/unbuffered-8     533.3Mi ± 4%   538.8Mi ± 5%       ~ (p=0.393 n=10)
Decoder_DecoderSmall/lcet10.txt.zst/buffered-8         395.0Mi ± 0%   394.6Mi ± 0%  -0.10% (p=0.034 n=10)
Decoder_DecoderSmall/lcet10.txt.zst/unbuffered-8       556.5Mi ± 6%   546.2Mi ± 8%       ~ (p=0.436 n=10)
Decoder_DecoderSmall/asyoulik.txt.zst/buffered-8       342.2Mi ± 0%   342.2Mi ± 0%       ~ (p=0.956 n=10)
Decoder_DecoderSmall/asyoulik.txt.zst/unbuffered-8     436.7Mi ± 2%   435.4Mi ± 3%       ~ (p=0.739 n=10)
Decoder_DecoderSmall/alice29.txt.zst/buffered-8        335.6Mi ± 2%   337.0Mi ± 0%  +0.43% (p=0.000 n=10)
Decoder_DecoderSmall/alice29.txt.zst/unbuffered-8      552.6Mi ± 3%   550.7Mi ± 4%       ~ (p=1.000 n=10)
Decoder_DecoderSmall/html_x_4.zst/buffered-8           2.264Gi ± 0%   2.271Gi ± 0%  +0.29% (p=0.035 n=10)
Decoder_DecoderSmall/html_x_4.zst/unbuffered-8         1.558Gi ± 4%   1.554Gi ± 3%       ~ (p=0.579 n=10)
Decoder_DecoderSmall/paper-100k.pdf.zst/buffered-8     3.554Gi ± 5%   3.610Gi ± 0%  +1.59% (p=0.000 n=10)
Decoder_DecoderSmall/paper-100k.pdf.zst/unbuffered-8   1.701Gi ± 8%   1.709Gi ± 5%       ~ (p=0.631 n=10)
Decoder_DecoderSmall/fireworks.jpeg.zst/buffered-8     7.891Gi ± 4%   8.070Gi ± 0%  +2.26% (p=0.000 n=10)
Decoder_DecoderSmall/fireworks.jpeg.zst/unbuffered-8   3.062Gi ± 4%   3.129Gi ± 2%  +2.16% (p=0.002 n=10)
Decoder_DecoderSmall/urls.10K.zst/buffered-8           525.4Mi ± 6%   553.8Mi ± 0%  +5.39% (p=0.000 n=10)
Decoder_DecoderSmall/urls.10K.zst/unbuffered-8         763.7Mi ± 6%   819.7Mi ± 2%  +7.34% (p=0.000 n=10)
Decoder_DecoderSmall/html.zst/buffered-8               894.8Mi ± 0%   898.8Mi ± 2%  +0.45% (p=0.043 n=10)
Decoder_DecoderSmall/html.zst/unbuffered-8             722.3Mi ± 2%   717.7Mi ± 2%       ~ (p=0.912 n=10)
Decoder_DecoderSmall/comp-data.bin.zst/buffered-8      386.6Mi ± 2%   390.4Mi ± 0%  +1.00% (p=0.000 n=10)
Decoder_DecoderSmall/comp-data.bin.zst/unbuffered-8    145.2Mi ± 2%   148.7Mi ± 1%  +2.42% (p=0.003 n=10)
geomean                                                770.3Mi        777.5Mi       +0.93%
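
The reslicing idea described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this PR: the types `offsetReader` and `resliceReader` and the method `next4` are hypothetical names, and the sketch only assumes that the reader consumes its input four bytes at a time from the end toward the start.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// offsetReader is a hypothetical "before" shape: the full input is kept
// and a separate cursor tracks how many bytes are still unread.
type offsetReader struct {
	in  []byte
	off int // number of unread bytes; reads consume from the end
}

// next4 returns the last four unread bytes as a little-endian uint32.
// Both slice bounds depend on off, which is tracked separately from len(in).
func (r *offsetReader) next4() uint32 {
	v := binary.LittleEndian.Uint32(r.in[r.off-4 : r.off])
	r.off -= 4
	return v
}

// resliceReader is a hypothetical "after" shape: the slice itself is
// shortened after every read, so len(in) is the only cursor.
type resliceReader struct {
	in []byte
}

// next4 returns the last four bytes of the slice and then drops them.
func (r *resliceReader) next4() uint32 {
	n := len(r.in) - 4
	v := binary.LittleEndian.Uint32(r.in[n:])
	r.in = r.in[:n]
	return v
}

func main() {
	data := []byte{1, 2, 3, 4, 5, 6, 7, 8}
	a := &offsetReader{in: data, off: len(data)}
	b := &resliceReader{in: data}
	for i := 0; i < 2; i++ {
		// Both readers should yield the same values in the same order.
		fmt.Printf("%#08x %#08x\n", a.next4(), b.next4())
	}
}
```

Whether the compiler actually drops a particular bounds check depends on the Go version and the surrounding code. One way to inspect this is to build with `-gcflags="-d=ssa/check_bce/debug=1"`, which reports the bounds checks that remain.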

@klauspost (Owner) left a comment

👍🏼 Looks good! Thanks!

@klauspost merged commit 0836a1c into klauspost:master on Aug 18, 2023
18 checks passed
kodiakhq bot referenced this pull request in cloudquery/filetypes Oct 1, 2023
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/klauspost/compress](https://github.com/klauspost/compress) | indirect | minor | `v1.16.7` -> `v1.17.0` |

---

### Release Notes

<details>
<summary>klauspost/compress (github.com/klauspost/compress)</summary>

### [`v1.17.0`](https://github.com/klauspost/compress/releases/tag/v1.17.0)

[Compare Source](https://github.com/klauspost/compress/compare/v1.16.7...v1.17.0)

#### What's Changed

-   Add dictionary builder by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/853](https://github.com/klauspost/compress/pull/853)
-   Add xerial snappy read/writer by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/838](https://github.com/klauspost/compress/pull/838)
-   flate: Add limited window compression by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/843](https://github.com/klauspost/compress/pull/843)
-   s2: Do 2 overlapping match checks by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/839](https://github.com/klauspost/compress/pull/839)
-   flate: Add amd64 assembly matchlen by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/837](https://github.com/klauspost/compress/pull/837)
-   gzip: Copy bufio.Reader on Reset by [@thatguystone](https://github.com/thatguystone) in [https://github.com/klauspost/compress/pull/860](https://github.com/klauspost/compress/pull/860)
-   zstd: Remove offset from bitReader by [@greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/854](https://github.com/klauspost/compress/pull/854)
-   fse, huff0, zstd: Remove always-nil error returns by [@greatroar](https://github.com/greatroar) in [https://github.com/klauspost/compress/pull/857](https://github.com/klauspost/compress/pull/857)
-   tests: unnecessary use of fmt.Sprintf by [@testwill](https://github.com/testwill) in [https://github.com/klauspost/compress/pull/836](https://github.com/klauspost/compress/pull/836)
-   tests: Fix OSS fuzzer t.Run by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/852](https://github.com/klauspost/compress/pull/852)
-   tests: Use Go 1.21.x by [@klauspost](https://github.com/klauspost) in [https://github.com/klauspost/compress/pull/851](https://github.com/klauspost/compress/pull/851)

#### New Contributors

-   [@testwill](https://github.com/testwill) made their first contribution in [https://github.com/klauspost/compress/pull/836](https://github.com/klauspost/compress/pull/836)
-   [@thatguystone](https://github.com/thatguystone) made their first contribution in [https://github.com/klauspost/compress/pull/860](https://github.com/klauspost/compress/pull/860)

**Full Changelog**: klauspost/compress@v1.16.7...v1.17.0

</details>

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).