-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove hot calls to memmove
by replacing Vec<u8>
with VecDeque<u8>
#17
Remove hot calls to memmove
by replacing Vec<u8>
with VecDeque<u8>
#17
Conversation
Thats great! I'll take a look later today and merge this I wonder how I missed that. I seem to remember that the code taking the most time was the bitreader stuff |
I guess that might have been caused by me trying ruzstd on a 600 MB file compressed with |
The TODO around the If you don't want to do that I can merge this and do it myself when I find the time. The code is still correct, it is just being pretty inefficient in these cases. Otherwise just a few nitpicks,
So (imo) it should be f(slice1)?;
hash(slice1);
if !slice2.is_empty() {
f(slice2)?;
hash(slice2);
}
|
Done. Unfortunately The size for |
Ah right, I remembered why VecDequeue cant implement this nicely. I might play around with a custom buffer structure for this. We don't need all the features a VecDequeue provides. But that is for another day, I'd call this good enough. I too am not sure what a good size for that buffer is, but 4kb seems fine. One last thing just came to my mind, regarding the functions in drain_to. Right now it is like this: f(slice1)?;
hash(slice1);
if !slice2.is_empty() {
f(slice2)?;
hash(slice2);
}
drain(amount); When we commit slice1 to the hasher, we should also drain that. So when f(slice2) returns an error, we want to still call f(slice1)?;
hash(slice1);
if n2 != 0 {
let res = f(slice2);
let drain_amount;
if res.is_ok() {
hash(slice2);
drain_amount = n1+n2;
} else {
drain_amount = n1;
}
drain(drain_amount)
res
} else {
drain(n1);
Ok(())
} Sorry for doing another roundtrip but I think it's good to fix this in this PR too. The drain_to is a very good idea, and having it do the right thing when it's merged in would be nice :) |
I'll take a look at your proposal later in the day. This PR just went by in my notifications and it could be very cool to leverage this implementation in the future if it gets optimized more. rust-lang/rust#95632 |
Done. Using |
Very nice. Thank you! |
…=the8472 Add VecDeque::extend from vec::IntoIter and slice::Iter specializations Inspired from the [`Vec` `SpecExtend` implementation](https://github.com/rust-lang/rust/blob/027a232755fa9728e9699337267f6675dfd0a8ba/library/alloc/src/vec/spec_extend.rs), but without the specialization for `TrustedLen` which I'll look into in the future. Should help rust-lang#95632 and KillingSpark/zstd-rs#17 ## Benchmarks Before ``` test vec_deque::bench_extend_bytes ... bench: 862 ns/iter (+/- 10) test vec_deque::bench_extend_vec ... bench: 883 ns/iter (+/- 19) ``` After ``` test vec_deque::bench_extend_bytes ... bench: 8 ns/iter (+/- 0) test vec_deque::bench_extend_vec ... bench: 24 ns/iter (+/- 1) ```
Add VecDeque::extend from vec::IntoIter and slice::Iter specializations Inspired from the [`Vec` `SpecExtend` implementation](https://github.com/rust-lang/rust/blob/027a232755fa9728e9699337267f6675dfd0a8ba/library/alloc/src/vec/spec_extend.rs), but without the specialization for `TrustedLen` which I'll look into in the future. Should help #95632 and KillingSpark/zstd-rs#17 ## Benchmarks Before ``` test vec_deque::bench_extend_bytes ... bench: 862 ns/iter (+/- 10) test vec_deque::bench_extend_vec ... bench: 883 ns/iter (+/- 19) ``` After ``` test vec_deque::bench_extend_bytes ... bench: 8 ns/iter (+/- 0) test vec_deque::bench_extend_vec ... bench: 24 ns/iter (+/- 1) ```
This removes the need to constantly
drain
, which is one of the hottest functions on the flamegraph.From a quick benchmark on a large file decoding goes from about 10.5s to 3.5s. As a comparison zstd 1.4.8 takes 1.2s. We're getting closer 👀
Before
After