Improve .extend() performance #112

Merged
bluss merged 2 commits into master from extend-improvement
Nov 28, 2018
bluss (Owner) commented Nov 25, 2018

cc #101

I'm experimenting with different formulations. Both the existing implementation and this PR's can compile into a memcpy, as they should, when the input is a cloned slice iterator. The main difficulty seems to be writing a good benchmark. Without any black_box calls, the benchmarks compile out entirely (normally a good sign in itself, since it means the code is transparent to the optimizer); with too many black_box calls, the optimizations are disabled.
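The black_box trade-off described above can be sketched roughly as follows. This is not the benchmark code from the PR: `Vec<u8>` stands in for the crate's vector type so the example is self-contained, and the function name `extend_with_slice_once` is hypothetical. The idea is to hide only the input and the final result from the optimizer and leave the `extend` call itself fully visible.

```rust
use std::hint::black_box;

/// One benchmark iteration: refill `dst` from a cloned slice iterator.
/// `Vec<u8>` is a stand-in for the crate's vector type (an assumption).
pub fn extend_with_slice_once(dst: &mut Vec<u8>, src: &[u8]) {
    dst.clear();
    // Hide the input so the data cannot be constant-folded away.
    let src = black_box(src);
    // The call under test stays visible to the optimizer.
    dst.extend(src.iter().cloned());
    // Hide the result so the writes are not removed as dead stores.
    let _ = black_box(&*dst);
}
```

Wrapping the iterator or each element in `black_box` as well would be an example of "too many" calls, since the optimizer could then no longer see that the loop is a plain copy.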

@bluss bluss changed the title Improve .extend() performance Improve .extend() performance (?) Nov 25, 2018
bluss (Owner, Author) commented Nov 26, 2018

I'm not entirely happy with these benchmarks either (see the code in the PR), but they seem fair. Please review.

 name                  63 ns/iter       62 ns/iter       diff ns/iter   diff % 
 extend_with_constant  294 (1741 MB/s)  1 (512000 MB/s)          -293  -99.66% 
 extend_with_range     426 (1201 MB/s)  289 (1771 MB/s)          -137  -32.16% 
 extend_with_slice     424 (1207 MB/s)  13 (39384 MB/s)          -411  -96.93% 
 extend_with_write     13 (39384 MB/s)  13 (39384 MB/s)             0    0.00%

Obviously, when extend_with_constant optimizes out entirely it doesn't tell us much, except that the new extend code is somehow more transparent to the compiler than the old one.
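As a reading aid for the table above: the MB/s figures are consistent with each iteration processing 512 bytes (this size is inferred from the numbers, not stated in the thread), with throughput taken as bytes * 1000 / ns using integer division. The helper names below are hypothetical, a small sketch for checking the arithmetic.

```rust
/// Throughput in MB/s for `bytes` handled per iteration in `ns_per_iter` ns.
/// Integer division truncates, which matches the figures in the table.
pub fn throughput_mb_s(bytes: u64, ns_per_iter: u64) -> u64 {
    bytes * 1000 / ns_per_iter
}

/// Relative change from `old_ns` to `new_ns`, in percent.
pub fn diff_percent(old_ns: f64, new_ns: f64) -> f64 {
    (new_ns - old_ns) / old_ns * 100.0
}
```

For example, `throughput_mb_s(512, 294)` gives 1741 and `throughput_mb_s(512, 13)` gives 39384, matching the extend_with_constant and extend_with_slice rows under the 512-byte assumption.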

bluss (Owner, Author) commented Nov 28, 2018

Comparison with try_extend_from_slice shows that they both compile to memcpy:

test extend_from_slice    ... bench:          14 ns/iter (+/- 1) = 36571 MB/s
test extend_with_slice    ... bench:          13 ns/iter (+/- 1) = 39384 MB/s

extend_with_slice is the regular extend() used with a slice iterator (benchmark is in the PR).
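The two code paths being compared can be illustrated with a minimal sketch. It uses `Vec<u8>` and its `extend_from_slice` as stand-ins (an assumption made so the example needs no external crate; the crate's own method is `try_extend_from_slice`), and the function names are hypothetical.

```rust
/// Generic path: `Extend::extend` driven by a cloned slice iterator.
pub fn via_extend(src: &[u8]) -> Vec<u8> {
    let mut v = Vec::with_capacity(src.len());
    v.extend(src.iter().cloned());
    v
}

/// Slice-specific path: copy the whole slice in one call.
pub fn via_extend_from_slice(src: &[u8]) -> Vec<u8> {
    let mut v = Vec::with_capacity(src.len());
    v.extend_from_slice(src);
    v
}
```

Both functions produce the same contents; the benchmark result above suggests that, after this PR, the generic path costs about the same as the slice-specific one for this input.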

@bluss bluss merged commit ef7ab56 into master Nov 28, 2018
@bluss bluss changed the title Improve .extend() performance (?) Improve .extend() performance Nov 28, 2018
@bluss bluss deleted the extend-improvement branch November 28, 2018 16:01