Optimize ebml::reader::vuint_at() #11498

Merged: 4 commits, Jan 17, 2014

c-a (Contributor) commented Jan 12, 2014

Use a lookup table, SHIFT_MASK_TABLE, that holds, for every possible
four-bit prefix, the number of bits the value should be right-shifted by and
the mask to apply to the shifted value. This removes the branches, which in
my testing gives approximately a 2x speedup.
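
To make the table lookup concrete, here is a minimal sketch in present-day Rust rather than the 2014 dialect used in this PR. It assumes the usual EBML variable-length layout, in which the position of the first set bit in the leading byte gives the encoded length. The function name `vuint_at_table`, the field layout of `Res`, and the exact table entries are illustrative assumptions, not code taken from the patch.

```rust
/// Hypothetical result type: the decoded value and the index of the next unread byte.
#[derive(Debug, PartialEq)]
pub struct Res {
    pub val: usize,
    pub next: usize,
}

/// One (shift, mask) pair per possible value of the top four bits.
/// Assumed layout: the first set bit of the leading byte marks the length.
const SHIFT_MASK_TABLE: [(u32, u32); 16] = [
    (0, 0x0),         // 0000: no length marker in the prefix, decodes to 0 here
    (0, 0x0fff_ffff), // 0001: four-byte encoding, 28 value bits
    (8, 0x1f_ffff),   // 001x: three-byte encoding, 21 value bits
    (8, 0x1f_ffff),
    (16, 0x3fff),     // 01xx: two-byte encoding, 14 value bits
    (16, 0x3fff),
    (16, 0x3fff),
    (16, 0x3fff),
    (24, 0x7f),       // 1xxx: one-byte encoding, 7 value bits
    (24, 0x7f),
    (24, 0x7f),
    (24, 0x7f),
    (24, 0x7f),
    (24, 0x7f),
    (24, 0x7f),
    (24, 0x7f),
];

/// Decode a variable-length unsigned integer with a single table lookup.
/// Panics if fewer than four bytes are available at `start`.
pub fn vuint_at_table(data: &[u8], start: usize) -> Res {
    // Read four bytes as one big-endian word.
    let word = u32::from_be_bytes([
        data[start],
        data[start + 1],
        data[start + 2],
        data[start + 3],
    ]);
    // The top four bits select the shift and mask; no per-length branch is needed.
    let (shift, mask) = SHIFT_MASK_TABLE[(word >> 28) as usize];
    Res {
        val: ((word >> shift) & mask) as usize,
        // (32 - shift) is the number of bits consumed; divide by 8 for bytes.
        next: start + ((32 - shift) >> 3) as usize,
    }
}
```

With this sketch, the bytes `0x40 0x80` followed by two padding bytes decode to the value 128 with `next` equal to 2, because the prefix `0100` selects the two-byte entry `(16, 0x3fff)`.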

Timings on Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz

-- Before --
running 5 tests
test ebml::tests::test_vuint_at ... ok
test ebml::bench::vuint_at_A_aligned ... bench: 494 ns/iter (+/- 3)
test ebml::bench::vuint_at_A_unaligned ... bench: 494 ns/iter (+/- 4)
test ebml::bench::vuint_at_D_aligned ... bench: 467 ns/iter (+/- 5)
test ebml::bench::vuint_at_D_unaligned ... bench: 467 ns/iter (+/- 5)

-- After --
running 5 tests
test ebml::tests::test_vuint_at ... ok
test ebml::bench::vuint_at_A_aligned ... bench: 181 ns/iter (+/- 2)
test ebml::bench::vuint_at_A_unaligned ... bench: 192 ns/iter (+/- 1)
test ebml::bench::vuint_at_D_aligned ... bench: 181 ns/iter (+/- 3)
test ebml::bench::vuint_at_D_unaligned ... bench: 197 ns/iter (+/- 6)

Since reader::vuint_at() returns a result of type reader::Res, it makes sense
to make reader::Res public as well.

Due to Rust's current handling of externally referenced private structures
(rust-lang#10573), you could still use the result and assign it to a variable
if you let the compiler infer the type, but you could not explicitly annotate
a variable as a reader::Res.
@@ -130,32 +130,32 @@ pub mod reader {
return vuint_at_slow(data, start);
}

static shift_table: [uint, ..16] = [

Conventionally these would be SHIFT_TABLE and MASK_TABLE.

c-a (Contributor, Author) commented Jan 12, 2014

New measurement storing mask and shift in a single table:
-- (u32, u32) tuple --
running 5 tests
test ebml::tests::test_vuint_at ... ok
test ebml::bench::vuint_at_A_aligned ... bench: 181 ns/iter (+/- 2)
test ebml::bench::vuint_at_A_unaligned ... bench: 192 ns/iter (+/- 1)
test ebml::bench::vuint_at_D_aligned ... bench: 181 ns/iter (+/- 3)
test ebml::bench::vuint_at_D_unaligned ... bench: 197 ns/iter (+/- 6)

And, for academic interest, without the bounds check at the top of the function:
-- Without fallback to vuint_at_slow --
running 5 tests
test ebml::tests::test_vuint_at ... ok
test ebml::bench::vuint_at_A_aligned ... bench: 44 ns/iter (+/- 1)
test ebml::bench::vuint_at_A_unaligned ... bench: 40 ns/iter (+/- 1)
test ebml::bench::vuint_at_D_aligned ... bench: 40 ns/iter (+/- 1)
test ebml::bench::vuint_at_D_unaligned ... bench: 44 ns/iter (+/- 1)
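
For readers unfamiliar with the structure being measured, the following sketch in present-day Rust shows how a length check at the top of the function can route short inputs to a byte-at-a-time decoder while longer inputs take the table-driven path. The diff shows only the call `return vuint_at_slow(data, start);`, so the guard condition, the body of `vuint_at_slow`, the `Option` return type, and the `Res` fields used here are all assumptions made for illustration.

```rust
/// Hypothetical result type: the decoded value and the index of the next unread byte.
#[derive(Debug, PartialEq)]
pub struct Res {
    pub val: usize,
    pub next: usize,
}

/// Assumed (shift, mask) table indexed by the top four bits of a big-endian word.
const SHIFT_MASK_TABLE: [(u32, u32); 16] = [
    (0, 0x0), (0, 0x0fff_ffff),
    (8, 0x1f_ffff), (8, 0x1f_ffff),
    (16, 0x3fff), (16, 0x3fff), (16, 0x3fff), (16, 0x3fff),
    (24, 0x7f), (24, 0x7f), (24, 0x7f), (24, 0x7f),
    (24, 0x7f), (24, 0x7f), (24, 0x7f), (24, 0x7f),
];

/// Branchy decoder that reads one byte at a time, so it is safe near the end of `data`.
pub fn vuint_at_slow(data: &[u8], start: usize) -> Option<Res> {
    let a = *data.get(start)? as usize;
    if a & 0x80 != 0 {
        return Some(Res { val: a & 0x7f, next: start + 1 });
    }
    let b = *data.get(start + 1)? as usize;
    if a & 0x40 != 0 {
        return Some(Res { val: ((a & 0x3f) << 8) | b, next: start + 2 });
    }
    let c = *data.get(start + 2)? as usize;
    if a & 0x20 != 0 {
        return Some(Res { val: ((a & 0x1f) << 16) | (b << 8) | c, next: start + 3 });
    }
    let d = *data.get(start + 3)? as usize;
    if a & 0x10 != 0 {
        let val = ((a & 0x0f) << 24) | (b << 16) | (c << 8) | d;
        return Some(Res { val, next: start + 4 });
    }
    None // no length marker in the top four bits
}

/// Table-driven decoder guarded by a length check that falls back to the slow path.
pub fn vuint_at(data: &[u8], start: usize) -> Option<Res> {
    // Fewer than four readable bytes: the four-byte load below would be out of bounds.
    if data.len().saturating_sub(start) < 4 {
        return vuint_at_slow(data, start);
    }
    let word = u32::from_be_bytes([
        data[start],
        data[start + 1],
        data[start + 2],
        data[start + 3],
    ]);
    let prefix = (word >> 28) as usize;
    if prefix == 0 {
        return None; // same rejection as the slow path
    }
    let (shift, mask) = SHIFT_MASK_TABLE[prefix];
    Some(Res {
        val: ((word >> shift) & mask) as usize,
        next: start + ((32 - shift) >> 3) as usize,
    })
}
```

In this sketch both paths return the same result for any input that contains a complete encoding, so the guard only decides which decoder does the work.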

alexcrichton (Member) commented:

This looks pretty awesome, thanks!

Could you add some comments to the lookup table as to why it exists and what the values/rows signify?

brson (Contributor) commented Jan 13, 2014

Very cool. What kind of impact does this have on rustc compiles?

brson (Contributor) commented Jan 13, 2014

You may also be interested in #9303

bors closed this Jan 17, 2014
bors merged commit f4c9ed4 into rust-lang:master Jan 17, 2014