Cargo.toml: use codegen-units = 1 in release and bench profiles. #1192

Merged Feb 25, 2018 (1 commit)


matthiaskrgr
Contributor

@matthiaskrgr matthiaskrgr commented Feb 24, 2018

In nightly, lto = true no longer implies codegen-units = 1. Because splitting a crate into several codegen units (CGUs) may prevent some optimizations, this PR forces codegen-units to 1 in the release and bench profiles.
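For illustration, the profile sections described above might look roughly like the fragment below. This is only a sketch: the actual diff is not shown in this thread, the presence of lto = true in both profiles is an assumption, and the real Cargo.toml may carry other keys.

```toml
# Hypothetical excerpt of Cargo.toml; only codegen-units = 1 is confirmed by the PR title.
[profile.release]
lto = true          # assumed to be set already
codegen-units = 1   # compile the crate as a single codegen unit

[profile.bench]
lto = true          # assumed to be set already
codegen-units = 1
```

A single codegen unit generally trades longer compile times for more optimization opportunities, which is why it is limited here to the release and bench profiles.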

cargo bench pre-patch:


     Running target/release/deps/01_default-e8a3ee34fccb9d17

running 2 tests
test build_app   ... bench:          65 ns/iter (+/- 2)
test parse_clean ... bench:         505 ns/iter (+/- 11)

test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured; 0 filtered out

     Running target/release/deps/02_simple-7f768641c259a9fc

running 12 tests
test add_flag         ... bench:         197 ns/iter (+/- 5)
test add_flag_ref     ... bench:         229 ns/iter (+/- 20)
test add_opt          ... bench:         272 ns/iter (+/- 18)
test add_opt_ref      ... bench:         330 ns/iter (+/- 6)
test add_pos          ... bench:         224 ns/iter (+/- 5)
test add_pos_ref      ... bench:         265 ns/iter (+/- 2)
test build_app        ... bench:         857 ns/iter (+/- 47)
test parse_clean      ... bench:       1,597 ns/iter (+/- 117)
test parse_complex    ... bench:       3,093 ns/iter (+/- 253)
test parse_flag       ... bench:       2,052 ns/iter (+/- 80)
test parse_option     ... bench:       2,275 ns/iter (+/- 80)
test parse_positional ... bench:       2,127 ns/iter (+/- 129)

test result: ok. 0 passed; 0 failed; 0 ignored; 12 measured; 0 filtered out

     Running target/release/deps/03_complex-799c01e17c9124c8

running 15 tests
test create_app_builder                  ... bench:       2,962 ns/iter (+/- 151)
test create_app_from_usage               ... bench:       3,842 ns/iter (+/- 168)
test create_app_macros                   ... bench:       2,892 ns/iter (+/- 49)
test parse_clean                         ... bench:       4,875 ns/iter (+/- 180)
test parse_complex1                      ... bench:       9,166 ns/iter (+/- 181)
test parse_complex2                      ... bench:      10,262 ns/iter (+/- 1,125)
test parse_complex2_with_args_negate_scs ... bench:       9,995 ns/iter (+/- 2,032)
test parse_flag                          ... bench:       5,596 ns/iter (+/- 186)
test parse_option                        ... bench:       5,850 ns/iter (+/- 790)
test parse_positional                    ... bench:       5,855 ns/iter (+/- 498)
test parse_sc_clean                      ... bench:       6,476 ns/iter (+/- 191)
test parse_sc_complex                    ... bench:       8,259 ns/iter (+/- 262)
test parse_sc_flag                       ... bench:       7,185 ns/iter (+/- 165)
test parse_sc_option                     ... bench:       7,181 ns/iter (+/- 279)
test parse_sc_positional                 ... bench:       6,985 ns/iter (+/- 150)

test result: ok. 0 passed; 0 failed; 0 ignored; 15 measured; 0 filtered out

     Running target/release/deps/04_new_help-061eb7a6c787297b

running 10 tests
test example1          ... bench:      15,247 ns/iter (+/- 466)
test example10         ... bench:       6,105 ns/iter (+/- 130)
test example2          ... bench:       2,112 ns/iter (+/- 752)
test example3          ... bench:      14,484 ns/iter (+/- 527)
test example4          ... bench:       8,725 ns/iter (+/- 140)
test example4_template ... bench:       9,092 ns/iter (+/- 254)
test example5          ... bench:       5,383 ns/iter (+/- 135)
test example6          ... bench:       4,469 ns/iter (+/- 182)
test example7          ... bench:       8,443 ns/iter (+/- 274)
test example8          ... bench:       8,448 ns/iter (+/- 453)

test result: ok. 0 passed; 0 failed; 0 ignored; 10 measured; 0 filtered out

     Running target/release/deps/05_ripgrep-77cdada62ffe8049

running 7 tests
test build_app_long   ... bench:      13,207 ns/iter (+/- 297)
test build_app_short  ... bench:      13,187 ns/iter (+/- 137)
test build_help_long  ... bench:     191,834 ns/iter (+/- 11,202)
test build_help_short ... bench:      91,475 ns/iter (+/- 1,159)
test parse_clean      ... bench:      15,361 ns/iter (+/- 217)
test parse_complex    ... bench:      22,615 ns/iter (+/- 8,019)
test parse_lots       ... bench:     382,522 ns/iter (+/- 12,048)

test result: ok. 0 passed; 0 failed; 0 ignored; 7 measured; 0 filtered out

     Running target/release/deps/06_rustup-9eb8997af5eb304a

running 3 tests
test build_app         ... bench:      14,891 ns/iter (+/- 310)
test parse_clean       ... bench:      17,349 ns/iter (+/- 1,938)
test parse_subcommands ... bench:      17,510 ns/iter (+/- 3,767)

test result: ok. 0 passed; 0 failed; 0 ignored; 3 measured; 0 filtered out

cargo bench post-patch:

     Running target/release/deps/01_default-a3fff6cc41f7cd18

running 2 tests
test build_app   ... bench:          61 ns/iter (+/- 3)
test parse_clean ... bench:         485 ns/iter (+/- 16)

test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured; 0 filtered out

     Running target/release/deps/02_simple-c31470c7956fab22

running 12 tests
test add_flag         ... bench:         178 ns/iter (+/- 3)
test add_flag_ref     ... bench:         214 ns/iter (+/- 4)
test add_opt          ... bench:         249 ns/iter (+/- 5)
test add_opt_ref      ... bench:         315 ns/iter (+/- 2)
test add_pos          ... bench:         220 ns/iter (+/- 4)
test add_pos_ref      ... bench:         261 ns/iter (+/- 7)
test build_app        ... bench:         827 ns/iter (+/- 10)
test parse_clean      ... bench:       1,581 ns/iter (+/- 55)
test parse_complex    ... bench:       2,979 ns/iter (+/- 76)
test parse_flag       ... bench:       2,002 ns/iter (+/- 49)
test parse_option     ... bench:       2,263 ns/iter (+/- 51)
test parse_positional ... bench:       2,128 ns/iter (+/- 137)

test result: ok. 0 passed; 0 failed; 0 ignored; 12 measured; 0 filtered out

     Running target/release/deps/03_complex-7b0c3062ef03e171

running 15 tests
test create_app_builder                  ... bench:       2,888 ns/iter (+/- 104)
test create_app_from_usage               ... bench:       3,762 ns/iter (+/- 60)
test create_app_macros                   ... bench:       2,732 ns/iter (+/- 110)
test parse_clean                         ... bench:       4,862 ns/iter (+/- 97)
test parse_complex1                      ... bench:       9,493 ns/iter (+/- 2,559)
test parse_complex2                      ... bench:      10,110 ns/iter (+/- 460)
test parse_complex2_with_args_negate_scs ... bench:       9,965 ns/iter (+/- 1,339)
test parse_flag                          ... bench:       5,512 ns/iter (+/- 205)
test parse_option                        ... bench:       5,800 ns/iter (+/- 243)
test parse_positional                    ... bench:       5,801 ns/iter (+/- 111)
test parse_sc_clean                      ... bench:       6,358 ns/iter (+/- 821)
test parse_sc_complex                    ... bench:       8,012 ns/iter (+/- 439)
test parse_sc_flag                       ... bench:       7,087 ns/iter (+/- 231)
test parse_sc_option                     ... bench:       7,393 ns/iter (+/- 420)
test parse_sc_positional                 ... bench:       6,756 ns/iter (+/- 228)

test result: ok. 0 passed; 0 failed; 0 ignored; 15 measured; 0 filtered out

     Running target/release/deps/04_new_help-8e7ea3ef33e99347

running 10 tests
test example1          ... bench:      15,744 ns/iter (+/- 823)
test example10         ... bench:       5,923 ns/iter (+/- 373)
test example2          ... bench:       2,109 ns/iter (+/- 255)
test example3          ... bench:      14,431 ns/iter (+/- 1,462)
test example4          ... bench:       8,705 ns/iter (+/- 295)
test example4_template ... bench:       9,048 ns/iter (+/- 361)
test example5          ... bench:       5,173 ns/iter (+/- 334)
test example6          ... bench:       4,444 ns/iter (+/- 953)
test example7          ... bench:       8,070 ns/iter (+/- 795)
test example8          ... bench:       8,055 ns/iter (+/- 205)

test result: ok. 0 passed; 0 failed; 0 ignored; 10 measured; 0 filtered out

     Running target/release/deps/05_ripgrep-0564c3266bd81ef0

running 7 tests
test build_app_long   ... bench:      12,683 ns/iter (+/- 424)
test build_app_short  ... bench:      12,820 ns/iter (+/- 467)
test build_help_long  ... bench:     226,031 ns/iter (+/- 4,502)
test build_help_short ... bench:     100,864 ns/iter (+/- 1,933)
test parse_clean      ... bench:      14,194 ns/iter (+/- 383)
test parse_complex    ... bench:      21,188 ns/iter (+/- 258)
test parse_lots       ... bench:     390,230 ns/iter (+/- 5,028)

test result: ok. 0 passed; 0 failed; 0 ignored; 7 measured; 0 filtered out

     Running target/release/deps/06_rustup-202350efb773e7a2

running 3 tests
test build_app         ... bench:      15,119 ns/iter (+/- 424)
test parse_clean       ... bench:      17,213 ns/iter (+/- 512)
test parse_subcommands ... bench:      17,355 ns/iter (+/- 373)

test result: ok. 0 passed; 0 failed; 0 ignored; 3 measured; 0 filtered out



@mention-bot

@matthiaskrgr, thanks for your PR! By analyzing the history of the files in this pull request, we identified @kbknapp to be a potential reviewer.

@kbknapp
Member

kbknapp commented Feb 25, 2018

Interesting, I've only been vaguely following the CGU changes. The pre/post benches seem to suggest the changes decrease performance though?

@matthiaskrgr
Contributor Author

Hm, the way I read it, ns/iter means nanoseconds per iteration, so a lower number means we gain performance.

The ripgrep tests build_help_long and build_help_short seemed to regress, though. Do the build tests actually call a compiler?

@kbknapp
Member

kbknapp commented Feb 25, 2018

Wow, I'm not sure what exactly I was looking at, you're correct 😜

Just for SA here's a merged view of them:

Merged Pre/Post Bench

post add_flag         ... bench:         178 ns/iter (+/- 3)
pre  add_flag         ... bench:         197 ns/iter (+/- 5)
post add_flag_ref     ... bench:         214 ns/iter (+/- 4)
pre  add_flag_ref     ... bench:         229 ns/iter (+/- 20)
post add_opt          ... bench:         249 ns/iter (+/- 5)
pre  add_opt          ... bench:         272 ns/iter (+/- 18)
post add_opt_ref      ... bench:         315 ns/iter (+/- 2)
pre  add_opt_ref      ... bench:         330 ns/iter (+/- 6)
post add_pos          ... bench:         220 ns/iter (+/- 4)
pre  add_pos          ... bench:         224 ns/iter (+/- 5)
post add_pos_ref      ... bench:         261 ns/iter (+/- 7)
pre  add_pos_ref      ... bench:         265 ns/iter (+/- 2)
pre  build_app         ... bench:      14,891 ns/iter (+/- 310)
post build_app         ... bench:      15,119 ns/iter (+/- 424)
post build_app   ... bench:          61 ns/iter (+/- 3)
pre  build_app   ... bench:          65 ns/iter (+/- 2)
post build_app        ... bench:         827 ns/iter (+/- 10)
pre  build_app        ... bench:         857 ns/iter (+/- 47)
post build_app_long   ... bench:      12,683 ns/iter (+/- 424)
pre  build_app_long   ... bench:      13,207 ns/iter (+/- 297)
post build_app_short  ... bench:      12,820 ns/iter (+/- 467)
pre  build_app_short  ... bench:      13,187 ns/iter (+/- 137)
pre  build_help_long  ... bench:     191,834 ns/iter (+/- 11,202)
post build_help_long  ... bench:     226,031 ns/iter (+/- 4,502)
post build_help_short ... bench:     100,864 ns/iter (+/- 1,933)
pre  build_help_short ... bench:      91,475 ns/iter (+/- 1,159)
post create_app_builder                  ... bench:       2,888 ns/iter (+/- 104)
pre  create_app_builder                  ... bench:       2,962 ns/iter (+/- 151)
post create_app_from_usage               ... bench:       3,762 ns/iter (+/- 60)
pre  create_app_from_usage               ... bench:       3,842 ns/iter (+/- 168)
post create_app_macros                   ... bench:       2,732 ns/iter (+/- 110)
pre  create_app_macros                   ... bench:       2,892 ns/iter (+/- 49)
post example10         ... bench:       5,923 ns/iter (+/- 373)
pre  example10         ... bench:       6,105 ns/iter (+/- 130)
pre  example1          ... bench:      15,247 ns/iter (+/- 466)
post example1          ... bench:      15,744 ns/iter (+/- 823)
post example2          ... bench:       2,109 ns/iter (+/- 255)
pre  example2          ... bench:       2,112 ns/iter (+/- 752)
post example3          ... bench:      14,431 ns/iter (+/- 1,462)
pre  example3          ... bench:      14,484 ns/iter (+/- 527)
post example4          ... bench:       8,705 ns/iter (+/- 295)
pre  example4          ... bench:       8,725 ns/iter (+/- 140)
post example4_template ... bench:       9,048 ns/iter (+/- 361)
pre  example4_template ... bench:       9,092 ns/iter (+/- 254)
post example5          ... bench:       5,173 ns/iter (+/- 334)
pre  example5          ... bench:       5,383 ns/iter (+/- 135)
post example6          ... bench:       4,444 ns/iter (+/- 953)
pre  example6          ... bench:       4,469 ns/iter (+/- 182)
post example7          ... bench:       8,070 ns/iter (+/- 795)
pre  example7          ... bench:       8,443 ns/iter (+/- 274)
post example8          ... bench:       8,055 ns/iter (+/- 205)
pre  example8          ... bench:       8,448 ns/iter (+/- 453)
post parse_clean      ... bench:      14,194 ns/iter (+/- 383)
pre  parse_clean      ... bench:      15,361 ns/iter (+/- 217)
post parse_clean      ... bench:       1,581 ns/iter (+/- 55)
pre  parse_clean      ... bench:       1,597 ns/iter (+/- 117)
post parse_clean       ... bench:      17,213 ns/iter (+/- 512)
pre  parse_clean       ... bench:      17,349 ns/iter (+/- 1,938)
post parse_clean ... bench:         485 ns/iter (+/- 16)
post parse_clean                         ... bench:       4,862 ns/iter (+/- 97)
pre  parse_clean                         ... bench:       4,875 ns/iter (+/- 180)
pre  parse_clean ... bench:         505 ns/iter (+/- 11)
pre  parse_complex1                      ... bench:       9,166 ns/iter (+/- 181)
post parse_complex1                      ... bench:       9,493 ns/iter (+/- 2,559)
post parse_complex2                      ... bench:      10,110 ns/iter (+/- 460)
pre  parse_complex2                      ... bench:      10,262 ns/iter (+/- 1,125)
post parse_complex2_with_args_negate_scs ... bench:       9,965 ns/iter (+/- 1,339)
pre  parse_complex2_with_args_negate_scs ... bench:       9,995 ns/iter (+/- 2,032)
post parse_complex    ... bench:      21,188 ns/iter (+/- 258)
pre  parse_complex    ... bench:      22,615 ns/iter (+/- 8,019)
post parse_complex    ... bench:       2,979 ns/iter (+/- 76)
pre  parse_complex    ... bench:       3,093 ns/iter (+/- 253)
post parse_flag       ... bench:       2,002 ns/iter (+/- 49)
pre  parse_flag       ... bench:       2,052 ns/iter (+/- 80)
post parse_flag                          ... bench:       5,512 ns/iter (+/- 205)
pre  parse_flag                          ... bench:       5,596 ns/iter (+/- 186)
pre  parse_lots       ... bench:     382,522 ns/iter (+/- 12,048)
post parse_lots       ... bench:     390,230 ns/iter (+/- 5,028)
post parse_option     ... bench:       2,263 ns/iter (+/- 51)
pre  parse_option     ... bench:       2,275 ns/iter (+/- 80)
post parse_option                        ... bench:       5,800 ns/iter (+/- 243)
pre  parse_option                        ... bench:       5,850 ns/iter (+/- 790)
pre  parse_positional ... bench:       2,127 ns/iter (+/- 129)
post parse_positional ... bench:       2,128 ns/iter (+/- 137)
post parse_positional                    ... bench:       5,801 ns/iter (+/- 111)
pre  parse_positional                    ... bench:       5,855 ns/iter (+/- 498)
post parse_sc_clean                      ... bench:       6,358 ns/iter (+/- 821)
pre  parse_sc_clean                      ... bench:       6,476 ns/iter (+/- 191)
post parse_sc_complex                    ... bench:       8,012 ns/iter (+/- 439)
pre  parse_sc_complex                    ... bench:       8,259 ns/iter (+/- 262)
post parse_sc_flag                       ... bench:       7,087 ns/iter (+/- 231)
pre  parse_sc_flag                       ... bench:       7,185 ns/iter (+/- 165)
pre  parse_sc_option                     ... bench:       7,181 ns/iter (+/- 279)
post parse_sc_option                     ... bench:       7,393 ns/iter (+/- 420)
post parse_sc_positional                 ... bench:       6,756 ns/iter (+/- 228)
pre  parse_sc_positional                 ... bench:       6,985 ns/iter (+/- 150)
post parse_subcommands ... bench:      17,355 ns/iter (+/- 373)
pre  parse_subcommands ... bench:      17,510 ns/iter (+/- 3,767)

The significant changes are the build help tests you mentioned, plus parse_lots for ripgrep. Granted, the parse_lots diff is within the noise.

Merged Pre/Post Bench - Only significant

pre  build_help_long  ... bench:     191,834 ns/iter (+/- 11,202)
post build_help_long  ... bench:     226,031 ns/iter (+/- 4,502)
post build_help_short ... bench:     100,864 ns/iter (+/- 1,933)
pre  build_help_short ... bench:      91,475 ns/iter (+/- 1,159)
pre  parse_lots       ... bench:     382,522 ns/iter (+/- 12,048)
post parse_lots       ... bench:     390,230 ns/iter (+/- 5,028)

I'm OK with this diff though, because the build help should only be in the failure path or the "we're about to exit" path, and a slowdown of about 0.03 ms at most is pretty insignificant for displaying a help message 😉
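The pre/post comparison discussed above can also be done mechanically. The following is a minimal sketch, not something used in this PR: the names Bench, parse_bench_line, percent_change and exceeds_noise are hypothetical, and treating a change as significant when it exceeds the sum of both reported deviations is an assumption made for illustration only.

```rust
/// One parsed result line from `cargo bench` output.
#[derive(Debug)]
struct Bench {
    name: String,
    ns_per_iter: u64,
    noise: u64,
}

/// Parses a line such as
/// `test build_help_long  ... bench:     191,834 ns/iter (+/- 11,202)`.
/// Returns `None` for any line that is not a benchmark result.
fn parse_bench_line(line: &str) -> Option<Bench> {
    let rest = line.trim().strip_prefix("test ")?;
    let (name, rest) = rest.split_once("... bench:")?;
    let (ns, rest) = rest.split_once("ns/iter")?;
    let noise = rest.trim().strip_prefix("(+/-")?.strip_suffix(')')?;
    // Numbers are printed with thousands separators, so drop the commas.
    let to_u64 = |s: &str| s.trim().replace(',', "").parse::<u64>().ok();
    Some(Bench {
        name: name.trim().to_string(),
        ns_per_iter: to_u64(ns)?,
        noise: to_u64(noise)?,
    })
}

/// Relative change from `pre` to `post` in percent; negative means faster.
fn percent_change(pre: &Bench, post: &Bench) -> f64 {
    (post.ns_per_iter as f64 - pre.ns_per_iter as f64) / pre.ns_per_iter as f64 * 100.0
}

/// Crude significance check (an assumption of this sketch): the difference
/// must exceed the sum of both reported deviations.
fn exceeds_noise(pre: &Bench, post: &Bench) -> bool {
    pre.ns_per_iter.abs_diff(post.ns_per_iter) > pre.noise + post.noise
}

fn main() {
    let pre = parse_bench_line(
        "test build_help_long  ... bench:     191,834 ns/iter (+/- 11,202)",
    )
    .expect("benchmark line");
    let post = parse_bench_line(
        "test build_help_long  ... bench:     226,031 ns/iter (+/- 4,502)",
    )
    .expect("benchmark line");

    println!(
        "{}: {:+.1}% (exceeds noise: {})",
        pre.name,
        percent_change(&pre, &post),
        exceeds_noise(&pre, &post)
    );
}
```

Under this rule, build_help_long and build_help_short count as regressions, while the parse_lots difference stays within the reported deviations.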

@kbknapp kbknapp merged commit cd046a7 into clap-rs:master Feb 25, 2018
@kbknapp kbknapp mentioned this pull request Mar 4, 2018