Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mark libserialize functions as inline #53393

Merged
merged 2 commits into from
Aug 21, 2018

Conversation

BurntPizza
Copy link
Contributor

Got to thinking: "what if that big pile of tiny functions isn't inlining as it should?"
So a few replace-regex later the local perf run says this:

Not huge, but still a win, which is interesting. Want to verify with the real perf run, but I understand there's a backlog.

I didn't notice any increase in compile time or binary sizes for rustc/libs.

@rust-highfive
Copy link
Collaborator

r? @dtolnay

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 15, 2018
@dtolnay
Copy link
Member

dtolnay commented Aug 15, 2018

Nice find. Looks good to me if the benchmarks are corroborated.

@bors try
@Mark-Simulacrum @kennytm

@bors
Copy link
Contributor

bors commented Aug 15, 2018

⌛ Trying commit e59ea2fc56ef72b3ac332ccd28310d0662a186c0 with merge e895c391336eabdd2ec4d9f4c94a45c94f6902b0...

fn decode<D: Decoder>(d: &mut D) -> Result<Box<[T]>, D::Error> {
let v: Vec<T> = Decodable::decode(d)?;
Ok(v.into_boxed_slice())
}
}

impl<T:Encodable> Encodable for Rc<T> {
#[inline]
#[inline]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clearly this isn't my fault, as nobody made a lint for it!

#[inline]
fn encode<S: Encoder>(&self, s: &mut S) -> Result<(), S::Error> {
(**self).encode(s)
}
}

impl<T:Decodable> Decodable for Rc<T> {
#[inline]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:O

@BurntPizza
Copy link
Contributor Author

Rebased.

Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two #[inline]s.

@BurntPizza
Copy link
Contributor Author

@Mark-Simulacrum Could I get a perf run? Or are we waiting on making a dent in the queue?

@@ -20,6 +20,7 @@ use std::sync::Arc;
impl<
T: Encodable
> Encodable for LinkedList<T> {
#[inline]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought #[inline] was unnecessary on generic methods?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, ever since multiple codegen units, it can matter. I forget.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the discussion I remembered: #52704

@@ -193,6 +194,7 @@ impl<'a> Decoder<'a> {
self.position += bytes;
}

#[inline]
pub fn read_raw_bytes(&mut self, s: &mut [u8]) -> Result<(), String> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The #[inline]s on things like this and write_signed_leb128 seem definitely good, though 👍

@Mark-Simulacrum
Copy link
Member

@rust-timer build e895c391336eabdd2ec4d9f4c94a45c94f6902b0

@rust-timer
Copy link
Collaborator

Success: Queued e895c391336eabdd2ec4d9f4c94a45c94f6902b0 with parent 5db71db, comparison URL.

@BurntPizza
Copy link
Contributor Author

Looks like it's ready for r+

@dtolnay
Copy link
Member

dtolnay commented Aug 17, 2018

Thanks @scottmcm, I hadn't seen that thread. @alexcrichton can you tell whether any of the downsides of #[inline] would apply to these functions in libserialize? These are only intended to be called by the compiler (unlike functions in the standard library) and the perf comparison looks very good to me.

r? @alexcrichton

@BurntPizza
Copy link
Contributor Author

BurntPizza commented Aug 17, 2018

@dtolnay
There's probably a subset of the #[inline]s that were added that are responsible for the perf increase, and there may be a subset that is detrimental, but finding out what they are would be quite tedious, so that's why I just marked everything and tried it.

@alexcrichton
Copy link
Member

Ah yeah I explained in a bit more detail here, but none of the #[inline] on generics should be necessary but the tags pointed out by @scottmcm are likely responsible for the speedup! Could those be left and the others removed?

@dtolnay
Copy link
Member

dtolnay commented Aug 17, 2018

Thanks, we'll remove all the ones on generic functions.

r? @dtolnay

@BurntPizza
Copy link
Contributor Author

Done. I'd like to check perf again, just to be sure.

@kennytm
Copy link
Member

kennytm commented Aug 17, 2018

@bors try

@kennytm kennytm added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 17, 2018
@bors
Copy link
Contributor

bors commented Aug 17, 2018

⌛ Trying commit 1540e8c with merge 4c76652fddb127b9ed67aa60a665ab1d37f0e97f...

@bors
Copy link
Contributor

bors commented Aug 17, 2018

☀️ Test successful - status-travis
State: approved= try=True

@kennytm
Copy link
Member

kennytm commented Aug 17, 2018

@rust-timer build 4c76652fddb127b9ed67aa60a665ab1d37f0e97f

@rust-timer
Copy link
Collaborator

Success: Queued 4c76652fddb127b9ed67aa60a665ab1d37f0e97f with parent f34933b, comparison URL.

@BurntPizza
Copy link
Contributor Author

BurntPizza commented Aug 17, 2018

Ah-ha! Interesting. Looks too consistently worse to e noise.
cc @alexcrichton What do you make of this? Did I remove some inlines that weren't generic? Maybe there's a interaction with specialization (some of the inlines removed were on specialized functions)?
EDIT: or LLVM just actually needs the hints here.

@kennytm kennytm removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 17, 2018
@dtolnay dtolnay assigned alexcrichton and unassigned dtolnay Aug 17, 2018
@alexcrichton
Copy link
Member

@bors: r+

Ok I'm gonna go ahead and approve this because the functions tagged for inlining here are "obvious candidates for inlining". While it's true that not all the performance was recovered from inlining everything I think this is a good baseline to start from. I would personally prefer further analysis to figure out which functions actually need to be inlined instead of inlining all of them, which may be possible through diffing perf runs with everything inlined and without everything inlined (hottest functions at the top are likely the candidates to be inlined).

I suspect you're right in that LLVM just actually needs an extra hint for some of the functions here to inline them.

@bors
Copy link
Contributor

bors commented Aug 20, 2018

📌 Commit 1540e8c has been approved by alexcrichton

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 20, 2018
kennytm added a commit to kennytm/rust that referenced this pull request Aug 21, 2018
…xcrichton

Mark libserialize functions as inline

Got to thinking: "what if that big pile of tiny functions isn't inlining as it should?"
So a few `replace-regex` later the local perf run says this:
<details>

![](https://i.imgur.com/gvdJEgG.png)
</details>
Not huge, but still a win, which is interesting. Want to verify with the real perf run, but I understand there's a backlog.

I didn't notice any increase in compile time or binary sizes for rustc/libs.
bors added a commit that referenced this pull request Aug 21, 2018
Rollup of 17 pull requests

Successful merges:

 - #53030 (Updated RELEASES.md for 1.29.0)
 - #53104 (expand the documentation on the `Unpin` trait)
 - #53213 (Stabilize IP associated constants)
 - #53296 (When closure with no arguments was expected, suggest wrapping)
 - #53329 (Replace usages of ptr::offset with ptr::{add,sub}.)
 - #53363 (add individual docs to `core::num::NonZero*`)
 - #53370 (Stabilize macro_vis_matcher)
 - #53393 (Mark libserialize functions as inline)
 - #53405 (restore the page title after escaping out of a search)
 - #53452 (Change target triple used to check for lldb in build-manifest)
 - #53462 (Document Box::into_raw returns non-null ptr)
 - #53465 (Remove LinkMeta struct)
 - #53492 (update lld submodule to include RISCV patch)
 - #53496 (Fix typos found by codespell.)
 - #53521 (syntax: Optimize some literal parsing)
 - #53540 (Moved issue-53157.rs into src/test/ui/consts/const-eval/)
 - #53551 (Avoid some Place clones.)

Failed merges:

r? @ghost
@bors bors merged commit 1540e8c into rust-lang:master Aug 21, 2018
@bors
Copy link
Contributor

bors commented Aug 21, 2018

⌛ Testing commit 1540e8c with merge 9f9f2c0...

@BurntPizza BurntPizza deleted the serialize-inlines branch August 24, 2018 00:51
@kennytm
Copy link
Member

kennytm commented Aug 24, 2018

@bors retry r-

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 24, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants