
Macro parser performance improvements and refactoring #37701

Merged (10 commits) Nov 14, 2016

Conversation

@Mark-Simulacrum (Member) commented Nov 11, 2016

This PR locally increased performance of #37074 by ~6.6 minutes.

Follow up to #37569, but doesn't focus explicitly on expansion performance.

History is relatively clean, but I can/will do some more polishing if this is deemed mergeable. Partially posting this now so I can get Travis to run tests for me.

r? @jseyfried

@brson brson added the relnotes Marks issues that should be documented in the release notes of the next release. label Nov 11, 2016
@frewsxcv (Member) commented:

/build/src/libsyntax/ext/tt/macro_parser.rs:494: line longer than 100 chars

    match Rc::try_unwrap(nt) {
        Ok(token::NtTT(sub_tt)) => tt = sub_tt,
        Ok(nt) => {
            tt = TokenTree::Token(sp, token::Interpolated(Rc::new(nt)));
@jseyfried (Contributor) commented Nov 12, 2016:

I believe this recreates the Rc that we just unwrapped. Perhaps this could be avoided with something like:

            while let TokenTree::Token(sp, token::Interpolated(nt)) = tt {
                if let token::NtTT(..) = *nt {
                    match Rc::try_unwrap(nt) {
                        Ok(token::NtTT(sub_tt)) => tt = sub_tt,
                        Ok(_) => unreachable!(),
                        Err(nt_rc) => match *nt_rc {
                            token::NtTT(ref sub_tt) => tt = sub_tt.clone(),
                            _ => unreachable!(),
                        },
                    }
                } else {
                    tt = TokenTree::Token(sp, token::Interpolated(nt.clone()));
                    break
                }
            }
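The suggestion above hinges on the two outcomes of `Rc::try_unwrap`: it moves the value out without cloning when the `Rc` is the sole owner, and hands the `Rc` back otherwise. A minimal standalone sketch of that behavior (using `String` in place of the parser's `Nonterminal`, not the compiler's actual code):

```rust
use std::rc::Rc;

fn main() {
    // Sole owner: try_unwrap succeeds, moving the value out with no clone.
    let sole = Rc::new(String::from("tt"));
    assert_eq!(Rc::try_unwrap(sole), Ok(String::from("tt")));

    // A second owner is alive: try_unwrap fails and returns the Rc,
    // so the caller must fall back to cloning the inner value.
    let shared = Rc::new(String::from("tt"));
    let _other = Rc::clone(&shared);
    match Rc::try_unwrap(shared) {
        Ok(_) => unreachable!(),
        Err(rc) => assert_eq!(*rc, "tt"), // still shared; clone if ownership is needed
    }
}
```

This is why the proposed loop only rebuilds a `TokenTree::Token` in the `Err` path: in the common sole-owner case the inner token tree is moved out for free.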

    let sub = ei.matches[idx].clone();
    new_pos.matches[idx]
        .push(Rc::new(MatchedSeq(sub, mk_sp(ei.sp_lo, span.hi))));
    }
@Mark-Simulacrum (Member, Author) commented:
If there's a way to avoid doing this work here, it would likely bring significant wins. Callgrind reports ~1 trillion instructions (if I'm reading it right) spent on the `let sub = ei.matches[idx].clone();` line for the workload I'm profiling.

@jseyfried (Contributor) commented:

We might be able to avoid this work by refactoring the Rc out of the Vec in NamedMatch::MatchedSeq, that is, changing MatchedSeq(Vec<Rc<NamedMatch>>, syntax_pos::Span) to MatchedSeq(Rc<Vec<NamedMatch>>, syntax_pos::Span), perhaps with a RefCell between the Rc and the Vec.

Then, sub would be an Rc, so cloning it would be cheap, and Rc<NamedMatch> would never be needed since cloning a NamedMatch would be almost as cheap as cloning an Rc.
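To illustrate why that layout change makes the clone cheap: cloning a `Vec<Rc<T>>` touches every element's refcount, while cloning an `Rc<Vec<T>>` is a single refcount bump. A simplified sketch with plain `u32` elements (not the real `NamedMatch` type):

```rust
use std::rc::Rc;

fn main() {
    // Cloning a Vec<Rc<T>> is O(n): the vector's buffer is copied and
    // every element's refcount is incremented.
    let per_elem: Vec<Rc<u32>> = (0..1000).map(Rc::new).collect();
    let copy = per_elem.clone(); // 1000 refcount increments
    assert_eq!(copy.len(), 1000);

    // Cloning an Rc<Vec<T>> is O(1): one refcount bump shares the
    // whole vector without copying its contents.
    let whole: Rc<Vec<u32>> = Rc::new((0..1000).collect());
    let cheap = Rc::clone(&whole); // single refcount increment
    assert_eq!(cheap.len(), 1000);
}
```

The `RefCell` mentioned above would be needed because `Rc` alone only hands out shared references, so pushing into the shared vector would require interior mutability.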

@Mark-Simulacrum (Member, Author) commented Nov 12, 2016:

After realizing that I was benchmarking old code (from the make build system; I switched to rustbuild mid-PR), the latest perf gains are even higher: from ~890 seconds with rustc 1.14.0-nightly (cae6ab1c4 2016-11-05) to ~500 seconds with 2189f57 (as of now; that will change if/when this is rebased). This is a performance gain of roughly 6.5 minutes.

@jseyfried (Contributor) left a review:

Excellent! r=me unless you'd like to add more commits.

@Mark-Simulacrum (Member, Author) commented:
I'll try the optimization suggested by @jseyfried and clean up history a little.

@Mark-Simulacrum (Member, Author) commented:

Okay, so the suggested optimization doesn't work without more changes, since then we start sharing the vector instead of duplicating it, and we don't want to share it. It's something to look into for the future, but I'll leave it as out of scope for this PR since the changes are unlikely to be trivial.
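The sharing-vs-duplicating distinction above can be sketched with `Rc::make_mut`, which is one standard way to recover an independent copy from a shared `Rc` (a simplified illustration with `Vec<u32>`, not the parser's code):

```rust
use std::rc::Rc;

fn main() {
    let a: Rc<Vec<u32>> = Rc::new(vec![1, 2]);
    let mut b = Rc::clone(&a); // shared, not duplicated: both point at one vector

    // Mutating through a shared Rc requires clone-on-write:
    // make_mut copies the vector because `a` still holds a reference.
    Rc::make_mut(&mut b).push(3);

    assert_eq!(*a, vec![1, 2]);    // original untouched
    assert_eq!(*b, vec![1, 2, 3]); // b got its own copy
}
```

In the parser's case each match position needs its own independent history, so naive sharing would either corrupt sibling positions or reintroduce the copy via clone-on-write, which is presumably why the refactor was deferred.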

@jseyfried (Contributor) commented:
@bors r+

@bors commented Nov 13, 2016:

📌 Commit 2189f57 has been approved by jseyfried

@bors commented Nov 14, 2016:

⌛ Testing commit 2189f57 with merge 87b76a5...

bors added a commit that referenced this pull request Nov 14, 2016
@bors bors merged commit 2189f57 into rust-lang:master Nov 14, 2016
@Mark-Simulacrum Mark-Simulacrum deleted the macro-parser-impvement branch December 17, 2016 04:23