unicode_normalization benchmark from rustc-perf is slow #57718

Closed
nnethercote opened this issue Jan 18, 2019 · 3 comments
Labels
I-compiletime Issue: Problems and improvements with respect to compile times.

Comments

@nnethercote
Contributor

The unicode_normalization benchmark was just added to rustc-perf: rust-lang/rustc-perf#328

Here is the high-level data from Cachegrind for a Clean-Check run.

--------------------------------------------------------------------------------
Ir
--------------------------------------------------------------------------------
45,053,275,971 (100.0%)  PROGRAM TOTALS

--------------------------------------------------------------------------------
Ir                      file:function
--------------------------------------------------------------------------------
3,743,278,359 ( 8.31%)  /home/njn/moz/rust0/src/librustc/infer/lexical_region_resolve/mod.rs:rustc::infer::lexical_region_resolve::LexicalResolver::expand_node
3,027,925,390 ( 6.72%)  /home/njn/moz/rust0/src/librustc/middle/region.rs:<rustc::ty::sty::RegionKind as core::cmp::PartialEq>::eq
2,217,916,340 ( 4.92%)  /home/njn/moz/rust0/src/librustc/ty/sty.rs:<rustc::ty::sty::RegionKind as core::cmp::PartialEq>::eq
2,165,108,405 ( 4.81%)  /home/njn/moz/rust0/src/librustc/ty/query/plumbing.rs:rustc::ty::query::plumbing::<impl rustc::ty::context::TyCtxt<'a, 'gcx, 'tcx>>::try_get_with
1,872,071,197 ( 4.16%)  /home/njn/moz/rust0/src/librustc/infer/lexical_region_resolve/mod.rs:rustc::infer::lexical_region_resolve::LexicalResolver::infer_variable_values
1,782,556,352 ( 3.96%)  /home/njn/.cargo/registry/src/github.com-1ecc6299db9ec823/smallvec-0.6.7/lib.rs:rustc::infer::lexical_region_resolve::LexicalResolver::infer_variable_values
1,678,166,351 ( 3.72%)  /home/njn/moz/rust0/src/libstd/collections/hash/map.rs:rustc::ty::query::plumbing::<impl rustc::ty::context::TyCtxt<'a, 'gcx, 'tcx>>::try_get_with
1,652,330,349 ( 3.67%)  /home/njn/moz/rust0/src/libstd/collections/hash/table.rs:rustc::ty::query::plumbing::<impl rustc::ty::context::TyCtxt<'a, 'gcx, 'tcx>>::try_get_with
1,491,568,194 ( 3.31%)  /home/njn/moz/rust0/src/librustc_mir/hair/pattern/_match.rs:rustc_mir::hair::pattern::_match::IntRange::from_ctor
1,142,511,763 ( 2.54%)  /home/njn/moz/rust0/src/librustc/ty/layout.rs:<rustc::ty::layout::LayoutCx<'tcx, rustc::ty::context::TyCtxt<'a, 'tcx, 'tcx>> as rustc_target::abi::LayoutOf>::layout_of
  983,408,925 ( 2.18%)  /home/njn/moz/rust0/src/libcore/num/mod.rs:rustc::ty::query::plumbing::<impl rustc::ty::context::TyCtxt<'a, 'gcx, 'tcx>>::try_get_with
  925,902,672 ( 2.06%)  /home/njn/moz/rust0/src/libcore/option.rs:rustc::ty::sty::Const::assert_bits
  836,542,801 ( 1.86%)  /home/njn/moz/rust0/src/libcore/ptr.rs:rustc::ty::query::plumbing::<impl rustc::ty::context::TyCtxt<'a, 'gcx, 'tcx>>::try_get_with
  738,053,639 ( 1.64%)  /home/njn/moz/rust0/src/librustc_mir/hair/pattern/_match.rs:rustc_mir::hair::pattern::_match::constructor_intersects_pattern
  698,283,902 ( 1.55%)  /home/njn/moz/rust0/src/librustc_mir/hair/pattern/_match.rs:rustc_mir::hair::pattern::_match::specialize
  623,416,500 ( 1.38%)  /home/njn/moz/rust0/src/libcore/cmp.rs:rustc::infer::lexical_region_resolve::LexicalResolver::expand_node
  605,680,300 ( 1.34%)  /home/njn/moz/rust0/<::rustc_data_structures::indexed_vec::newtype_index macros>:<rustc::ty::sty::RegionKind as core::cmp::PartialEq>::eq
  590,667,449 ( 1.31%)  /home/njn/moz/rust0/src/librustc/ty/context.rs:<&'a rustc::ty::TyS<'a> as rustc::ty::context::Lift<'tcx>>::lift_to_tcx
  588,790,903 ( 1.31%)  /home/njn/moz/rust0/src/librustc_mir/hair/pattern/_match.rs:rustc_mir::hair::pattern::_match::IntRange::from_pat
  566,270,290 ( 1.26%)  /home/njn/moz/rust0/src/librustc/ty/sty.rs:rustc::ty::sty::Const::assert_bits
  535,797,872 ( 1.19%)  /home/njn/moz/rust0/src/libcore/slice/mod.rs:rustc::infer::lexical_region_resolve::LexicalResolver::infer_variable_values
  531,048,306 ( 1.18%)  /home/njn/moz/rust0/src/librustc/ty/mod.rs:rustc::ty::ParamEnv::and
  504,837,733 ( 1.12%)  /home/njn/moz/rust0/src/librustc/ty/layout.rs:<rustc::ty::layout::LayoutCx<'tcx, rustc::ty::context::TyCtxt<'a, 'tcx, 'tcx>>>::record_layout_for_printing
  491,555,270 ( 1.09%)  /home/njn/moz/rust0/src/librustc/mir/interpret/value.rs:<rustc::mir::interpret::value::Scalar<Tag>>::to_bits
  478,260,738 ( 1.06%)  /home/njn/moz/rust0/src/librustc/ty/query/plumbing.rs:<rustc::ty::layout::LayoutCx<'tcx, rustc::ty::context::TyCtxt<'a, 'tcx, 'tcx>> as rustc_target::abi::LayoutOf>::layout_of

The first three entries show that expand_node is super-hot:

fn expand_node(
    &self,
    a_region: Region<'tcx>,
    b_vid: RegionVid,
    b_data: &mut VarValue<'tcx>,
) -> bool {
    debug!("expand_node({:?}, {:?} == {:?})", a_region, b_vid, b_data);
    match *a_region {
        // Check if this relationship is implied by a given.
        ty::ReEarlyBound(_) | ty::ReFree(_) => if self.data.givens.contains(&(a_region, b_vid))
        {
            debug!("given");
            return false;
        },
        _ => {}
    }
    match *b_data {
        VarValue::Value(cur_region) => {
            let mut lub = self.lub_concrete_regions(a_region, cur_region);
            if lub == cur_region {
                return false;
            }
            // Watch out for `'b: !1` relationships, where the
            // universe of `'b` can't name the placeholder `!1`. In
            // that case, we have to grow `'b` to be `'static` for the
            // relationship to hold. This is obviously a kind of sub-optimal
            // choice -- in the future, when we incorporate a knowledge
            // of the parameter environment, we might be able to find a
            // tighter bound than `'static`.
            //
            // (This might e.g. arise from being asked to prove `for<'a> { 'b: 'a }`.)
            let b_universe = self.var_infos[b_vid].universe;
            if let ty::RePlaceholder(p) = lub {
                if b_universe.cannot_name(p.universe) {
                    lub = self.tcx().types.re_static;
                }
            }
            debug!(
                "Expanding value of {:?} from {:?} to {:?}",
                b_vid, cur_region, lub
            );
            *b_data = VarValue::Value(lub);
            return true;
        }
        VarValue::ErrorValue => {
            return false;
        }
    }
}

Most of the calls take the early return false that follows the lub == cur_region check (line 241 of the source file), i.e. the comparison succeeds and the value is left unchanged. That explains why <rustc::ty::sty::RegionKind as core::cmp::PartialEq>::eq shows up as hot.

expand_node is called from a closure within expansion:

fn expansion(&self, var_values: &mut LexicalRegionResolutions<'tcx>) {
    self.iterate_until_fixed_point("Expansion", |constraint, origin| {
        debug!("expansion: constraint={:?} origin={:?}", constraint, origin);
        match *constraint {
            Constraint::RegSubVar(a_region, b_vid) => {
                let b_data = var_values.value_mut(b_vid);
                (self.expand_node(a_region, b_vid, b_data), false)
            }
            Constraint::VarSubVar(a_vid, b_vid) => match *var_values.value(a_vid) {
                VarValue::ErrorValue => (false, false),
                VarValue::Value(a_region) => {
                    let b_node = var_values.value_mut(b_vid);
                    let changed = self.expand_node(a_region, b_vid, b_node);
                    let retain = match *b_node {
                        VarValue::Value(ReStatic) | VarValue::ErrorValue => false,
                        _ => true
                    };
                    (changed, retain)
                }
            },
            Constraint::RegSubReg(..) | Constraint::VarSubReg(..) => {
                // These constraints are checked after expansion
                // is done, in `collect_errors`.
                (false, false)
            }
        }
    })
}

That closure is passed to iterate_until_fixed_point:

fn iterate_until_fixed_point<F>(&self, tag: &str, mut body: F)
where
    F: FnMut(&Constraint<'tcx>, &SubregionOrigin<'tcx>) -> (bool, bool),
{
    let mut constraints: SmallVec<[_; 16]> = self.data.constraints.iter().collect();
    let mut iteration = 0;
    let mut changed = true;
    while changed {
        changed = false;
        iteration += 1;
        debug!("---- {} Iteration {}{}", "#", tag, iteration);
        constraints.retain(|(constraint, origin)| {
            let (edge_changed, retain) = body(constraint, origin);
            if edge_changed {
                debug!("Updated due to constraint {:?}", constraint);
                changed = true;
            }
            retain
        });
    }
    debug!("---- {} Complete after {} iteration(s)", tag, iteration);
}
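
To see how a loop of this shape can become expensive, here is a minimal toy model, not rustc code. It keeps the same retain-based control flow but uses hypothetical stand-ins: a `VarSubVar` struct with plain indices, `bool` values where `true` plays the role of a top region such as 'static, and `Vec` in place of `SmallVec`. The chain-shaped input is an assumption chosen to show order dependence; whether the real constraints have this shape is not established here.

```rust
/// Hypothetical stand-in for a `VarSubVar(a, b)` constraint between two variables.
#[derive(Clone, Copy)]
struct VarSubVar {
    a: usize,
    b: usize,
}

/// Same control flow as the loop above, with `Vec` in place of `SmallVec`.
/// Returns (iterations, total calls to `body`).
fn iterate_until_fixed_point<F>(all: &[VarSubVar], mut body: F) -> (usize, usize)
where
    F: FnMut(&VarSubVar) -> (bool, bool),
{
    let mut constraints: Vec<&VarSubVar> = all.iter().collect();
    let (mut iterations, mut calls) = (0, 0);
    let mut changed = true;
    while changed {
        changed = false;
        iterations += 1;
        constraints.retain(|&c| {
            calls += 1;
            let (edge_changed, retain) = body(c);
            if edge_changed {
                changed = true;
            }
            retain
        });
    }
    (iterations, calls)
}

/// Toy run: a chain 0 -> 1 -> ... -> n where only variable 0 starts at the
/// top value (`true`). `reversed` stores the constraints back to front.
fn chain_cost(n: usize, reversed: bool) -> (usize, usize) {
    let mut values = vec![false; n + 1];
    values[0] = true;
    let mut constraints: Vec<VarSubVar> =
        (0..n).map(|i| VarSubVar { a: i, b: i + 1 }).collect();
    if reversed {
        constraints.reverse();
    }
    iterate_until_fixed_point(&constraints, |c| {
        // In this two-point model the least upper bound is a logical OR.
        let lub = values[c.a] || values[c.b];
        let changed = lub != values[c.b];
        values[c.b] = lub;
        // Retire the constraint once `b` has reached the top value.
        (changed, !lub)
    })
}

fn main() {
    println!("forward:  {:?}", chain_cost(1000, false));
    println!("reversed: {:?}", chain_cost(1000, true));
}
```

In this model the forward ordering finishes in 2 iterations and 1,000 calls, while the reversed ordering needs 1,001 iterations and 500,500 calls, because each pass retires only one constraint. In both cases the list is empty at the end.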

Most of the calls to iterate_until_fixed_point have a tiny number of constraints and iterations. But for unicode_normalization there are two exceptional calls.

The first exceptional call has 30,902 constraints (20,598 VarSubVar ones followed by 10,304 RegSubVar ones), coming from this match:
https://github.com/nnethercote/rustc-perf/blob/47adadaf3541db9382d5fb3cba5ca7c75064219f/collector/benchmarks/unicode_normalization/src/tables.rs#L1788-L3853

It takes 2,064 iterations to process; by the end the constraints list is empty.

The second exceptional call has 55,172 constraints (36,778 VarSubVar ones followed by 18,394 RegSubVar ones), coming from this match:

https://github.com/nnethercote/rustc-perf/blob/47adadaf3541db9382d5fb3cba5ca7c75064219f/collector/benchmarks/unicode_normalization/src/tables.rs#L3855-L7538

It takes 3,682 iterations to process; again, by the end the constraints list is empty.
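
For a sense of scale, the figures above bound the number of closure invocations. The sketch below is back-of-the-envelope arithmetic only; the function names are made up for illustration. The loose bound assumes every constraint is visited on every iteration. The tighter one uses the fact that the closure in expansion always returns retain = false for RegSubVar, so those constraints are visited once. Both are upper bounds, not measurements, since VarSubVar constraints also drop out along the way.

```rust
/// Loose bound: every constraint visited on every iteration.
fn upper_bound_calls(constraints: u64, iterations: u64) -> u64 {
    constraints * iterations
}

/// Tighter bound: RegSubVar constraints are never retained, so each is
/// visited once; VarSubVar constraints may survive every iteration.
fn tighter_bound_calls(var_sub_var: u64, reg_sub_var: u64, iterations: u64) -> u64 {
    reg_sub_var + var_sub_var * iterations
}

fn main() {
    // (VarSubVar count, RegSubVar count, iterations) as reported above.
    for &(vsv, rsv, iters) in &[(20_598u64, 10_304u64, 2_064u64), (36_778, 18_394, 3_682)] {
        println!(
            "{} constraints, {} iterations: at most {} calls (loose), {} (tighter)",
            vsv + rsv,
            iters,
            upper_bound_calls(vsv + rsv, iters),
            tighter_bound_calls(vsv, rsv, iters),
        );
    }
}
```

That gives at most 63,781,728 calls (42,524,576 with the tighter bound) for the first match and 203,143,304 (135,434,990) for the second.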

cc @rust-lang/wg-compiler-performance

@nnethercote
Contributor Author

#57719 improves things by up to 4% by inlining expand_node. But that's just tweaking around the edges. An algorithmic improvement will be necessary to really improve things.
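
As an illustration of what an algorithmic change could look like, here is a generic worklist-style fixed point on a toy model. This is not what #57719 does and not a proposal taken from this thread. It is a sketch under stated assumptions: `VarSubVar` is a hypothetical struct with plain indices, and values are `bool` with `true` standing in for a top region. Instead of rescanning every constraint until nothing changes, it revisits only the constraints whose source variable just changed.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a `VarSubVar(a, b)` constraint.
#[derive(Clone, Copy)]
struct VarSubVar {
    a: usize,
    b: usize,
}

/// Propagates `true` along `a -> b` edges using a worklist.
/// Returns the number of constraint visits.
fn propagate_with_worklist(values: &mut [bool], constraints: &[VarSubVar]) -> usize {
    // outgoing[v] holds the indices of constraints whose source is `v`.
    let mut outgoing = vec![Vec::new(); values.len()];
    for (i, c) in constraints.iter().enumerate() {
        outgoing[c.a].push(i);
    }
    // In this two-point model only variables already at `true` can change
    // anything, so they seed the worklist.
    let mut worklist: VecDeque<usize> = (0..values.len()).filter(|&v| values[v]).collect();
    let mut visits = 0;
    while let Some(v) = worklist.pop_front() {
        for &i in &outgoing[v] {
            visits += 1;
            let b = constraints[i].b;
            if !values[b] {
                values[b] = true;
                worklist.push_back(b);
            }
        }
    }
    visits
}

fn main() {
    // A chain 0 -> 1 -> ... -> n with the constraints stored back to front.
    let n = 1000;
    let constraints: Vec<VarSubVar> =
        (0..n).rev().map(|i| VarSubVar { a: i, b: i + 1 }).collect();
    let mut values = vec![false; n + 1];
    values[0] = true;
    let visits = propagate_with_worklist(&mut values, &constraints);
    println!("visits: {}", visits);
}
```

On this chain each constraint is visited exactly once regardless of storage order. Real region values form a richer lattice than a single bit, so any actual change to the resolver would need more care than this.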

@Centril added the I-compiletime label Jan 18, 2019

@jens1o
Contributor

jens1o commented Jan 26, 2019

ref #55528

@nnethercote
Contributor Author

No point having this open as well as #55528.
