interpret: refactor projection handling code #99101

Merged
merged 2 commits into from
Jul 13, 2022

Conversation

RalfJung
Member

@RalfJung RalfJung commented Jul 9, 2022

Moves our projection handling code into a common file, and avoids the use of a
general mplace-based fallback function by having more specialized implementations.

mplace_index (and the other slice-related functions) could be more efficient by
copy-pasting the body of operand_index. Or we could do some trait magic to share
the code between them. But for now this is probably fine.
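As a rough illustration of the "trait magic" option, the sketch below writes one indexing routine against a small trait and reuses it for two place-like types. Everything in it (`Projectable`, `Op`, `MPlace`, `project_index`) is a hypothetical, heavily simplified stand-in and not the interpreter's actual API.

```rust
/// Hypothetical trait: anything that can be projected into by a byte offset.
trait Projectable: Sized {
    /// Total size in bytes of the value being projected into.
    fn size(&self) -> u64;
    /// Projection `bytes` into this value, or `None` if out of bounds.
    fn offset(&self, bytes: u64) -> Option<Self>;
}

/// Stand-in for an operand that may live outside memory (assumption).
#[derive(Debug, Clone, PartialEq)]
struct Op {
    start: u64,
    size: u64,
}

/// Stand-in for a place that is known to be in memory (assumption).
#[derive(Debug, Clone, PartialEq)]
struct MPlace {
    addr: u64,
    size: u64,
}

impl Projectable for Op {
    fn size(&self) -> u64 {
        self.size
    }
    fn offset(&self, bytes: u64) -> Option<Self> {
        (bytes <= self.size).then(|| Op { start: self.start + bytes, size: self.size - bytes })
    }
}

impl Projectable for MPlace {
    fn size(&self) -> u64 {
        self.size
    }
    fn offset(&self, bytes: u64) -> Option<Self> {
        (bytes <= self.size).then(|| MPlace { addr: self.addr + bytes, size: self.size - bytes })
    }
}

/// Shared indexing logic, written once and monomorphized per type,
/// so no runtime "is this an mplace?" check is involved.
fn project_index<P: Projectable>(base: &P, index: u64, elem_size: u64) -> Option<P> {
    let bytes = index.checked_mul(elem_size)?;
    // The whole element has to fit inside the base value.
    if bytes.checked_add(elem_size)? > base.size() {
        return None;
    }
    base.offset(bytes)
}
```

With this shape, `project_index(&op, 2, 4)` and `project_index(&mplace, 2, 4)` compile to separate specialized functions from a single source definition.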

This is the common part of #99013 and #99097. I am seeing some strange perf results so this probably should be its own change so we know which diff caused which perf changes...

r? @oli-obk

@rustbot
Collaborator

rustbot commented Jul 9, 2022

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jul 9, 2022
@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 9, 2022
@RalfJung
Member Author

First testing the version that I think is the faster of the two, since it avoids introducing new assert_mem_place.
If this is still too slow, my guess is that the loop in operand_array_fields is slightly slower than the old one in mplace_array_fields, and I have some ideas for that.

@bors try @rust-timer queue

@rust-timer
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 10, 2022
@bors
Contributor

bors commented Jul 10, 2022

⌛ Trying commit af4c939cd1568eb13c4d1581664fd8840a90eaaf with merge b9fe2470ecb8d858b78e60c1ab3143012cdef548...

@bors
Contributor

bors commented Jul 10, 2022

☀️ Try build successful - checks-actions
Build commit: b9fe2470ecb8d858b78e60c1ab3143012cdef548 (b9fe2470ecb8d858b78e60c1ab3143012cdef548)

@rust-timer
Collaborator

Queued b9fe2470ecb8d858b78e60c1ab3143012cdef548 with parent 6dba4ed, future comparison URL.

@rust-timer
Collaborator

Finished benchmarking commit (b9fe2470ecb8d858b78e60c1ab3143012cdef548): comparison url.

Instruction count

  • Primary benchmarks: no relevant changes found
  • Secondary benchmarks: 😿 relevant regressions found
| | mean [1] | max | count [2] |
|:---|:---:|:---:|:---:|
| Regressions 😿 (primary) | N/A | N/A | 0 |
| Regressions 😿 (secondary) | 5.1% | 6.9% | 8 |
| Improvements 🎉 (primary) | N/A | N/A | 0 |
| Improvements 🎉 (secondary) | N/A | N/A | 0 |
| All 😿🎉 (primary) | N/A | N/A | 0 |

Max RSS (memory usage)

Results
  • Primary benchmarks: 🎉 relevant improvement found
  • Secondary benchmarks: no relevant changes found
| | mean [1] | max | count [2] |
|:---|:---:|:---:|:---:|
| Regressions 😿 (primary) | N/A | N/A | 0 |
| Regressions 😿 (secondary) | N/A | N/A | 0 |
| Improvements 🎉 (primary) | -5.0% | -5.0% | 1 |
| Improvements 🎉 (secondary) | N/A | N/A | 0 |
| All 😿🎉 (primary) | -5.0% | -5.0% | 1 |

Cycles

Results
  • Primary benchmarks: no relevant changes found
  • Secondary benchmarks: 😿 relevant regressions found
| | mean [1] | max | count [2] |
|:---|:---:|:---:|:---:|
| Regressions 😿 (primary) | N/A | N/A | 0 |
| Regressions 😿 (secondary) | 8.3% | 10.5% | 6 |
| Improvements 🎉 (primary) | N/A | N/A | 0 |
| Improvements 🎉 (secondary) | N/A | N/A | 0 |
| All 😿🎉 (primary) | N/A | N/A | 0 |

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

Footnotes

  1. the arithmetic mean of the percent change

  2. number of relevant changes

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jul 10, 2022
@RalfJung
Member Author

> I have some ideas for that.

Ah, of course that doesn't work since impl Iterator needs to always be the same type. Even if we Either this, that will just re-introduce the runtime check that I was trying to avoid (replacing the try_as_mplace we currently have in OpTy::offset).

I think I really need some valgrind profiles here to even know if operand_array_fields truly is the problem. It seems hard to believe.
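The `impl Iterator` limitation mentioned above can be shown with a small sketch. A function returning `impl Iterator` must produce one concrete type on every path; a two-variant wrapper makes differing branches compile, but its `next` then matches on the variant for every element, which is the runtime check in question. `EitherIter` and `field_offsets` are hypothetical names, and the iterators are toy stand-ins for the real field iterators.

```rust
/// Hypothetical two-variant wrapper, similar in spirit to `either::Either`.
enum EitherIter<A, B> {
    Left(A),
    Right(B),
}

impl<A, B, T> Iterator for EitherIter<A, B>
where
    A: Iterator<Item = T>,
    B: Iterator<Item = T>,
{
    type Item = T;

    fn next(&mut self) -> Option<T> {
        // This match runs on every call: the per-element runtime check.
        match self {
            EitherIter::Left(a) => a.next(),
            EitherIter::Right(b) => b.next(),
        }
    }
}

/// Returning the two branches unwrapped would not compile, because
/// `Map<Range<u64>, _>` and `Range<u64>` are different concrete types.
fn field_offsets(in_memory: bool, len: u64, stride: u64) -> impl Iterator<Item = u64> {
    if in_memory {
        EitherIter::Left((0..len).map(move |i| i * stride))
    } else {
        EitherIter::Right(0..len)
    }
}
```

For example, `field_offsets(true, 3, 8)` yields byte offsets at multiples of the stride, while `field_offsets(false, 3, 8)` yields plain indices; both go through the same `match` on each `next`.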

@oli-obk
Contributor

oli-obk commented Jul 11, 2022

valgrind diff

-2,582,601,110  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::run
 1,416,740,602  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::place_projection
   780,827,916  ???:<core::iter::adapters::copied::Copied<core::slice::iter::Iter<rustc_middle::mir::syntax::ProjectionElem<rustc_middle::mir::Local, rustc_middle::ty::Ty>>> as core::iter::traits::iterator::Iterator>::try_fold::<rustc_const_eval::interpret::place::PlaceTy, <rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::eval_place::{closure
   517,636,783  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::operand_projection
   422,177,426  ???:<rustc_middle::ty::context::TyCtxt>::try_subst_and_normalize_erasing_regions::<rustc_middle::mir::ConstantKind>
  -234,321,940  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::eval_place
   231,210,917  ???:<hashbrown::map::RawEntryBuilder<rustc_middle::ty::ParamEnvAnd<rustc_middle::mir::interpret::GlobalId>, (core::result::Result<rustc_middle::mir::interpret::value::ConstAlloc, rustc_middle::mir::interpret::error::ErrorHandled>, rustc_query_system::dep_graph::graph::DepNodeIndex), core::hash::BuildHasherDefault<rustc_hash::FxHasher>>>::from_key_hashed_nocheck::<rustc_middle::ty::ParamEnvAnd<rustc_middle::mir::interpret::GlobalId>>
   193,633,898  ???:<core::iter::adapters::copied::Copied<core::slice::iter::Iter<rustc_middle::mir::syntax::ProjectionElem<rustc_middle::mir::Local, rustc_middle::ty::Ty>>> as core::iter::traits::iterator::Iterator>::try_fold::<rustc_const_eval::interpret::operand::OpTy, <rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::eval_place_to_op::{closure
   193,462,272  ???:<rustc_middle::ty::context::TyCtxt>::def_kind::<rustc_span::def_id::DefId>
    82,575,360  ???:<rustc_middle::ty::ParamEnvAnd<rustc_middle::mir::interpret::GlobalId> as core::hash::Hash>::hash::<rustc_hash::FxHasher>
    77,440,963  ???:core::iter::adapters::try_process::<core::iter::adapters::map::Map<core::slice::iter::Iter<rustc_middle::mir::syntax::Operand>, <rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::eval_operands::{closure
    68,975,205  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::eval_fn_call
    56,819,838  ???:<rustc_middle::mir::interpret::value::Scalar>::to_bool
    52,187,019  ???:<rustc_middle::ty::context::TyCtxt>::normalize_erasing_late_bound_regions::<rustc_middle::ty::sty::FnSig>
   -50,331,648  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::mplace_projection
    39,303,720  ???:<rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::copy_op_no_validate
    35,599,183  ???:<hashbrown::map::RawEntryBuilder<rustc_middle::ty::ParamEnvAnd<(rustc_middle::ty::instance::Instance, &rustc_middle::ty::list::List<rustc_middle::ty::Ty>)>, (core::result::Result<&rustc_target::abi::call::FnAbi<rustc_middle::ty::Ty>, rustc_middle::ty::layout::FnAbiError>, rustc_query_system::dep_graph::graph::DepNodeIndex), core::hash::BuildHasherDefault<rustc_hash::FxHasher>>>::from_key_hashed_nocheck::<rustc_middle::ty::ParamEnvAnd<(rustc_middle::ty::instance::Instance, &rustc_middle::ty::list::List<rustc_middle::ty::Ty>)>>
     9,909,729  ???:<rustc_middle::ty::ParamEnvAnd<(rustc_middle::ty::instance::Instance, &rustc_middle::ty::list::List<rustc_middle::ty::Ty>)> as core::hash::Hash>::hash::<rustc_hash::FxHasher>
     8,441,621  ???:<rustc_const_eval::const_eval::machine::CompileTimeInterpreter as rustc_const_eval::interpret::machine::Machine>::after_stack_pop
     6,606,486  ???:<rustc_middle::ty::context::TyCtxt>::mk_type_list::<core::iter::adapters::map::Map<core::slice::iter::Iter<rustc_const_eval::interpret::operand::OpTy>, <rustc_const_eval::interpret::eval_context::InterpCx<rustc_const_eval::const_eval::machine::CompileTimeInterpreter>>::eval_terminator::{closure
     2,657,099  ???:<rustc_middle::ty::context::TyCtxt>::intern_type_list
    -2,359,340  ???:<rustc_middle::ty::normalize_erasing_regions::TryNormalizeAfterErasingRegionsFolder as rustc_middle::ty::fold::FallibleTypeFolder>::try_fold_mir_const
     1,974,134  ???:<rustc_middle::ty::instance::Instance>::resolve_opt_const_arg

most likely an inlining issue, slapping #[inline(never)] on run should do the trick
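For readers unfamiliar with the attribute, here is a minimal sketch of what the suggestion looks like. `run` and `step` are toy stand-ins for the interpreter loop, not the real functions; the attribute is a request to the compiler to keep `run` as its own function, so a profiler keeps attributing its cost to that symbol.

```rust
/// Hypothetical stand-in for one interpreter step.
fn step(state: u64) -> u64 {
    state.wrapping_mul(31).wrapping_add(7)
}

/// Hypothetical stand-in for the interpreter's main loop.
/// `#[inline(never)]` asks the compiler not to inline this function
/// into its callers, so it stays a separate entry in a profile.
#[inline(never)]
fn run(mut state: u64, steps: u32) -> u64 {
    for _ in 0..steps {
        state = step(state);
    }
    state
}
```

The attribute does not change what the function computes; `run(1, 2)` returns the same value with or without it.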

@RalfJung
Member Author

Thanks!

How do I read that diff? run is taking 2,582,601,110 instructions fewer than it used to...? And place_projection is taking a lot more? But eval_place actually got cheaper? confused

@oli-obk
Contributor

oli-obk commented Jul 11, 2022

yes, that's exactly how to read that diff. I'll try to produce percentages, too, anything that goes to 0% is inlined away
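To make the reading concrete, the sketch below sums the first four entries of the diff (the labels are shortened here for readability; the numbers are copied from the listing). The large negative delta on `run` is more than offset by the increases on the projection-related symbols, which fits the picture of work moving between symbols rather than disappearing.

```rust
/// Instruction-count deltas (after minus before) for the first four
/// entries of the diff above. Labels are abbreviated by hand.
const DELTAS: [(&str, i64); 4] = [
    ("InterpCx::run", -2_582_601_110),
    ("InterpCx::place_projection", 1_416_740_602),
    ("try_fold in eval_place", 780_827_916),
    ("InterpCx::operand_projection", 517_636_783),
];

/// Net change in instruction count across the given symbols.
fn net_delta(deltas: &[(&str, i64)]) -> i64 {
    deltas.iter().map(|(_, d)| d).sum()
}
```

`net_delta(&DELTAS)` comes out to 132,604,191, a net increase across these four symbols even though `run` on its own looks much cheaper.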

@RalfJung
Member Author

But then why should inline(never) on run help...? run got cheaper! We are spending fewer instructions on the entire CTFE loop. How can that be a slowdown?!?

place_projection getting more expensive is odd, it does basically the same thing as before...

@RalfJung
Member Author

@bors try @rust-timer queue

@rust-timer
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 11, 2022
@bors
Contributor

bors commented Jul 11, 2022

⌛ Trying commit 55bab620a371ab69ac1d8eb76d2a88cb8e04bcc3 with merge 88bfc4522b4c58ab7e03dfeace62a6a358f8837c...

@oli-obk
Contributor

oli-obk commented Jul 11, 2022

> run got cheaper! We are spending fewer instructions on the entire CTFE loop. How can that be a slowdown?!?

it may just have gotten inlined into all callers, and thus disappeared from cachegrind.

@bors
Contributor

bors commented Jul 11, 2022

☀️ Try build successful - checks-actions
Build commit: 88bfc4522b4c58ab7e03dfeace62a6a358f8837c (88bfc4522b4c58ab7e03dfeace62a6a358f8837c)

@rust-timer
Collaborator

Queued 88bfc4522b4c58ab7e03dfeace62a6a358f8837c with parent 7d1f57a, future comparison URL.

@bors
Contributor

bors commented Jul 11, 2022

☀️ Try build successful - checks-actions
Build commit: cdd68a378c6fa7e29eec68b2b63999c5dfc2a3f3 (cdd68a378c6fa7e29eec68b2b63999c5dfc2a3f3)

@rust-timer
Collaborator

Queued cdd68a378c6fa7e29eec68b2b63999c5dfc2a3f3 with parent 38b7215, future comparison URL.

@rust-timer
Collaborator

Finished benchmarking commit (cdd68a378c6fa7e29eec68b2b63999c5dfc2a3f3): comparison url.

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results
  • Primary benchmarks: 🎉 relevant improvement found
  • Secondary benchmarks: 😿 relevant regression found
| | mean [1] | max | count [2] |
|:---|:---:|:---:|:---:|
| Regressions 😿 (primary) | N/A | N/A | 0 |
| Regressions 😿 (secondary) | 3.1% | 3.1% | 1 |
| Improvements 🎉 (primary) | -2.4% | -2.4% | 1 |
| Improvements 🎉 (secondary) | N/A | N/A | 0 |
| All 😿🎉 (primary) | -2.4% | -2.4% | 1 |

Cycles

Results
  • Primary benchmarks: 😿 relevant regressions found
  • Secondary benchmarks: mixed results
| | mean [1] | max | count [2] |
|:---|:---:|:---:|:---:|
| Regressions 😿 (primary) | 2.7% | 5.3% | 7 |
| Regressions 😿 (secondary) | 3.4% | 5.5% | 5 |
| Improvements 🎉 (primary) | N/A | N/A | 0 |
| Improvements 🎉 (secondary) | -8.3% | -8.3% | 1 |
| All 😿🎉 (primary) | 2.7% | 5.3% | 7 |

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Footnotes

  1. the arithmetic mean of the percent change

  2. number of relevant changes

@rustbot rustbot removed S-waiting-on-perf Status: Waiting on a perf run to be completed. perf-regression Performance regression. labels Jul 12, 2022
@RalfJung
Member Author

Looks like fold was the problem. If perf still looks good, this is ready for review.
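For context, the valgrind diff earlier in the thread listed `try_fold` instantiations over the projection elements. The sketch below shows the two equivalent shapes such code can take, one using `try_fold` and one using a plain loop with `?`. `Elem`, `apply`, `project_fold` and `project_loop` are hypothetical, and this is not claimed to be the exact change made in the PR.

```rust
/// Hypothetical projection element.
#[derive(Clone, Copy)]
enum Elem {
    Field(u64),
    Index(u64),
}

/// Hypothetical single projection step on a byte offset; fails on overflow.
/// Index elements assume an 8-byte element size for illustration.
fn apply(base: u64, elem: Elem) -> Result<u64, String> {
    let delta = match elem {
        Elem::Field(off) => off,
        Elem::Index(i) => i.checked_mul(8).ok_or("index overflow")?,
    };
    base.checked_add(delta).ok_or_else(|| "offset overflow".to_string())
}

/// Fold-based version: threads the accumulator through `try_fold`.
fn project_fold(base: u64, elems: &[Elem]) -> Result<u64, String> {
    elems.iter().copied().try_fold(base, apply)
}

/// Loop-based version: same result, written as an explicit loop.
fn project_loop(base: u64, elems: &[Elem]) -> Result<u64, String> {
    let mut cur = base;
    for &elem in elems {
        cur = apply(cur, elem)?;
    }
    Ok(cur)
}
```

Both functions return the same value for the same input and stop at the first error; any performance difference between them comes down to how the compiler optimizes each form.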

@bors try @rust-timer queue
@rustbot ready

@rust-timer
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 12, 2022
@bors
Contributor

bors commented Jul 12, 2022

⌛ Trying commit 04b3cd9 with merge da916a909deef1bbb257db0f54cb5347ebdc86b0...

@bors
Contributor

bors commented Jul 12, 2022

☀️ Try build successful - checks-actions
Build commit: da916a909deef1bbb257db0f54cb5347ebdc86b0 (da916a909deef1bbb257db0f54cb5347ebdc86b0)

@rust-timer
Collaborator

Queued da916a909deef1bbb257db0f54cb5347ebdc86b0 with parent 8a33254, future comparison URL.

@rust-timer
Collaborator

Finished benchmarking commit (da916a909deef1bbb257db0f54cb5347ebdc86b0): comparison url.

Instruction count

  • Primary benchmarks: no relevant changes found
  • Secondary benchmarks: 🎉 relevant improvements found
| | mean [1] | max | count [2] |
|:---|:---:|:---:|:---:|
| Regressions 😿 (primary) | N/A | N/A | 0 |
| Regressions 😿 (secondary) | N/A | N/A | 0 |
| Improvements 🎉 (primary) | N/A | N/A | 0 |
| Improvements 🎉 (secondary) | -1.9% | -2.2% | 6 |
| All 😿🎉 (primary) | N/A | N/A | 0 |

Max RSS (memory usage)

Results
  • Primary benchmarks: 🎉 relevant improvement found
  • Secondary benchmarks: mixed results
| | mean [1] | max | count [2] |
|:---|:---:|:---:|:---:|
| Regressions 😿 (primary) | N/A | N/A | 0 |
| Regressions 😿 (secondary) | 6.4% | 6.4% | 1 |
| Improvements 🎉 (primary) | -2.9% | -2.9% | 1 |
| Improvements 🎉 (secondary) | -2.1% | -2.3% | 2 |
| All 😿🎉 (primary) | -2.9% | -2.9% | 1 |

Cycles

Results
  • Primary benchmarks: 🎉 relevant improvement found
  • Secondary benchmarks: no relevant changes found
| | mean [1] | max | count [2] |
|:---|:---:|:---:|:---:|
| Regressions 😿 (primary) | N/A | N/A | 0 |
| Regressions 😿 (secondary) | N/A | N/A | 0 |
| Improvements 🎉 (primary) | -2.3% | -2.3% | 1 |
| Improvements 🎉 (secondary) | N/A | N/A | 0 |
| All 😿🎉 (primary) | -2.3% | -2.3% | 1 |

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Footnotes

  1. the arithmetic mean of the percent change

  2. number of relevant changes

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 12, 2022
@oli-obk
Contributor

oli-obk commented Jul 12, 2022

@bors r+

@bors
Contributor

bors commented Jul 12, 2022

📌 Commit 04b3cd9 has been approved by oli-obk

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 12, 2022
@bors
Contributor

bors commented Jul 13, 2022

⌛ Testing commit 04b3cd9 with merge 7b57152...

@bors
Contributor

bors commented Jul 13, 2022

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing 7b57152 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jul 13, 2022
@bors bors merged commit 7b57152 into rust-lang:master Jul 13, 2022
@rustbot rustbot added this to the 1.64.0 milestone Jul 13, 2022
@rust-timer
Collaborator

Finished benchmarking commit (7b57152): comparison url.

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results
  • Primary benchmarks: 😿 relevant regression found
  • Secondary benchmarks: 🎉 relevant improvements found
| | mean [1] | max | count [2] |
|:---|:---:|:---:|:---:|
| Regressions 😿 (primary) | 3.7% | 3.7% | 1 |
| Regressions 😿 (secondary) | N/A | N/A | 0 |
| Improvements 🎉 (primary) | N/A | N/A | 0 |
| Improvements 🎉 (secondary) | -3.9% | -5.0% | 2 |
| All 😿🎉 (primary) | 3.7% | 3.7% | 1 |

Cycles

Results
  • Primary benchmarks: 🎉 relevant improvement found
  • Secondary benchmarks: no relevant changes found
| | mean [1] | max | count [2] |
|:---|:---:|:---:|:---:|
| Regressions 😿 (primary) | N/A | N/A | 0 |
| Regressions 😿 (secondary) | N/A | N/A | 0 |
| Improvements 🎉 (primary) | -2.3% | -2.3% | 1 |
| Improvements 🎉 (secondary) | N/A | N/A | 0 |
| All 😿🎉 (primary) | -2.3% | -2.3% | 1 |

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

Footnotes

  1. the arithmetic mean of the percent change

  2. number of relevant changes

@RalfJung RalfJung deleted the interpret-projections branch July 13, 2022 22:27
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
6 participants