revisit #47137: avoid round-trip of locally-cached inferred source #51960

aviatesk · 2023-10-31T17:20:14Z

Built on top of #51958. With the improved performance of `cfg_simplify!`, let's give #47137 another try. The aim is to retain locally cached inferred source as `IRCode`, eliminating the need for the inlining algorithm to round-trip it through the `CodeInfo` representation.

@nanosoldier runbenchmarks("inference", vs=":master")

nanosoldier · 2023-10-31T17:35:11Z

Your job failed.

aviatesk · 2023-10-31T17:50:52Z

@nanosoldier runbenchmarks("inference", vs=":master")

nanosoldier · 2023-10-31T18:10:03Z

Your job failed.

vtjnash · 2023-10-31T20:10:59Z

FWIW, you are running into an optimizer bug it seems

      From worker 2:    ERROR: LoadError: MethodError: no method matching string(::Expr)
      From worker 2:
      From worker 2:    Stacktrace:
      From worker 2:       [1] macro expansion
      From worker 2:         @ Core.Compiler ./error.jl:231 [inlined]
      From worker 2:       [2] process_node!(compact::Core.Compiler.IncrementalCompact, result_idx::Int64, inst::Core.Compiler.Instruction, idx::Int64, processed_idx::Int64, active_bb::Int64, do_rename_ssa::Bool)
      From worker 2:         @ Core.Compiler ./compiler/ssair/ir.jl:1348
      From worker 2:       [3] cfg_simplify!(ir::Core.Compiler.IRCode)
      From worker 2:         @ Core.Compiler ./compiler/ssair/passes.jl:2275
      From worker 2:       [4] finish!(interp::Core.Compiler.NativeInterpreter, caller::Core.Compiler.InferenceState)
...
      From worker 2:     [100] loadall!()
      From worker 2:         @ BaseBenchmarks /home/nanosoldier/.julia/dev/BaseBenchmarks/src/BaseBenchmarks.jl:52
      From worker 2:    in expression starting at /home/nanosoldier/.julia/dev/BaseBenchmarks/src/collection/CollectionBenchmarks.jl:1
      From worker 2:    in expression starting at /home/nanosoldier/.julia/scratchspaces/89f34f1a-2e6b-52eb-a20f-77051b03b735/workdir/jl_5yuKNX/benchscript.jl:18
┌ Info: [Node 2 | 2023-10-31T14:10:04.615]: failed job: BenchmarkJob JuliaLang/julia@a79443e vs. JuliaLang/julia@f631597

…sult (#51934)

Currently the inlining algorithm is allowed to use the inferred source of a const-prop'ed call, which is always locally available (since const-prop' results aren't cached globally). For non-const-prop'ed, globally cached calls, however, it goes through a more expensive process, making a round-trip through the serialized inferred source.

We can improve efficiency by bypassing the serialization round-trip for newly inferred, globally cached frames. As these frames are never cached locally, they can be viewed as volatile. This means we can use their source destructively while inline-expanding them.

The benchmark results show that this optimization achieves a 2-4% allocation reduction and about a 5% speedup in the real-world-ish compilation targets (`allinference`).

Note that it would be more efficient to propagate the `IRCode` object directly and skip the inflation from `CodeInfo` to `IRCode`, as experimented in #47137, but currently the round-trip through the `CodeInfo` representation is necessary because it often leads to better CFG simplification, while `cfg_simplify!` itself is expensive (xref: #51960).
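To make the "volatile" idea concrete, here is a toy model of it. It is only a sketch: `ToySource`, `ToyResult`, and `source_for_inlining` are hypothetical names invented for illustration and are not `Core.Compiler` APIs, and the `copy` call merely stands in for the more expensive round-trip that a shared, cached source would need.

```julia
# Toy model only: these types and functions are hypothetical and do not
# exist in Core.Compiler. They illustrate the ownership argument, nothing more.

struct ToySource
    stmts::Vector{Any}
end

struct ToyResult
    src::ToySource
    volatile::Bool   # assumed meaning: no cache holds a reference to `src`
end

# Return a statement list that the caller (the "inliner") may mutate freely.
function source_for_inlining(result::ToyResult)
    if result.volatile
        # Nobody else can observe this source, so it can be consumed in place.
        return result.src.stmts
    else
        # The source is shared with a cache, so work on a fresh copy.
        # (Stand-in for the round-trip through a serialized representation.)
        return copy(result.src.stmts)
    end
end

cached   = ToyResult(ToySource(Any[:a, :b]), false)
volatile = ToyResult(ToySource(Any[:a, :b]), true)

# Simulate a destructive edit made during inline expansion.
push!(source_for_inlining(cached), :inlined)
push!(source_for_inlining(volatile), :inlined)
```

In this model the cached source keeps its original two statements, while the volatile one is modified in place, which is acceptable only because nothing else will read it afterwards.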

aviatesk · 2023-11-10T01:38:12Z

@nanosoldier runbenchmarks("inference", vs=":master")

nanosoldier · 2023-11-10T02:39:03Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

aviatesk · 2023-11-11T08:30:12Z

@nanosoldier runbenchmarks("inference", vs=":master")

nanosoldier · 2023-11-11T09:30:54Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

aviatesk · 2023-11-22T16:55:12Z

@nanosoldier runbenchmarks("inference", vs=":master")

nanosoldier · 2023-11-22T17:11:54Z

Your job failed.

aviatesk · 2023-12-11T07:05:26Z

@nanosoldier runbenchmarks("inference", vs=":master")

nanosoldier · 2023-12-11T08:02:58Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

aviatesk requested a review from Keno October 31, 2023 17:20

aviatesk added the compiler:optimizer Optimization passes (mostly in base/compiler/ssair/) label Oct 31, 2023

aviatesk force-pushed the avi/47137-again branch from 3b0918c to a79443e Compare October 31, 2023 17:42

aviatesk force-pushed the avi/opt-cfg_simplify! branch from 512deed to d54aea5 Compare November 1, 2023 05:06

aviatesk mentioned this pull request Nov 1, 2023

optimize cfg_simplify! #51958

Merged

aviatesk force-pushed the avi/opt-cfg_simplify! branch 2 times, most recently from 5f71c22 to 5152327 Compare November 5, 2023 04:14

Base automatically changed from avi/opt-cfg_simplify! to master November 6, 2023 01:44

aviatesk mentioned this pull request Nov 6, 2023

inlining: avoid source deserialization by using volatile inference result #51934

Merged

aviatesk force-pushed the avi/47137-again branch from a79443e to f1817e9 Compare November 6, 2023 16:23

aviatesk mentioned this pull request Nov 7, 2023

cfg_simplify! error #52058

Closed

aviatesk force-pushed the avi/47137-again branch 2 times, most recently from 1b97766 to 5af6b22 Compare November 10, 2023 01:01

aviatesk force-pushed the avi/47137-again branch from 5af6b22 to ef9e687 Compare November 11, 2023 08:29

aviatesk force-pushed the avi/47137-again branch 2 times, most recently from a77dd0c to 7050b4e Compare November 22, 2023 16:55

aviatesk force-pushed the avi/47137-again branch from 7050b4e to 178b9ab Compare December 11, 2023 06:49

aviatesk force-pushed the avi/47137-again branch from 178b9ab to a6f2b7f Compare December 11, 2023 07:05
