Add precompilation directives #2484

Merged: 3 commits into master, Mar 22, 2021
Conversation

@odow (Member) commented Feb 19, 2021

This PR is an attempt to get precompilation working throughout the JuMP ecosystem. There are a bunch of related PRs in other packages.

Related issues: #2273, #1181

Headline takeaways

  • "Time-to-first-solve" is halved. The diet problem with GLPK:
# Before:
# 13.303004 seconds (43.19 M allocations: 2.462 GiB, 8.20% gc time, 22.66% compilation time)
# After:
# 7.823846 seconds (20.45 M allocations: 1.125 GiB, 4.87% gc time, 21.25% compilation time)
  • There are a looooot of things that don't infer. (Details to follow.)
  • Some precompile statements are invalid.

Trying to isolate the reasons, I see a lot of "Julia's specialization heuristics may be responsible" suggestions, so it seems that even though we think Julia is specializing on the bridges, it actually isn't. (See https://docs.julialang.org/en/v1/manual/performance-tips/#Be-aware-of-when-Julia-avoids-specializing.)

using SnoopCompile
tinf = @snoopi_deep Foo.stress_precompile()    # profile inference during the stress run
staleinstances(tinf)                           # list MethodInstances with stale (invalidated) code

itrigs = inference_triggers(tinf)              # calls that triggered fresh inference at runtime
mtrigs = accumulate_by_source(Method, itrigs)  # aggregate the triggers by calling method
using AbstractTrees
itree = trigger_tree(mtrigs[end].itrigs)       # tree view of the triggers for the worst method
suggest(itree.children[1])                     # ask SnoopCompile how to remediate the first branch

julia> suggest(itree.children[1])
/Users/oscar/.julia/dev/MathOptInterface/src/Bridges/lazy_bridge_optimizer.jl:218: non-inferrable or unspecialized call, perhaps annotate node(b::MathOptInterface.Bridges.LazyBridgeOptimizer{MathOptInterface.Utilities.CachingOptimizer{SCS.Optimizer, MathOptInterface.Utilities.UniversalFallback{MathOptInterface.Utilities.Model{Float64}}}}, F::Type{MathOptInterface.ScalarAffineFunction{Float64}}, S::Type{MathOptInterface.GreaterThan{Float64}}) at lazy_bridge_optimizer.jl:218 with type MethodInstance for supports_constraint(::Type{MathOptInterface.Bridges.Constraint.GreaterToIntervalBridge{Float64, F} where F<:MathOptInterface.AbstractScalarFunction}, ::Type{MathOptInterface.ScalarAffineFunction{Float64}}, ::Type{MathOptInterface.GreaterThan{Float64}})
If a noninferrable argument is a type or function, Julia's specialization heuristics may be responsible.
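For reference (an aside, not part of this PR), the performance-tips page linked above describes the heuristic: Julia may skip specializing on a `::Type` or `::Function` argument that is only passed through to another call, and spelling out an explicit type parameter forces specialization. A minimal sketch with illustrative names, not the actual MOI code:

check(F, S) = (F, S)   # stand-in for a downstream call such as supports_constraint

# Julia's heuristics may not specialize this method on F and S, because the
# Type arguments are only forwarded to another function:
node_unspecialized(F::Type, S::Type) = check(F, S)

# Adding explicit type parameters forces a specialization for each (F, S) pair:
node_specialized(::Type{F}, ::Type{S}) where {F,S} = check(F, S)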

Code

module Foo

using JuMP
using Ipopt
using SCS
using GLPK

function stress_precompile()
    input_dir =
        joinpath(dirname(dirname(pathof(JuMP))), "docs", "src", "examples")
    for (root, dirs, files) in walkdir(input_dir)
        for file in files
            if !endswith(file, ".jl")
                continue
            end
            include(joinpath(root, file))
        end
    end
    return
end

function example_diet()
    categories = ["calories", "protein", "fat", "sodium"]
    category_data = Containers.DenseAxisArray([
        1800 2200;
        91   Inf;
        0    65;
        0    1779
        ], categories, ["min", "max"]
    )
    foods = [
        "hamburger", "chicken", "hot dog", "fries", "macaroni", "pizza",
        "salad", "milk", "ice cream",
    ]
    cost = Containers.DenseAxisArray(
        [2.49, 2.89, 1.50, 1.89, 2.09, 1.99, 2.49, 0.89, 1.59],
        foods
    )
    food_data = Containers.DenseAxisArray(
        [
            410 24 26 730;
            420 32 10 1190;
            560 20 32 1800;
            380  4 19 270;
            320 12 10 930;
            320 15 12 820;
            320 31 12 1230;
            100  8 2.5 125;
            330  8 10 180
        ], foods, categories
    )
    model = Model(GLPK.Optimizer)
    @variables(model, begin
        category_data[c, "min"] <= nutrition[c = categories] <= category_data[c, "max"]
        buy[foods] >= 0
    end)
    @objective(model, Min, sum(cost[f] * buy[f] for f in foods))
    @constraint(model, [c in categories],
        sum(food_data[f, c] * buy[f] for f in foods) == nutrition[c]
    )
    optimize!(model)
    term_status = termination_status(model)
    @assert term_status == MOI.OPTIMAL
    @constraint(model, buy["milk"] + buy["ice cream"] <= 6)
    optimize!(model)
    @assert termination_status(model) == MOI.INFEASIBLE
    @assert primal_status(model) == MOI.NO_SOLUTION
    return
end

end

# ------------------------------------------------------------------------------
# Uncomment this block to generate the precompile directives
#
# using SnoopCompile
# tinf = @snoopi_deep Foo.stress_precompile()
# ttot, pcs = SnoopCompile.parcel(tinf)
# SnoopCompile.write("precompiles", pcs)
# for file in readdir("precompiles")
#     if !endswith(file, ".jl")
#         continue
#     end
#     src = joinpath("precompiles", file)
#     m = match(r"precompile\_(.+)\.jl", file)
#     modules = split(m[1], ".")
#     modules = vcat(modules[1], "src", modules[2:end])
#     if !(modules[1] in ["GLPK", "Ipopt", "JuMP", "MathOptInterface", "SCS"])
#         continue
#     end
#     dest = joinpath("/Users/Oscar/.julia/dev", modules..., "precompile.jl")
#     @show dest
#     cp(src, dest; force = true)
# end
# ------------------------------------------------------------------------------

# ------------------------------------------------------------------------------
# A small example to see the effect of precompiling
#
# Before:
# 13.303004 seconds (43.19 M allocations: 2.462 GiB, 8.20% gc time, 22.66% compilation time)
# After:
# 7.823846 seconds (20.45 M allocations: 1.125 GiB, 4.87% gc time, 21.25% compilation time)
#
# @time Foo.example_diet()
# ------------------------------------------------------------------------------

# ------------------------------------------------------------------------------
# Check effect of precompiling on all examples
#
# Before:
# 83.043400 seconds (240.70 M allocations: 13.937 GiB, 6.46% gc time, 8.99% compilation time)
# After:
# 74.534196 seconds (190.81 M allocations: 11.039 GiB, 5.32% gc time, 7.50% compilation time)
#
# @time Foo.stress_precompile()
# ------------------------------------------------------------------------------

SnoopCompile

https://timholy.github.io/SnoopCompile.jl/stable/snoopi_deep_parcel/

Base.precompile(Tuple{typeof(_to_index_tuple),Tuple{Int64, Function},Tuple{Dict{Int64, Int64}, Dict{Symbol, Int64}}}) # time: 0.002516737
Base.precompile(Tuple{typeof(_to_index_tuple),Tuple{String, Int64, Int64},Tuple{Dict{String, Int64}, Dict{Int64, Int64}, Dict{Int64, Int64}}}) # time: 0.002253078
Base.precompile(Tuple{typeof(has_dependent_sets),Vector{Any},Vector{Any}}) # time: 0.00217896
# TODO: Base.precompile(Tuple{Type{DenseAxisArray},Core.Array{T, N},Any,Tuple{Vararg{Dict, N}} where N}) # time: 0.001977609
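For context (an aside, not part of the PR): the TODO'd directive above appears to be invalid because `T` and `N` occur without an enclosing `where`, so evaluating it throws an UndefVarError instead of precompiling anything. A rough way to screen a generated file before copying it into a package, assuming the relevant packages are already loaded, is sketched below (the helper name is hypothetical):

# Hypothetical helper: evaluate every `Base.precompile` directive in a generated
# file and report any that error or return false. Assumes the packages the
# directives refer to are already loaded in the current session.
function check_precompile_file(path::AbstractString)
    for (i, line) in enumerate(readlines(path))
        startswith(strip(line), "Base.precompile") || continue
        try
            ok = Core.eval(Main, Meta.parse(line))
            ok === true || @warn "precompile returned false" path line=i directive=line
        catch err
            @warn "directive failed to evaluate" path line=i exception=err
        end
    end
    return nothing
end

Running this on one of the files written by SnoopCompile.write above (say, precompiles/precompile_JuMP.jl) would flag directives like the TODO'd one.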
@odow (Member Author):
@timholy I found that SnoopCompile.parcel(tinf) created a lot of these invalid precompile statements. Is this a bug? Expected behavior?

Here's an example of another one: jump-dev/MathOptInterface.jl#1243 (comment)

(See the top of the PR for how I created these files.)

Contributor:

No, that's a bug. I haven't yet tried to reproduce this but this looks sufficiently detailed that I expect it to be straightforward. Thanks!!

@codecov codecov bot commented Feb 19, 2021

Codecov Report

Merging #2484 (bb37c9a) into master (f6be919) will decrease coverage by 0.23%.
The diff coverage is 95.00%.


@@            Coverage Diff             @@
##           master    #2484      +/-   ##
==========================================
- Coverage   93.69%   93.46%   -0.24%     
==========================================
  Files          43       45       +2     
  Lines        5360     5444      +84     
==========================================
+ Hits         5022     5088      +66     
- Misses        338      356      +18     
Impacted Files Coverage Δ
src/JuMP.jl 82.90% <ø> (ø)
src/precompile.jl 95.00% <95.00%> (ø)
src/_Derivatives/topological_sort.jl 95.12% <0.00%> (-4.88%) ⬇️
src/_Derivatives/coloring.jl 95.31% <0.00%> (-2.50%) ⬇️
src/_Derivatives/sparsity.jl 95.18% <0.00%> (-2.41%) ⬇️
src/_Derivatives/conversion.jl 96.49% <0.00%> (-1.76%) ⬇️
src/constraints.jl 95.02% <0.00%> (-0.50%) ⬇️
src/nlp.jl 92.21% <0.00%> (-0.45%) ⬇️
src/feasibility_checker.jl 100.00% <0.00%> (ø)
... and 1 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@mlubin (Member) commented Feb 19, 2021

Nice! What about using the bot to keep this list up to date?

@odow (Member Author) commented Feb 19, 2021

What about using the bot to keep this list up to date?

👍 but there are a few issues to sort first, like the invalid statements.

@blegat (Member) commented Feb 19, 2021

It's a bit surprising that

# Before:
# 13.303004 seconds ... 22.66% compilation time
# After:
# 7.823846 seconds ... 21.25% compilation time

So we get a 2x speedup by reducing compilation time, which was only 22% of the total 🤔

it seems like even though we think Julia is specializing on the bridges, it actually isn't

Scary but interesting :)

@timholy (Contributor) commented Feb 19, 2021

The bot doesn't use @snoopi_deep yet, I think. I am probably an outlier, but I tend to use parcel to show me what benefits from precompilation and then rewrite the precompile file by hand. Example: https://github.com/JuliaImages/ImageCore.jl/blob/master/src/precompile.jl. I find that this avoids having to regenerate the precompile files almost daily (check the history of commits to Plots.jl if you want to see what I mean). And if you've fixed inference problems, then a relatively small number of precompile directives suffice to precompile "all the way down," so they aren't all that bad to write or maintain.
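To make the hand-written approach concrete (a minimal sketch in the usual SnoopCompile-generated layout, with placeholder signatures rather than the directives this PR commits), a hand-maintained precompile.jl typically wraps a few high-value directives in a function that only does work while a package image is being generated:

# Sketch of a hand-maintained precompile file; the module and signatures below
# are placeholders, not the directives committed in this PR.
module TinyPkg

solve_stub(x::Float64) = 2x   # stand-in for an expensive entry point

function _precompile_()
    # Only emit precompile work while generating the package image.
    ccall(:jl_generating_output, Cint, ()) == 1 || return nothing
    Base.precompile(Tuple{typeof(solve_stub), Float64})
    return nothing
end

_precompile_()

end # module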

@odow odow changed the title WIP: Add precompilation directives Add precompilation directives Mar 22, 2021
@odow odow merged commit 319da3a into master Mar 22, 2021
@odow odow deleted the od/precompile branch March 22, 2021 04:34
@odow odow mentioned this pull request Mar 22, 2021