Constraints with few terms are (relatively) slow #2817
I'm being a choosy beggar, but I'd be curious to see any summaries/charts of where the time/allocations are spent.
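For anyone wanting such a summary, the standard-library profiler gives a flat breakdown (a sketch; `create_model` refers to the builder functions defined later in this thread):

```julia
using Profile

Profile.clear()
@profile create_model(100, 1000)
Profile.print(format = :flat, sortedby = :count)  # flat listing of where samples land, hottest first
# On Julia 1.8+, Profile.Allocs.@profile gives the allocation-site view.
```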
A lot of it is creating 20,000,000 ordered dictionaries for the affine expressions.
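Each JuMP affine expression keeps its terms in an `OrderedDict`, so every two-term constraint allocates one. A minimal sketch of that cost in isolation (assuming OrderedCollections.jl is installed):

```julia
using OrderedCollections

function many_small_dicts(n)
    acc = 0.0
    for i in 1:n
        d = OrderedDict{Int,Float64}()  # one fresh dict per "constraint"
        d[i] = 1.0      # coefficient on x[i, t]
        d[i+1] = -1.0   # coefficient on x[i, t-1]
        acc += d[i]
    end
    return acc
end

# @time many_small_dicts(20_000_000)  # dominated by allocation and GC time
```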
OK thanks! I guess I should keep this in mind because of my initial choice to use JuMP instead of MOI directly in MOIPajarito.
The ordered dictionary is a big contributor that we can't do much about. One long-term option would be to go the Gravity route with constraint templates (see the sketch after these benchmarks). But there are a few other contributors as well:

```julia
julia> function variables(I, T)
           model = Model()
           @variable(model, [1:I, 1:T], lower_bound = 0, upper_bound = 100)
           return model
       end
variables (generic function with 1 method)
julia> @benchmark variables(100, 1000)
BenchmarkTools.Trial: 302 samples with 1 evaluation.
Range (min … max): 13.886 ms … 22.610 ms ┊ GC (min … max): 0.00% … 24.44%
Time (median): 15.300 ms ┊ GC (median): 0.00%
Time (mean ± σ): 16.549 ms ± 2.364 ms ┊ GC (mean ± σ): 8.61% ± 11.17%
▆█▅█▁▁▃
▃▆▄▆▆████████▅▅▄▄▃▃▂▃▁▃▁▁▁▁▁▁▁▁▁▁▁▃▃▂▃▃▃▆▄▃▃▆▅▅▇▄▄▆▄▄▂▂▂▂▂▂ ▄
13.9 ms Histogram: frequency by time 21.8 ms <
Memory estimate: 16.72 MiB, allocs estimate: 200142.
julia> @benchmark variables(1000, 10000)
BenchmarkTools.Trial: 3 samples with 1 evaluation.
Range (min … max): 1.822 s … 2.065 s ┊ GC (min … max): 15.53% … 23.95%
Time (median): 2.018 s ┊ GC (median): 21.69%
Time (mean ± σ): 1.968 s ± 129.183 ms ┊ GC (mean ± σ): 20.58% ± 4.36%
█ █ █
█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁█ ▁
1.82 s Histogram: frequency by time 2.07 s <
Memory estimate: 1.27 GiB, allocs estimate: 20000163.

julia> function variables_with_names(I, T)
           model = Model()
           @variable(model, 0 <= x[1:I, 1:T] <= 100)
           return model
       end
variables_with_names (generic function with 1 method)
julia> @benchmark variables_with_names(100, 1000)
BenchmarkTools.Trial: 107 samples with 1 evaluation.
Range (min … max): 37.754 ms … 58.620 ms ┊ GC (min … max): 0.00% … 28.66%
Time (median): 48.214 ms ┊ GC (median): 17.45%
Time (mean ± σ): 46.918 ms ± 5.204 ms ┊ GC (mean ± σ): 14.53% ± 8.68%
▃ ▃▁▃█▁ ▆▃ ▃ ▃▃▁▃▁▃▃▁ ▃▆▁ ▁▁
█▁█████▇▁▄▁▁▄▁▁▁▄▁▁▁▁▇▁▇▇▁▇██▄█▄████████▁███▇██▄▁▁▄▁▁▄▄▁▁▁▄ ▄
37.8 ms Histogram: frequency by time 56.9 ms <
Memory estimate: 46.80 MiB, allocs estimate: 900174.
julia> @benchmark variables_with_names(1000, 10000)
BenchmarkTools.Trial: 1 sample with 1 evaluation.
Single result which took 9.680 s (41.27% GC) to evaluate,
with a memory estimate of 4.21 GiB, over 90000231 allocations.
```
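For reference, a rough sketch of the constraint-template idea mentioned above (the types here are hypothetical, not JuMP or Gravity API): store one shared coefficient pattern plus the index sets, instead of one `OrderedDict`-backed expression per constraint.

```julia
# Hypothetical types, for illustration only.
struct RampTemplate
    coefficients::Vector{Float64}  # shared pattern, e.g. +1 on x[i,t], -1 on x[i,t-1]
    rhs::Float64
end

struct TemplatedConstraints
    template::RampTemplate
    indices::Vector{NTuple{2,Int}}  # the (i, t) pairs the template is applied to
end

# x[i,t] - x[i,t-1] <= 10 for all i, t becomes one template plus an index list,
# rather than I * (T - 1) separate AffExpr objects:
ramp_up = TemplatedConstraints(
    RampTemplate([1.0, -1.0], 10.0),
    vec([(i, t) for i in 1:100, t in 2:1000]),
)
```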
So the names thing is pretty much just the cost of creating a dictionary to store the names. We can't do much better:

```julia
function test_names(N)
    x = Dict{Int,String}()
    for i in 1:N
        x[i] = string("x[", i, "]")
    end
    return x
end

function test_names_jump(N)
    model = Model()
    @variable(model, x[1:N])
    return model
end
```

```julia
julia> @time test_names(100_000);
0.075508 seconds (599.52 k allocations: 31.601 MiB)
julia> @time test_names_jump(100_000);
0.080933 seconds (700.17 k allocations: 37.647 MiB)
julia> @time test_names(1_000_000);
1.598591 seconds (6.00 M allocations: 324.562 MiB, 40.13% gc time)
julia> @time test_names_jump(1_000_000);
1.917196 seconds (7.00 M allocations: 360.841 MiB, 40.85% gc time)
```

That means a small win for the OP is to use anonymous variables instead:

```julia
julia> using JuMP, BenchmarkTools, GLPK

julia> function create_model(I, T = 10000)
           model = Model(GLPK.Optimizer)
           @variable(model, 0 <= x[1:I, 1:T] <= 100)
           @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] <= 10)
           @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] >= -10)
           @objective(model, Min, sum(x))
           return model
       end
create_model (generic function with 2 methods)
julia> @benchmark create_model(100, 1000)
BenchmarkTools.Trial: 18 samples with 1 evaluation.
Range (min … max): 247.279 ms … 307.278 ms ┊ GC (min … max): 17.74% … 33.75%
Time (median): 282.037 ms ┊ GC (median): 27.44%
Time (mean ± σ): 280.924 ms ± 15.225 ms ┊ GC (mean ± σ): 27.40% ± 3.75%
▁ ▁ ▁ ▁ ▁ █ ▁█ ▁ ▁▁ ▁ ▁▁ █
█▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁█▁▁█▁▁▁▁█▁█▁▁▁▁▁██▁█▁██▁▁█▁██▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
247 ms Histogram: frequency by time 307 ms <
Memory estimate: 256.05 MiB, allocs estimate: 3598106.
julia> function create_model_2(I, T = 10000)
           model = Model(GLPK.Optimizer)
           x = @variable(model, [1:I, 1:T], lower_bound = 0, upper_bound = 100)
           @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] <= 10)
           @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] >= -10)
           @objective(model, Min, sum(x))
           return model
       end
create_model_2 (generic function with 2 methods)
julia> @benchmark create_model_2(100, 1000)
BenchmarkTools.Trial: 24 samples with 1 evaluation.
Range (min … max): 183.181 ms … 244.146 ms ┊ GC (min … max): 13.25% … 31.49%
Time (median): 214.783 ms ┊ GC (median): 24.05%
Time (mean ± σ): 214.715 ms ± 13.848 ms ┊ GC (mean ± σ): 24.19% ± 4.50%
▃▃█ █▃
▇▁▁▁▁▁▁▇▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁███▁▁▇██▁▁▁▇▇▁▁▇▇▁▁▇▁▁▇▁▁▁▁▁▁▁▇▁▁▁▁▇ ▁
183 ms Histogram: frequency by time 244 ms <
Memory estimate: 227.49 MiB, allocs estimate: 2998074.
```
So the obvious question here would be: when creating the […]? We don't cache the […]. We want to return […].

Edit: you could imagine a tag like this to skip the default path:

```julia
function create_model_2(I, T = 10000)
    model = Model(GLPK.Optimizer)
    x = @variable(model, [1:I, 1:T], lower_bound = 0, upper_bound = 100)
    @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] <= 10, SkipJuMPConstructors())
    @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] >= -10, SkipJuMPConstructors())
    @objective(model, Min, sum(x))
    return model
end
```
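For a sense of what the skipped path looks like, the same row can be pushed straight through MathOptInterface today (a sketch using documented MOI/JuMP API; `add_ramp_constraint` is a name made up here):

```julia
using JuMP
import MathOptInterface as MOI

# Build x[i, t] - x[i, t-1] <= 10 as a raw MOI ScalarAffineFunction,
# bypassing the JuMP AffExpr (and its OrderedDict) entirely.
# JuMP's `index` converts a VariableRef to its MOI.VariableIndex.
function add_ramp_constraint(model::Model, x, i, t)
    f = MOI.ScalarAffineFunction(
        [
            MOI.ScalarAffineTerm(1.0, index(x[i, t])),
            MOI.ScalarAffineTerm(-1.0, index(x[i, t-1])),
        ],
        0.0,
    )
    return MOI.add_constraint(backend(model), f, MOI.LessThan(10.0))
end
```

Whether this wins much depends on the backend: with a `CachingOptimizer` in between, the function is still copied into the cache.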
For dicts, we can gain a bit with `sizehint!`:

```julia
function test_names2(N)
    x = Dict{Int,String}()
    sizehint!(x, N)
    for i in 1:N
        x[i] = string("x[", i, "]")
    end
    return x
end
```

```julia
julia> @benchmark test_names(1_000_000)
BenchmarkTools.Trial: 8 samples with 1 evaluation.
Range (min … max): 612.429 ms … 878.203 ms ┊ GC (min … max): 42.82% … 57.87%
Time (median): 702.485 ms ┊ GC (median): 47.26%
Time (mean ± σ): 708.265 ms ± 86.590 ms ┊ GC (mean ± σ): 47.80% ± 5.93%
█ ██ █ █ █ █ █
█▁▁▁▁██▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁█▁▁▁█▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
612 ms Histogram: frequency by time 878 ms <
Memory estimate: 324.56 MiB, allocs estimate: 5999543.
julia> @benchmark test_names2(1_000_000)
BenchmarkTools.Trial: 8 samples with 1 evaluation.
Range (min … max): 510.264 ms … 719.390 ms ┊ GC (min … max): 23.09% … 49.43%
Time (median): 678.483 ms ┊ GC (median): 45.32%
Time (mean ± σ): 644.980 ms ± 78.198 ms ┊ GC (mean ± σ): 42.50% ± 9.92%
█ █ █ █ █ █ █ █
█▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁█▁▁█▁▁█▁▁█▁▁▁█ ▁
510 ms Histogram: frequency by time 719 ms <
Memory estimate: 293.39 MiB, allocs estimate: 5999496.
```
Yeah, part of the issue is that we add them one-by-one, so we don't have this size information.
```julia
using ProfileView
using JuMP, Gurobi

function main(I, T)
    model = Model(Gurobi.Optimizer)
    set_silent(model)
    @variable(model, x[1:T, 1:I])
    @constraint(model, [i in 1:I, t in 2:T], x[t, i] - x[t-1, i] <= 10)
    @constraint(model, [i in 1:I, t in 2:T], x[t, i] - x[t-1, i] >= -10)
    @objective(model, Min, sum(x))
    set_time_limit_sec(model, 0.0)
    optimize!(model)
    return model
end

@time main(1000, 1000)
@profview main(1000, 1000)
```

We actually considered the small-dict case a while ago: #1654
I'm tempted to close this issue. We now have an option to skip the most expensive parts (name creation). Skipping JuMP-level constructors altogether is too magical. If people are running into performance problems with JuMP, then they should really be using the low-level solver interfaces instead, and even then, they're likely building problems that take a long time to solve, so model creation isn't the bottleneck.
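A middle ground short of raw MOI is JuMP's direct mode combined with the name-skipping option referenced above (a sketch; both functions are JuMP API, and GLPK is just an example backend):

```julia
using JuMP, GLPK

# Direct mode attaches the solver immediately and skips the CachingOptimizer
# copy; disabling string names skips the name dictionary entirely.
model = direct_model(GLPK.Optimizer())
set_string_names_on_creation(model, false)
@variable(model, 0 <= x[1:100, 1:1000] <= 100)
```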
The original problem is now:

```julia
julia> using JuMP, Gurobi

julia> function create_model(I, T = 10000)
           model = Model(Gurobi.Optimizer; add_bridges = false)
           set_time_limit_sec(model, 0.001)
           @variable(model, 0 <= x[1:I, 1:T] <= 100)
           @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] <= 10)
           @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] >= -10)
           @objective(model, Min, sum(x))
           return model
       end
create_model (generic function with 2 methods)
julia> function create_model2(I, T = 10000)
           model = Model(Gurobi.Optimizer; add_bridges = false)
           set_string_names_on_creation(model, false)
           set_time_limit_sec(model, 0.001)
           @variable(model, 0 <= x[1:I, 1:T] <= 100)
           @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] <= 10)
           @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] >= -10)
           @objective(model, Min, sum(x))
           return model
       end
create_model2 (generic function with 2 methods)
julia> i = 1000;
julia> model = nothing
julia> GC.gc()
julia> @time model = create_model(i);
Set parameter TimeLimit to value 0.001
52.479629 seconds (362.28 M allocations: 24.322 GiB, 43.94% gc time, 3.96% compilation time)
julia> model = nothing
julia> GC.gc()
julia> @time model = create_model2(i);
Set parameter TimeLimit to value 0.001
33.094410 seconds (300.14 M allocations: 21.726 GiB, 25.91% gc time, 0.55% compilation time)
```

So I think we can safely close this. We can always re-open if people have evidence of other large-scale problems where this is a bottleneck.
Example from discourse: https://discourse.julialang.org/t/optimal-model-creation-and-garbage-collection/72619
91% of the time being spent in GC isn't ideal! We should find where this garbage is coming from.
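To put a number on that, `Base.@timed` reports GC time for a single call (a sketch, using the `create_model` function from the thread above):

```julia
stats = @timed create_model(1000)
gc_percent = round(100 * stats.gctime / stats.time; digits = 1)
println("total ", round(stats.time; digits = 1), " s, GC ", gc_percent, "%")
```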