Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stack overflow error when find the value of deeply nested nonlinear expressions #3884

Closed
mitchphillipson opened this issue Nov 15, 2024 · 6 comments · Fixed by #3889
Closed
Labels
Category: Nonlinear Related to nonlinear programming

Comments

@mitchphillipson
Copy link

As the title says, finding the value of a deeply nested nonlinear expression causes a stack overflow error. I've minimized the value of N on my machine.

MWE:

using JuMP
M = Model()
N = 8_000
@variable(M, x[1:N], start = 0)

# This line will fail
value(start_value,prod(sum(x[1:i]) for i in 1:N))

# Works as expected
value(start_value, @expression(M, prod(sum(x[1:i]) for i in 1:N)))

This is not urgent on my end, I moved the value inside the product. I am also refactoring my code to not have this issue.

@odow odow added Type: Bug Category: Nonlinear Related to nonlinear programming labels Nov 15, 2024
@odow
Copy link
Member

odow commented Nov 15, 2024

Issue is the use of recursion in

JuMP.jl/src/nlp_expr.jl

Lines 694 to 722 in 7111683

function _evaluate_expr(
registry::MOI.Nonlinear.OperatorRegistry,
f::Function,
expr::GenericNonlinearExpr,
)
op = expr.head
# TODO(odow): uses private function
if !MOI.Nonlinear._is_registered(registry, op, length(expr.args))
return _evaluate_user_defined_function(registry, f, expr)
end
if length(expr.args) == 1 && haskey(registry.univariate_operator_to_id, op)
arg = _evaluate_expr(registry, f, expr.args[1])
return MOI.Nonlinear.eval_univariate_function(registry, op, arg)
elseif haskey(registry.multivariate_operator_to_id, op)
args = [_evaluate_expr(registry, f, arg) for arg in expr.args]
return MOI.Nonlinear.eval_multivariate_function(registry, op, args)
elseif haskey(registry.logic_operator_to_id, op)
@assert length(expr.args) == 2
x = _evaluate_expr(registry, f, expr.args[1])
y = _evaluate_expr(registry, f, expr.args[2])
return MOI.Nonlinear.eval_logic_function(registry, op, x, y)
else
@assert haskey(registry.comparison_operator_to_id, op)
@assert length(expr.args) == 2
x = _evaluate_expr(registry, f, expr.args[1])
y = _evaluate_expr(registry, f, expr.args[2])
return MOI.Nonlinear.eval_comparison_function(registry, op, x, y)
end
end

@odow
Copy link
Member

odow commented Nov 17, 2024

So, taking another look, it's true that we use recursion in value. But the bigger issue is that you didn't use @expression to build the expression.

If you don't want to use @expression, call flatten! to lift the deeply nested expressions.

using JuMP
N = 8_000
model = Model()
@variable(model, x[1:N], start = 0)
f = prod(sum(x[1:i]) for i in 1:N)
flatten!(f)
value(start_value, f)

This is such a rare case that can be easily worked around that I don't know if it's worth complicating the implementation of _evaluate_expr for.

@odow odow removed the Type: Bug label Nov 17, 2024
@odow
Copy link
Member

odow commented Nov 17, 2024

The other option is of course:

@variable(M, x[1:N], start = 0)
y = value.(start_value, x)
prod(sum(y[1:i]) for i in 1:N)

@mitchphillipson
Copy link
Author

mitchphillipson commented Nov 17, 2024 via email

@odow
Copy link
Member

odow commented Nov 18, 2024

I don't even know if it is a documentation issue.

Let's work through what this line is doing:

value(start_value,prod(sum(x[1:i]) for i in 1:N))

prod(sum(x[1:i]) for i in 1:N) is creating a NonlinearExpr of the product of 8,000 terms, where each term an AffExpr with up to 8,000 terms. Except that prod does not create the n-ary *(args...), but instead works by folding the binary *: prod(x) = foldl(*, x). So we create *(sum(x[1:1]), *(sum(x[1:2], *( ... ))). So we're actually creating something like 7,999 NonlinearExpr and a very unbalanced binary tree.

We then call value(start_value, ...), which is going to recursively evaluate the expression, and will end up calling start_value N - i + 1 times for each x[i] because we don't cache values. Even if we fixed the recession issue, this is a lot of operations!

The alternative is to compute your = start_value.(x), which queries start_value once for each variable and returns a Vector{Float64}. Then your prod has a loop that computes sum(x[1:i]) 8,000 times, and since slices copy, you are creating 8,000 new vectors. So you could be even more efficient with prod(sum(@views(y[1:i])) for i in 1:N). But the output of sum is a Float64, so your prod is then the product of 8,000 Float64 which is okay.

Also how sparse is the start value of x? Can you even compute a product of that many elements to the required precision?

@odow
Copy link
Member

odow commented Nov 22, 2024

I'm tempted to close this as won't fix. I don't know if there is an easy fix for this.

Will be fixed by #3889

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Category: Nonlinear Related to nonlinear programming
Development

Successfully merging a pull request may close this issue.

2 participants