The nonlinear constrained optimization interface in Optim
assumes that the user can write the optimization problem in the following way.
\[\min_{x\in\mathbb{R}^n} f(x) \quad \text{such that}\\
l_x \leq \phantom{c(}x\phantom{)} \leq u_x \\
l_c \leq c(x) \leq u_c.\]
For an equality constraint on $x_j$ or $c(x)_j$, set the corresponding entries of the bounds equal, $l_j=u_j$. Likewise, setting $l_j=-\infty$ or $u_j=\infty$ leaves the constraint unbounded from below or above, respectively.
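For illustration, here is a hypothetical set of bounds (these particular values are not used in the examples below) that leaves $x$ free, fixes $c(x)_1$ to $1$, and bounds $c(x)_2$ from above only.
lx = [-Inf, -Inf]; ux = [Inf, Inf]  # x is unconstrained
lc = [1.0, -Inf]; uc = [1.0, 2.0]   # c(x)_1 = 1 (equality), c(x)_2 ≤ 2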
We will go through examples on how to use the constraints interface with the interior-point Newton optimization algorithm IPNewton.
Throughout these examples we work with the standard Rosenbrock function. The objective and its derivatives are given by
fun(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

function fun_grad!(g, x)
    g[1] = -2.0 * (1.0 - x[1]) - 400.0 * (x[2] - x[1]^2) * x[1]
    g[2] = 200.0 * (x[2] - x[1]^2)
end

function fun_hess!(h, x)
    h[1, 1] = 2.0 - 400.0 * x[2] + 1200.0 * x[1]^2
    h[1, 2] = -400.0 * x[1]
    h[2, 1] = -400.0 * x[1]
    h[2, 2] = 200.0
end;
To solve a constrained optimization problem we call the optimize method
optimize(d::AbstractObjective, constraints::AbstractConstraints, initial_x::Tx, method::ConstrainedOptimizer, options::Options)
We can create instances of AbstractObjective and AbstractConstraints using the types TwiceDifferentiable and TwiceDifferentiableConstraints from the package NLSolversBase.jl.
We want to optimize the Rosenbrock function in the box $-0.5 \leq x \leq 0.5$, starting from the point $x_0=(0,0)$. Box constraints are defined using, for example, TwiceDifferentiableConstraints(lx, ux).
x0 = [0.0, 0.0]
df = TwiceDifferentiable(fun, fun_grad!, fun_hess!, x0)

lx = [-0.5, -0.5]; ux = [0.5, 0.5]
dfc = TwiceDifferentiableConstraints(lx, ux)

res = optimize(df, dfc, x0, IPNewton())
 * Status: success

 * Candidate solution
    Final objective value:     2.500000e-01

 * Found with
    Algorithm:     Interior Point Newton

 * Convergence measures
    |x - x'|               = 4.39e-10 ≰ 0.0e+00
    |x - x'|/|x'|          = 8.79e-10 ≰ 0.0e+00
    |f(x) - f(x')|         = 0.00e+00 ≤ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 0.00e+00 ≤ 0.0e+00
    |g(x)|                 = 1.00e+00 ≰ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    43
    f(x) calls:    68
    ∇f(x) calls:   68
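The usual Optim accessors apply to the returned result object; for instance (values read off the run above):
x_opt = Optim.minimizer(res)   # ≈ [0.5, 0.25]; the box constraint is active in the first coordinate
f_opt = Optim.minimum(res)     # ≈ 0.25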
Like the rest of Optim, you can also use autodiff=:forward and just pass in fun.
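A minimal sketch of that variant (df_ad and res_ad are our names), reusing the box constraints from above:
df_ad = TwiceDifferentiable(fun, x0; autodiff = :forward)
res_ad = optimize(df_ad, dfc, x0, IPNewton())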
If we only want to set lower bounds, use ux = fill(Inf, 2).
ux = fill(Inf, 2)
dfc = TwiceDifferentiableConstraints(lx, ux)

clear!(df)
res = optimize(df, dfc, x0, IPNewton())
 * Status: success (objective increased between iterations)

 * Candidate solution
    Final objective value:     7.987239e-20

 * Found with
    Algorithm:     Interior Point Newton

 * Convergence measures
    |x - x'|               = 3.54e-10 ≰ 0.0e+00
    |x - x'|/|x'|          = 3.54e-10 ≰ 0.0e+00
    |f(x) - f(x')|         = 2.40e-19 ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 3.00e+00 ≰ 0.0e+00
    |g(x)|                 = 8.83e-09 ≤ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    35
    f(x) calls:    63
    ∇f(x) calls:   63
An unconstrained problem can be defined either by passing Inf bounds or empty arrays. Note that the empty lx and ux must still carry the correct element type, hence the Float64[] below.
lx = fill(-Inf, 2); ux = fill(Inf, 2)
dfc = TwiceDifferentiableConstraints(lx, ux)

clear!(df)
res = optimize(df, dfc, x0, IPNewton())

lx = Float64[]; ux = Float64[]
dfc = TwiceDifferentiableConstraints(lx, ux)

clear!(df)
res = optimize(df, dfc, x0, IPNewton())
 * Status: success

 * Candidate solution
    Final objective value:     5.998937e-19

 * Found with
    Algorithm:     Interior Point Newton

 * Convergence measures
    |x - x'|               = 1.50e-09 ≰ 0.0e+00
    |x - x'|/|x'|          = 1.50e-09 ≰ 0.0e+00
    |f(x) - f(x')|         = 1.80e-18 ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 3.00e+00 ≰ 0.0e+00
    |g(x)|                 = 7.92e-09 ≤ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    34
    f(x) calls:    63
    ∇f(x) calls:   63
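With Inf or empty bounds no constraint is ever active, and both runs above converge to the unconstrained minimizer $(1,1)$ of the Rosenbrock function. As a sketch of a cross-check, the same objective can be handed to the unconstrained Newton method:
clear!(df)
res_newton = optimize(df, x0, Newton())   # should also converge to [1.0, 1.0]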
We now consider the Rosenbrock problem with a constraint on
\[ c(x)_1 = x_1^2 + x_2^2.\]
We pass the information about the constraints to optimize by defining a vector function c(x) and its Jacobian J(x).
The Hessian information is treated differently, by considering the Lagrangian of the corresponding slack-variable transformed optimization problem. This is similar to how the CUTEst library works. Let $H_j(x)$ represent the Hessian of the $j$th component $c(x)_j$ of the generic constraints, and $\lambda_j$ the corresponding dual variable in the Lagrangian. Then we want the constraint object to add the values of $H_j(x)$ to the Hessian of the objective, weighted by $\lambda_j$.
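In symbols, the Hessian assembled by the solver is that of the Lagrangian with respect to $x$,
\[ \nabla^2 f(x) + \sum_j \lambda_j H_j(x), \]
which is why the constraint Hessian function below updates its argument in place with +=.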
In Julia, the supplied constraint function $c(x)$ and its derivative information are written in the following way.
con_c!(c, x) = (c[1] = x[1]^2 + x[2]^2; c)
function con_jacobian!(J, x)
    J[1,1] = 2*x[1]
    J[1,2] = 2*x[2]
    J
end
function con_h!(h, x, λ)
    h[1,1] += λ[1]*2
    h[2,2] += λ[1]*2
end;
Note that con_h! adds the λ-weighted Hessian of each element of c(x) to the Hessian of fun.
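Hand-written derivatives are easy to get wrong; a quick sanity check against automatic differentiation (a sketch, assuming the ForwardDiff package is available; xtest is our name) is
using ForwardDiff
xtest = [0.3, 0.4]
J = zeros(1, 2); con_jacobian!(J, xtest)
J ≈ ForwardDiff.jacobian(con_c!, zeros(1), xtest)   # should print true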
We can then optimize the Rosenbrock function inside the ball of radius $0.5$.
lx = Float64[]; ux = Float64[]
lc = [-Inf]; uc = [0.5^2]
dfc = TwiceDifferentiableConstraints(con_c!, con_jacobian!, con_h!,
                                     lx, ux, lc, uc)
res = optimize(df, dfc, x0, IPNewton())
 * Status: success

 * Candidate solution
    Final objective value:     2.966216e-01

 * Found with
    Algorithm:     Interior Point Newton

 * Convergence measures
    |x - x'|               = 0.00e+00 ≤ 0.0e+00
    |x - x'|/|x'|          = 0.00e+00 ≤ 0.0e+00
    |f(x) - f(x')|         = 0.00e+00 ≤ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 0.00e+00 ≤ 0.0e+00
    |g(x)|                 = 7.71e-01 ≰ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    28
    f(x) calls:    109
    ∇f(x) calls:   109
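Since the unconstrained minimizer $(1,1)$ lies outside the ball, the constraint must be active at the solution. A quick check (a sketch using LinearAlgebra and the Optim accessors):
using LinearAlgebra
x_opt = Optim.minimizer(res)
isapprox(norm(x_opt), 0.5; atol = 1e-6)   # the solution sits on the boundary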
We can add a lower bound on the constraint, and thus optimize the objective on the annulus with inner and outer radii $0.1$ and $0.5$, respectively. Since the solution of the ball-constrained problem already lies on the outer boundary, the inner bound is inactive and the minimizer is unchanged.
lc = [0.1^2]
dfc = TwiceDifferentiableConstraints(con_c!, con_jacobian!, con_h!,
                                     lx, ux, lc, uc)
res = optimize(df, dfc, x0, IPNewton())
 * Status: success

 * Candidate solution
    Final objective value:     2.966216e-01

 * Found with
    Algorithm:     Interior Point Newton

 * Convergence measures
    |x - x'|               = 0.00e+00 ≤ 0.0e+00
    |x - x'|/|x'|          = 0.00e+00 ≤ 0.0e+00
    |f(x) - f(x')|         = 0.00e+00 ≤ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 0.00e+00 ≤ 0.0e+00
    |g(x)|                 = 7.71e-01 ≰ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    34
    f(x) calls:    158
    ∇f(x) calls:   158
Note that the algorithm warns that the initial guess is not an interior point: $c(x_0) = 0$ lies below the new lower bound $l_c$. IPNewton can often handle this; however, if the initial guess is such that c(x) = u_c, then the algorithm currently fails. We may fix this in the future.
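To see why the warning fires here, one can test strict feasibility of the starting point directly (a sketch; is_interior is our name):
c0 = zeros(1); con_c!(c0, x0)
is_interior = all(lc .< c0 .< uc)   # false: c(x0) = 0 lies below lc = 0.01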
The following example illustrates how to add an additional constraint. In particular, we add a constraint function
\[ c(x)_2 = x_2\sin(x_1)-x_1.\]
function con2_c!(c, x)
    c[1] = x[1]^2 + x[2]^2          # First constraint
    c[2] = x[2]*sin(x[1]) - x[1]    # Second constraint
    c
end
function con2_jacobian!(J, x)
    # First constraint
    J[1,1] = 2*x[1]
    J[1,2] = 2*x[2]
    # Second constraint
    J[2,1] = x[2]*cos(x[1]) - 1.0
    J[2,2] = sin(x[1])
    J
end
function con2_h!(h, x, λ)
    # First constraint
    h[1,1] += λ[1]*2
    h[2,2] += λ[1]*2
    # Second constraint
    h[1,1] += -λ[2]*x[2]*sin(x[1])
    h[1,2] += λ[2]*cos(x[1])
    # Symmetrize h
    h[2,1] = h[1,2]
    h
end;
We generate the constraint objects and call IPNewton with initial guess $x_0 = (0.25,0.25)$. Note that setting lc[2] = uc[2] = 0.0 below encodes the equality constraint $c(x)_2 = 0$.
x0 = [0.25, 0.25]
lc = [-Inf, 0.0]; uc = [0.5^2, 0.0]
dfc = TwiceDifferentiableConstraints(con2_c!, con2_jacobian!, con2_h!,
                                     lx, ux, lc, uc)
res = optimize(df, dfc, x0, IPNewton())
 * Status: success

 * Candidate solution
    Final objective value:     1.000000e+00

 * Found with
    Algorithm:     Interior Point Newton

 * Convergence measures
    |x - x'|               = 6.90e-10 ≰ 0.0e+00
    |x - x'|/|x'|          = 3.55e+08 ≰ 0.0e+00
    |f(x) - f(x')|         = 1.38e-09 ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 1.38e-09 ≰ 0.0e+00
    |g(x)|                 = 2.00e+00 ≰ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    29
    f(x) calls:    215
    ∇f(x) calls:   215
Below follows a version of the program without the intermediate explanations. The file is also available here: ipnewton_basics.jl
using Optim, NLSolversBase #hide
import NLSolversBase: clear! #hide

fun(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

function fun_grad!(g, x)
    g[1] = -2.0 * (1.0 - x[1]) - 400.0 * (x[2] - x[1]^2) * x[1]
    g[2] = 200.0 * (x[2] - x[1]^2)
end

function fun_hess!(h, x)
    h[1, 1] = 2.0 - 400.0 * x[2] + 1200.0 * x[1]^2
    h[1, 2] = -400.0 * x[1]
    h[2, 1] = -400.0 * x[1]
    h[2, 2] = 200.0
end;

x0 = [0.0, 0.0]
df = TwiceDifferentiable(fun, fun_grad!, fun_hess!, x0)

lx = [-0.5, -0.5]; ux = [0.5, 0.5]
dfc = TwiceDifferentiableConstraints(lx, ux)

res = optimize(df, dfc, x0, IPNewton())

ux = fill(Inf, 2)
dfc = TwiceDifferentiableConstraints(lx, ux)

clear!(df)
res = optimize(df, dfc, x0, IPNewton())

lx = fill(-Inf, 2); ux = fill(Inf, 2)
dfc = TwiceDifferentiableConstraints(lx, ux)

clear!(df)
res = optimize(df, dfc, x0, IPNewton())

lx = Float64[]; ux = Float64[]
dfc = TwiceDifferentiableConstraints(lx, ux)

clear!(df)
res = optimize(df, dfc, x0, IPNewton())

con_c!(c, x) = (c[1] = x[1]^2 + x[2]^2; c)
function con_jacobian!(J, x)
    J[1,1] = 2*x[1]
    J[1,2] = 2*x[2]
    J
end
function con_h!(h, x, λ)
    h[1,1] += λ[1]*2
    h[2,2] += λ[1]*2
end;

lx = Float64[]; ux = Float64[]
lc = [-Inf]; uc = [0.5^2]
dfc = TwiceDifferentiableConstraints(con_c!, con_jacobian!, con_h!,
                                     lx, ux, lc, uc)
res = optimize(df, dfc, x0, IPNewton())

lc = [0.1^2]
dfc = TwiceDifferentiableConstraints(con_c!, con_jacobian!, con_h!,
                                     lx, ux, lc, uc)
res = optimize(df, dfc, x0, IPNewton())

function con2_c!(c, x)
    c[1] = x[1]^2 + x[2]^2          # First constraint
    c[2] = x[2]*sin(x[1]) - x[1]    # Second constraint
    c
end
function con2_jacobian!(J, x)
    # First constraint
    J[1,1] = 2*x[1]
    J[1,2] = 2*x[2]
    # Second constraint
    J[2,1] = x[2]*cos(x[1]) - 1.0
    J[2,2] = sin(x[1])
    J
end
function con2_h!(h, x, λ)
    # First constraint
    h[1,1] += λ[1]*2
    h[2,2] += λ[1]*2
    # Second constraint
    h[1,1] += -λ[2]*x[2]*sin(x[1])
    h[1,2] += λ[2]*cos(x[1])
    # Symmetrize h
    h[2,1] = h[1,2]
    h
end;

x0 = [0.25, 0.25]
lc = [-Inf, 0.0]; uc = [0.5^2, 0.0]
dfc = TwiceDifferentiableConstraints(con2_c!, con2_jacobian!, con2_h!,
                                     lx, ux, lc, uc)
res = optimize(df, dfc, x0, IPNewton())

# This file was generated using Literate.jl, https://github.com/fredrikekre/Literate.jl
This page was generated using Literate.jl.