Finding the minimizer to high precision #1120

Open
kbarros opened this issue Nov 14, 2024 · 4 comments

kbarros commented Nov 14, 2024

Convergence of optimize typically finds the cost function minimum to full numerical precision $\epsilon$ (about 16 digits), but the minimizer, and the gradient of the cost function, will only be converged to a much reduced precision of $\sqrt{\epsilon}$ (about 8 digits). This makes sense when considering a quadratic cost function of the form $f(x) = c x^2$ with $c > 0$: near the minimum, convergence in $x$ is much slower than convergence in $f$. Optim will by default terminate when the value of $f$ converges, even if g_tol is set to a very tight tolerance.
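
For concreteness, here is a minimal sketch of that scaling on a plain quadratic (not tied to Optim itself):

# Near the minimum of f(x) = c*x^2, an objective error of eps() (~2.2e-16)
# corresponds to an x-error of about sqrt(eps()/c) (~1.5e-8 for c = 1).
c = 1.0
dx = sqrt(eps() / c)   # ≈ 1.5e-8: the smallest change in x still visible in f
c * dx^2               # ≈ 2.2e-16, i.e. at the eps() noise floor of f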

For some applications, it's important to have the true minimizer $x$ to high precision. I discovered that this is possible by setting two fields in Options, namely f_reltol=NaN and f_abstol=NaN. With both set, the ConjugateGradient method ignores convergence of $f$ in its termination condition and then respects a very tight g_tol. These options also seem to be correctly propagated to the internal line search that conjugate gradient uses!

Is setting f_reltol=NaN, f_abstol=NaN an "officially supported" use of Optim?

If so, it seems like f_reltol and f_abstol should both be mentioned in the documentation (finding them currently requires reading the source code). Note that setting f_tol=NaN alone does not seem to work. It would also be great to have test cases covering this usage.
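
Concretely, the pattern I have in mind looks roughly like this (a minimal sketch on a toy quadratic, not the exact code from our application):

import Optim

# Toy quadratic with known minimizer (1, -3)
f(x) = (x[1] - 1)^2 + 2 * (x[2] + 3)^2
function g!(g, x)
    g[1] = 2 * (x[1] - 1)
    g[2] = 4 * (x[2] + 3)
end

# Disable both f-based convergence checks so that the tight g_tol controls termination
options = Optim.Options(g_tol=1e-12, f_reltol=NaN, f_abstol=NaN)
result = Optim.optimize(f, g!, zeros(2), Optim.ConjugateGradient(), options)
Optim.minimizer(result)  # should now agree with (1, -3) to much better than sqrt(eps())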


pkofod commented Nov 14, 2024

Yes, it should be mentioned. The whole section on options and tolerances needs to be updated; see also #1102. Currently, Optim's defaults for the f_tols are 0, and if you only set one of them to NaN, the other will still "converge" (or rather, be satisfied). In your case it seems like you have run out of precision in the objective, but you can still take steps that make the gradient elements smaller (there is a short Optim sketch of this after the example below).

So yes, you can set the two tolerances to NaN or to negative values (I believe). Do you need this level of precision? I am curious, because if you do, have you considered using another number type? In my other package, NLSolvers.jl, I support DoubleFloats:

julia> using NLSolvers

julia> using DoubleFloats

julia> function fdouble(x)
               fx = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
               return fx
           end
fdouble (generic function with 1 method)



julia>     function fgdouble(G, x)
               fx = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

               G1 = -2 * (1 - x[1]) - 400 * (x[2] - x[1]^2) * x[1]
               G2 = 200 * (x[2] - x[1]^2)
               G = [G1, G2]
               return fx, G
           end
fgdouble (generic function with 1 method)

julia>     f_obj = OptimizationProblem(
               ScalarObjective(
                   f=fdouble,
                   fg=fgdouble,
               ),
           )
OptimizationProblem{ScalarObjective{typeof(fdouble), Nothing, typeof(fgdouble), Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing, NLSolvers.Euclidean{Tuple{0}}, Nothing, NLSolvers.InPlace, Nothing}(ScalarObjective{typeof(fdouble), Nothing, typeof(fgdouble), Nothing, Nothing, Nothing, Nothing, Nothing}(fdouble, nothing, fgdouble, nothing, nothing, nothing, nothing, nothing), nothing, NLSolvers.Euclidean{Tuple{0}}(), nothing, NLSolvers.InPlace(), nothing)

julia> res = solve(
                   f_obj,
                   Double64.([1, 2]),
                   LineSearch(GradientDescent(Inverse())),
                   OptimizationOptions(; g_abstol = 1e-32, maxiter = 100000),
               )
Results of minimization

* Algorithm:
  Gradient Descent with backtracking (no interp)

* Candidate solution:
  Final objective value:    6.94e-65
  Final gradient norm:      9.91e-33

  Initial objective value:  1.00e+02
  Initial gradient norm:    4.00e+02

* Stopping criteria
  |x - x'|              = 2.09e-35 <= 0.00e+00 (false)
  |x - x'|/|x|          = 1.48e-35 <= 0.00e+00 (false)
  |f(x) - f(x')|        = 1.11e-67 <= -Inf (false)
  |f(x) - f(x')|/|f(x)| = 1.60e-03 <= -Inf (false)
  |g(x)|                = 9.91e-33 <= 1.00e-32 (true)
  |g(x)|/|g(x₀)|        = 2.48e-35 <= 0.00e+00 (false)

* Work counters
  Seconds run:   4.82e-01
  Iterations:    88677


julia> res = solve(
                   f_obj,
                   Double64.([1, 2]),
                   LineSearch(BFGS(Inverse())),
                   OptimizationOptions(; g_abstol = 1e-32, maxiter = 100000),
               )
Results of minimization

* Algorithm:
  Inverse BFGS with backtracking (no interp)

* Candidate solution:
  Final objective value:    1.47e-70
  Final gradient norm:      1.47e-34

  Initial objective value:  1.00e+02
  Initial gradient norm:    4.00e+02

* Stopping criteria
  |x - x'|              = 1.98e-30 <= 0.00e+00 (false)
  |x - x'|/|x|          = 1.40e-30 <= 0.00e+00 (false)
  |f(x) - f(x')|        = 9.86e-60 <= -Inf (false)
  |f(x) - f(x')|/|f(x)| = 1.00e+00 <= -Inf (false)
  |g(x)|                = 1.47e-34 <= 1.00e-32 (true)
  |g(x)|/|g(x₀)|        = 3.67e-37 <= 0.00e+00 (false)

* Work counters
  Seconds run:   1.19e-04
  Iterations:    20
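
Back in Optim, a rough sketch of the point above about the f_tol defaults (using the f_reltol/f_abstol keywords discussed in this thread):

import Optim

rosenbrock(x) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
function g_rb!(g, x)
    g[1] = -2 * (1 - x[1]) - 400 * (x[2] - x[1]^2) * x[1]
    g[2] = 200 * (x[2] - x[1]^2)
end

# Only one f-tolerance disabled: the other (still at its default of 0) can be
# satisfied once the objective hits the Float64 noise floor, so the tight g_tol
# may never come into play.
opts_one = Optim.Options(g_tol=1e-12, f_abstol=NaN)
res_one = Optim.optimize(rosenbrock, g_rb!, zeros(2), Optim.ConjugateGradient(), opts_one)

# Both f-tolerances disabled: termination is driven by g_tol.
opts_both = Optim.Options(g_tol=1e-12, f_reltol=NaN, f_abstol=NaN)
res_both = Optim.optimize(rosenbrock, g_rb!, zeros(2), Optim.ConjugateGradient(), opts_both)

(Optim.g_residual(res_one), Optim.g_residual(res_both))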


kbarros commented Nov 15, 2024

Hi @pkofod, thanks for your reply. It's great to know that setting f_reltol=NaN, f_abstol=NaN together can be officially supported. I've incorporated this approach into our package Sunny, and it works great:
SunnySuite/Sunny.jl#333

Do you need this level of precision? I am curious, because if you do, have you considered using another number type?

Our application is quantum magnetism. The physical observables (excitation energies associated with scattering of neutron or x-ray beams) can depend somewhat sensitively on the accuracy of the minimizer "x" (the magnetic configuration), not just on the accuracy of the "f" function (the ground-state energy). Now we can easily get "x" to 12 digits of accuracy, which is enough. We're also speed sensitive, so it's nice to be able to use Float64.


pkofod commented Nov 15, 2024

Great. The actionable item here is to improve the documentation, then.


kbarros commented Nov 15, 2024

Yes, correct.

Also, could you please consider including a test like this to make sure that this feature continues to work in future releases?

import Optim
import Test: @test

rosenbrock(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

# Need high-precision gradient to impose tight tolerances below
function g_rosenbrock!(g, x)
    g[1] = 2 * (1.0 - x[1]) * (-1) + 2 * 100 * (x[2] - x[1]^2) * (-2 * x[1])
    g[2] = 2 * 100 * (x[2] - x[1]^2)
end

# To set tight tolerance on gradient g, need to disable any check on f
options = Optim.Options(g_tol=1e-10, f_reltol=NaN, f_abstol=NaN)
result = Optim.optimize(rosenbrock, g_rosenbrock!, zeros(2), Optim.ConjugateGradient(), options)
@test Optim.g_residual(result) < 1e-10

# To set tight tolerance on x, need to also disable default gradient tolerance, g_tol=1e-8
options = Optim.Options(x_tol=1e-10, g_tol=NaN, f_reltol=NaN, f_abstol=NaN)
result = Optim.optimize(rosenbrock, g_rosenbrock!, zeros(2), Optim.ConjugateGradient(), options)
@test Optim.x_abschange(result) < 1e-10
