Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEAT: Add the Nelder-Mead algorithm #441

Merged
merged 3 commits into from
Nov 14, 2018

Conversation

QBatista
Copy link
Member

@QBatista QBatista commented Oct 16, 2018

Adds a jitted implementation of the Nelder-Mead algorithm for multivariate optimization.

The two main references are Lagarias et. al (1998) for the algorithm and Singer and Singer (2004) for efficiency.

It might be a good idea to have a common interface for scalar_maximization.py and this multivariate version. Please let me know if you have any suggestions for setting up this interface and choosing
names.

Performance

import quantecon as qe
from numba import njit
import numpy as np

@njit
def rosenbrock(x):
    return -(100 * (x[1] - x[0] ** 2) ** 2 + (1 - x[0])**2)

x0 = np.array([-2, 1]) 
qe.optimize.maximize(rosenbrock, x0, tol_x=1e-20)
%timeit qe.optimize.maximize(rosenbrock, x0, tol_x=1e-20)
106 µs ± 6.83 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
from scipy.optimize import minimize

%timeit minimize(rosenbrock, x0, method='Nelder-Mead')
6.23 ms ± 308 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Close #419

@natashawatkins
Copy link
Member

I agree with having a common interface which can be used for either scalar or multivariate

Maybe we could do something similar to scipy and have both a common interface to the function and individual functions, ie. nelder_mead and brentq, and a function maximize with a method argument

@natashawatkins
Copy link
Member

But if we don't decide for go for a common interface for now, I think we should call this function something more like nelder_mead to differentiate from brent_max

- Jitting is commented out for generating coverage report
@coveralls
Copy link

coveralls commented Oct 17, 2018

Coverage Status

Coverage increased (+0.1%) to 94.571% when pulling 09016f9 on QBatista:multivariate_optim into 3bb4d7d on QuantEcon:master.

@natashawatkins
Copy link
Member

I've tried adding this to the lecture but don't seem to get convergence.. here is a look at the output from this function (stored in solver) vs. the 'brute-force' method we're currently using

screen shot 2018-10-19 at 11 48 23 am

Here is the piece of code just in case you spot something wrong with it:

image

Also why is the bounds parameter set to have a shape of two by default? Is this needed?

Thanks

@QBatista
Copy link
Member Author

The bounds parameter is set this way to get Numba to work without using generated_jit. Would it be possible for you to upload the notebook so that I can have a look at it?

@natashawatkins
Copy link
Member

Is that an issue for maximising with more than 2 arguments? Should we have something that automatically detects the size of params and creates the bounds?

@natashawatkins
Copy link
Member

@QBatista
Copy link
Member Author

It looks like the problem is coming from the algorithm. Both my version and Scipy often seem to cycle. I can sometimes get convergence by tweaking the parameters depending on the JVWorker instance, but this occurs quite rarely. Most other methods in Scipy seem to fail, but TNC sometimes work. However, this also doesn't seem to be very robust -- changing the initial guess causes failure.

@natashawatkins
Copy link
Member

So in the live lecture at the moment, it's using the COBYLA method - do you think we should implement this one in the QuantEcon library?

@QBatista
Copy link
Member Author

Actually, after checking the live lecture, the problem seems to be that the constraints are missing. By passing bounds = np.array([[0., 1.], [0., 1.]]) as in the live lecture, it seems to be converging consistently. The graph is slightly different though. I tried a naive way of encoding s + ϕ <= 1 by changing the objective function to return -np.inf if the constraint is not satisfied, but the algorithm seems to eventually get stuck.

@natashawatkins
Copy link
Member

Is there a way we can add a constraints argument to the function? Also are you finding it quite slow? It seems to be a lot slower than the 'brute force' method

@chrishyland
Copy link

Hey @natashawatkins ,
Just thought to jump in but in a previous iteration I had of going about implementing the bounded constraints just involved shifting a point to its closest boundary if a point is outside the feasible region. This is touched on in here if you scroll towards the end.

@QBatista
Copy link
Member Author

@chrishyland Thanks for commenting on this. From your notebook, I see how you've implemented box constraints such as a<=x_n<=b by setting x_n equal to the closest boundary when the constraint is violated. However, I was reluctant to use this technique because it modifies the algorithm, and instead, I set the value returned by the function to be maximized to be -np.inf when the constraint is violated. However, this didn't seem to work for linear inequality constraints of the form x_1 + ... + x_n <= b. Do you have a better idea for implementing such constraints?

@chrishyland
Copy link

Hi @QBatista ,
I believe that in the other link that the box constraints implementation implementation would be sufficient for most problems and I recall seeing an implementation somewhere (will have to look again) in a paper that uses something like this. I’ve struggled as well to find a linear inequality constraint solution that you mentioned as it appears that the Nelder-Mead algorithm is an unconstrained optimisation problem, so I’m not sure if adding such constraints is possible unless you alter the algorithm? However, I recall that @jstac saying that just having box constraints is sufficient for now for Quantecon’s needs. Sorry I couldn’t be of more help but I’d be happy to discuss things with you if you had any other ideas on what to do.

@QBatista
Copy link
Member Author

QBatista commented Nov 4, 2018

@jstac @natashawatkins Having a common interface for the Nelder-Mead algorithm and brent_max would require choosing a common set of required arguments. However, the Nelder-Mead algorithm requires an initial guess of the solution x0 while brent_max doesn't.

One option would be to follow Scipy, which uses two different interfaces, one dedicated to function minimization with only one variable (called minimize_scalar) and a more general one which handles functions with both one or more variables (called minimize). The algorithms behind both interfaces are different: brent is used in minimize_scalar, while Nelder-Mead is used in minimize.

Another option would be to make the initial guess argument optional and raise an error message if the method used requires an initial guess. Scipy uses a similar idea for its L-BFGS-B method in the minimize interface, where the Jacobian of the function is a keyword argument, but an error message is raised if the Jacobian is not passed because this particular method requires it.

In terms of maintenance, the first option would have a more straightforward implementation, especially since we want the interface to be jitted, and therefore should be easier to maintain.

@oyamad Please let me know if you have any suggestions for designing this interface.

@jstac
Copy link
Contributor

jstac commented Nov 4, 2018

Thanks @QBatista

In option one it's not clear from your discussion how the initial guess issue is resolved. The distinction in SciPy is between multivariate and univariate routines, is it not?

Can you be a bit more specific about the proposed interface?

@QBatista
Copy link
Member Author

QBatista commented Nov 4, 2018

@jstac In Scipy, the methods that do not require an initial guess are methods which can be exclusively applied to functions with only one variable, and therefore, they all fall in scalar_minimization. Meanwhile, the more general methods that can be used for functions with one or more variables all require an initial guess, and are accessed through minimize. Therefore, the minimize interface requires an initial guess to be passed, while minimize_scalar has no option for passing an initial guess. This means that a method which requires an initial guess but can be applied exclusively to functions with only one variable would not fit into any of these two interfaces.

@jstac
Copy link
Contributor

jstac commented Nov 4, 2018

Thanks for the explanation. I guess something analogous would be OK for us.

@natashawatkins
Copy link
Member

Should we just stick to having separate functions for different algorithms? What was the issue regarding generated jit again?

@QBatista
Copy link
Member Author

QBatista commented Nov 6, 2018

That's another option -- as long as we don't have too many methods, we could simply have a set of maximize_"method_name" functions.

@natashawatkins generated_jit makes it possible to write a function that has different implementations depending on the input type. This is relevant here because the set of methods that we can use depends on the type of input function. For example, we cannot use brent_max for functions with more than one variables, so if we pass this type of function, the interface should not allow to call this method. However, using this functionality makes the interface more complex, and therefore, it is more costly to maintain. The issue which I mention above is different though -- the problem is that there is no natural set of common required arguments because some methods require an initial guess while others don't.

@natashawatkins
Copy link
Member

I don't like the idea of having maximize and maximize_scalar because I don't think it's very explicit that there are different algorithms available...

I think we could just call it qe.maximize.brentq etc.

Doesn't this get rid of the generated_jit issue as well?

@QBatista
Copy link
Member Author

QBatista commented Nov 6, 2018

Yes, that's fine too. Just one thing to note: we currently also have root finding methods in qe.optimize and these would not fit in qe.maximize. For example, brentq is a root finding method while brent is a maximization method. SciPy puts both root finding methods and optimization methods in optimize, so we could either put all of them in qe.optimize along with maximization methods, or have both qe.maximize and qe.root_finding (or something similar).

@natashawatkins
Copy link
Member

sorry I meant qe.optimize.brentq

@jstac
Copy link
Contributor

jstac commented Nov 6, 2018

I'm in favor of @natashawatkins's suggestion.

@QBatista
Copy link
Member Author

QBatista commented Nov 6, 2018

Sounds good -- I'll make the changes soon.

@QBatista QBatista changed the title [WIP] FEAT: Add the Nelder-Mead algorithm [MRG] FEAT: Add the Nelder-Mead algorithm Nov 7, 2018
@QBatista
Copy link
Member Author

QBatista commented Nov 7, 2018

This is now ready for review.

@jstac
Copy link
Contributor

jstac commented Nov 7, 2018

Cleverly written, well documented, excellent tests. Great job @QBatista.

@mmcky
Copy link
Contributor

mmcky commented Nov 11, 2018

thanks @QBatista this is looking good. I will attach a ready tag and merge tomorrow.

@mmcky mmcky added the ready label Nov 11, 2018
@mmcky
Copy link
Contributor

mmcky commented Nov 11, 2018

Once this is merged -- I will issue a new release via PyPI

@mmcky mmcky changed the title [MRG] FEAT: Add the Nelder-Mead algorithm FEAT: Add the Nelder-Mead algorithm Nov 14, 2018
@mmcky mmcky merged commit 01ee68d into QuantEcon:master Nov 14, 2018
@natashawatkins
Copy link
Member

@mmcky are you releasing a new version of the library? I might use this in a lecture

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants