Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pwqbounds not being respected #135

Closed
fergu opened this issue Apr 17, 2015 · 17 comments
Closed

pwqbounds not being respected #135

fergu opened this issue Apr 17, 2015 · 17 comments

Comments

@fergu
Copy link

fergu commented Apr 17, 2015

I have been working on a problem where each parameter is very different in magnitude. To improve behavior of the optimizer, I have remapped the parameters to 0.0 - 1.0 space and just expand them to their real values within the fitness function. This works well enough.

However, I am noticing that the library will ignore the bounds I impose on them to keep the parameters between 0 and 1.

I largely just followed the tutorial in the wiki page. I start with my initial guess:

    double xGuess[dim] = { 
        0.5,
        0.5,
        0.5,
        0.5,
        0.5,
        0.5,
    };

    std::vector<double> x0(xGuess,xGuess+sizeof(xGuess)/sizeof(double));

I create my lower and upper bounds as

    double lbounds[dim] = {0.0,0.0,0.0,0.0,0.0,0.0};
    double ubounds[dim] = {1.0,1.0,1.0,1.0,1.0,1.0}; // arrays for lower and upper parameter bounds, respectively

and then create the geno/pheno transform

    GenoPheno<pwqBoundStrategy> gp(lbounds,ubounds,dim); // genotype / phenotype transform associated to bounds.

and apply it to my CMAParameter object:

    CMAParameters<GenoPheno<pwqBoundStrategy>> cmaparams(dim,&x0.front(),0.3,lambda,5,gp);

then I finally call the optimization routine:

    CMASolutions cmasols = cmaes<GenoPheno<pwqBoundStrategy>>(ff,cmaparams,pfunc,nullptr,CMASolutions(),plotf);

The expansion within the fitness function is just a simple (constant) linear expansion based on real upper and lower bounds which do not change.

The optimization runs just fine, but when I check the resulting parameter distribution, I get:

best solution: best solution => f-value=957.763 / fevals=856 / sigma=0.0149694 / iter=107 / elaps=86197819ms / x=  0.340287   0.400212 -0.0641709    1.08493 -0.0562272   0.780849
============================
optimization took 86197.8 seconds
Run Status: [Success] The optimization has converged

3 of the 6 parameters have gone outside of the bounds I tried to enforce. In some cases the departure from my requested boundaries is greater, with parameters going to +/- 2 or greater.

I do not see any warnings from cmaes regarding the pwq transform (or any warnings at all) as I have seen in the past when I have misconfigured the routine.

Obviously this isn't expected behavior. Is it possible that the setup routine for the bounds changed since the wiki page has last been updated, and I am just misconfiguring here?

@beniz
Copy link
Collaborator

beniz commented Apr 17, 2015

You should never change the parameters values from within the objective function, there's the open issue #94 about not letting that happen anymore. Instead you should let the algorithm search the bounded space for you.

Instead of using your own linear transform, use the built-in automated linear scaling of your parameters, see the relevant section in documentation, here: https://github.com/beniz/libcmaes/wiki/Using-parameter-space-transforms-known-as-genotype-phenotype-transforms

Also, when printing out the solution object, be sure to pass the gp object to the print function:

cmasols.print(std::cout,false,cmaparams.get_gp());

Let me know if this fails to get you moving forward.

@fergu
Copy link
Author

fergu commented Apr 17, 2015

It may be worth noting that I am not changing the actual parameter vector, x, but rather that I am expanding each value (to a new variable) based on some known upper and lower bounds. I.E

double l[dim] = {...}
double u[dim] = {...}

double a = x[0]*(u[0]-l[0])+l[0]
double b = x[1]*(u[1]-l[1])+l[1]
// ...
// We use a,b,c..., not x[0], x[1], x[2]... going forward 

Is this still poor practice?

None the less, I have gone back to using the "real" parameter space, and used the print line that you suggested at the end of the program. I have not observed this behavior in that setup, though I am re-running the problem now that I've gone back to that setup. I really decided to go this "zero to one" route to achieve what I guess the bounding algorithm was already doing, which leaves me with a question.

For my own education, am I understanding properly that when this bounding strategy is applied that the library automatically rescales all of the variables internally such that the step size (sigma), etc, is effectively the same for all of them based on their known bounds? How does one appropriately decide sigma in this case, since it may have a very different value depending on the allowable range for a given parameter (-1 to 1 vs -10k to 10k, for example)

And, I suppose a followup question. I seem to be having a lot of questions/issues just related to my (lack of) understanding of the CMA-ES/library in general. Are there any good papers that explain the mechanics at play and the various parameters/real meanings that go into it without going too deeply into the heavy mathematics? I am reading through the "tutorial" PDF on your page, but I'm curious if there's other papers worth reading to that end.

@beniz
Copy link
Collaborator

beniz commented Apr 18, 2015

Is this still poor practice?

Since the algorithm uses and adapts the (initial) step-size sigma based on the internal state-space, I would tend to say that it is better to avoid this practice.

the library automatically rescales all of the variables internally such that the step size (sigma), etc, is effectively the same for all of them based on their known bounds?

Yes if you use the linScalingStrategy geno/pheno transform in addition to the pwqBoundStrategy. I have observed significant improvement on some problems by using this settings.

How does one appropriately decide sigma in this case

Sorry, this seems to be lacking in the doc on linear scaling. Since with the linScalingStrategy the sensibility is similar across all parameters and by default lies in [0,10], an initial value of sigma in [0.001, 0.1] may be appropriate. The 'initial value' section of https://www.lri.fr/~hansen/cmaes_inmatlab.html#practical has a few additional information.

I am reading through the "tutorial" PDF on your page, but I'm curious if there's other papers worth reading to that end.

This is Nikolaus Hansen's (@nikohansen) page, here: https://www.lri.fr/~hansen/
He is the original author of most of the algorithms in the library.

@fergu
Copy link
Author

fergu commented Apr 20, 2015

Thanks for your insight! I will read further to hopefully get my theoretical understanding of the library to be a bit better.

This has successfully fixed my issue as well. Turns out - as can frequently be the case - it was "user error". Thank you!

@fergu fergu closed this as completed Apr 20, 2015
@shernshiou
Copy link

@kjfergu I am facing a similar problem. Can you guide me on how to limit the parameter boundary? Thanks

@beniz
Copy link
Collaborator

beniz commented Aug 13, 2015

Can you be more specific and give details on what you are doing ? This is the best to help you out resole your issue.

@fergu
Copy link
Author

fergu commented Aug 13, 2015

@shernshiou Make sure you are not doing what I was. It seems you should just use the array of X values passed to you by the CMA-ES routine - you should not attempt to rescale or alter them, etc. This seems to confuse the routine, and can lead to the optimization going out of its bounds.

Instead, I ended up using linScalingStrategy, as suggested above, to achieve the scaling of my parameter space such that a single sigma was appropriate for each.

@shernshiou
Copy link

@kjfergu @beniz OK, I am going with linScalingStrategy too. Thanks.

@Neo-X
Copy link

Neo-X commented Apr 9, 2016

Hello,
I am having a related issue. If I just take the bounds example and multiply the fsphere function output by -1.0 (line 32: return -val;). The result/output is:
best solution: best solution => f-value=-40 / fevals=2500 / sigma=1.26359e-07 / iter=250 / elaps=329ms / x=2.15 2.15 2.15 2.15 2.15 2.15 2.15 2.15 2.15 2.15
optimization took 0.329 seconds

If I add linScalingStrategy it gets worse.
best solution: best solution => f-value=-40 / fevals=2470 / sigma=4.98785e-07 / iter=247 / elaps=334ms / x=10.55 10.55 10.55 10.55 10.55 10.55 10.55 10.55 10.55 10.55
optimization took 0.334 seconds

I thought the bounds are supposed to keep the function between -2.0 and 2.0.

@beniz
Copy link
Collaborator

beniz commented Apr 10, 2016

Hi, thanks for catching this: the sample code prints the internal candidate vector. I'll commit a change, but in the meantime you can simply change line 50 (https://github.com/beniz/libcmaes/blob/master/examples/sample-code-bounds.cc#L50) into

std::cout << "best solution: ";
cmasols.print(std::cout,0,gp);
std::cout << std::endl;

Output should now say:

best solution: best solution => f-value=-40 / fevals=2560 / sigma=1.47265e-07 / iter=256 / elaps=11ms / x=2 2 2 2 2 2 2 2 2 2

@Neo-X
Copy link

Neo-X commented Apr 10, 2016

Thanks!

@intractabilis
Copy link

I am not changing any parameters, yet they are out of boundaries. Moreover, it happens on the very first call to the fitting function. What am I doing wrong?

#include <array>
#include <cstdio>

#include <cmaes.h>

using namespace libcmaes;

int main(int argc, char *argv[])
{
    auto params = std::array<double, 7>{
        0.948389,  4.587389,  0.993961, -0.166667,  0.993961, -0.166667,  4.587389};
    auto lb = std::array<double, 7>{
        0.500000,  4.337389,  0.500000, -0.500000,  0.500000, -0.450000,  4.374889};
    auto ub = std::array<double, 7>{
        2.000000,  4.712389,  2.000000,  0.000000,  2.000000,  0.000000,  4.712389};

    using GP         = GenoPheno<pwqBoundStrategy>;
    using Parameters = CMAParameters<GP>;

    auto sigma     = 0.001;
    auto gp        = GP{&lb[0], &ub[0], 7};
    auto cmaparams = Parameters{
        7
      , &params[0]
      , sigma
      , -1
      , 0
      , gp};
    auto costFun = FitFunc{[](const double *a, const int) {
        printf("parameters: [");
        for (auto i = 0; 7 > i; ++i) {
            if (0 != i) { printf(", "); }
            printf("%9.6f", a[i]);
        }
        printf("]\n");
        throw std::exception();
        return 0.0;
    }};
    try {
        auto cmasols = cmaes<GP>(costFun, cmaparams);
    } catch (...) {
    }
}

The first input to the fitting function is:

[ 0.949417,  0.947865,  0.950688,  0.948889,  0.947882,  0.948647,  0.948833]

Which clearly is outside the boundaries.

@intractabilis
Copy link

intractabilis commented Jun 26, 2021

I was able to reproduce this with a simpler set of data:

    auto params = std::array<double, 2>{0.5, -0.5};
    auto lb     = std::array<double, 2>{0.0, -1.0};
    auto ub     = std::array<double, 2>{1.0,  0.0};

The first input to the fitting function is

[ 0.498514,  0.500019]

@intractabilis
Copy link

I found the bug. I'll file a new issue.

@beniz
Copy link
Collaborator

beniz commented Jun 27, 2021

This is no bug but because you are printing out the values used internally by cmaes, these are rescaled versions of the input values, see https://github.com/CMA-ES/libcmaes/wiki/Defining-and-using-bounds-on-parameters and related documentation.

FYI, the inbound and outbound transforms are defined here: https://github.com/CMA-ES/libcmaes/blob/master/src/pwq_bound_strategy.cc

@intractabilis
Copy link

This is no bug but because you are printing out the values used internally by cmaes, these are rescaled versions of the input values, see https://github.com/CMA-ES/libcmaes/wiki/Defining-and-using-bounds-on-parameters and related documentation.

This has nothing to do neither with rescaling nor with internal cmaes values. I am printing values that are the input to the fitting function. They are obviously NOT internal. You can check the code. CMAStrategy::optimize calls evalf with scaled back to the real space parameters here (you can see the second parameter was transformed):

while (!stop()) {
    dMat candidates = askf();
    evalf(candidates, eostrat<TGenoPheno>::_parameters._gp.pheno(candidates));
    tellf();
    eostrat<TGenoPheno>::inc_iter();
    std::chrono::time_point<std::chrono::system_clock> tstop =
        std::chrono::system_clock::now();
    eostrat<TGenoPheno>::_solutions._elapsed_last_iter =
        std::chrono::duration_cast<std::chrono::milliseconds>(tstop - tstart).count();
    tstart = std::chrono::system_clock::now();
}

evalf is a pointer to ESOStrategy::eval and the latter calls the fitting function with phenocandidates:

#pragma omp parallel for if (_parameters._mt_feval)
for (int r = 0; r < candidates.cols(); r++) {
    _solutions._candidates.at(r).set_x(candidates.col(r));
    _solutions._candidates.at(r).set_id(r);
    if (phenocandidates.size())
        _solutions._candidates.at(r).set_fvalue(
            _func(phenocandidates.col(r).data(), candidates.rows()));
    else
        _solutions._candidates.at(r).set_fvalue(
            _func(candidates.col(r).data(), candidates.rows()));
}

This is no bug

It is a bug, because the code above (both branches for transformed and not transformed parameters) assumes that the layout of your matrices is column major. However, in fact the code nowhere specified the layout, it just relies on what Eigen has by default, which can be whatever.

FYI, the inbound and outbound transforms are defined here: https://github.com/CMA-ES/libcmaes/blob/master/src/pwq_bound_strategy.cc

You will be surprised, but I know. This was the first place I looked into, when I was debugging your library.

@beniz
Copy link
Collaborator

beniz commented Jun 28, 2021

Ah my bad @intractabilis I overlooked your question. Of course the values to the function are in pheno space. And good catch as well.
Eigen default to column-major, https://eigen.tuxfamily.org/dox/group__TopicStorageOrders.html but I guess it should protect against an external program that would predefine otherwise.

Please feel free to PR the fix you proposed in #224.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants