moving irace parameters implementation to Paradox #377

Open
1 of 4 tasks
MLopez-Ibanez opened this issue Nov 18, 2022 · 5 comments

Comments

@MLopez-Ibanez

MLopez-Ibanez commented Nov 18, 2022

Description

I'm considering replacing irace's custom parameter representation with paradox. This would benefit irace, since I expect your implementation to be of higher quality, and it would benefit mlr3 and any other package that uses both paradox and irace by avoiding awkward conversions between types.

However, there are a few things that irace would need before being able to make the move (incomplete list!):

  • Sampling of categorical parameters with given probabilities. Is this something that paradox would be willing to implement?
  • irace supports 'ordinal' parameters, which are categorical parameters encoded as integers. They could simply be implemented as integers, but the ordinal type is a very useful convenience for users. Is this something supported by paradox in any shape or form?
  • Dependent bounds: In irace, the domain of an integer parameter can be (1, "param1*2") or (0,"min(param1,param2)"). This automatically creates dependencies for hierarchical sampling. I don't see any examples of this in the documentation but perhaps it is possible.
  • Switches/labels/description: parameters in irace carry extra information (switch) that is used to construct command-line calls or invoke functions with precise argument names and various tricks (see example below).

Reproducible example

library(irace)
parameters.table <- '
 # name       switch           type  values               [conditions (using R syntax)]
 algorithm    "--"             c     (as,mmas,eas,ras,acs)
 localsearch  "--localsearch " c     (0, 1, 2, 3)
 alpha        "--alpha "       r     (0.00, 5.00)
 beta         "--beta "        r     (0.00, 10.00)
 rho          "--rho  "        r     (0.01, 1.00)
 ants         "--ants "        i,log (5, 100)
 q0           "--q0 "          r     (0.0, 1.0)           | algorithm == "acs"
 rasrank      "--rasranks "    i     (1, "min(ants, 10)") | algorithm == "ras"
 elitistants  "--elitistants " i     (1, ants)            | algorithm == "eas"
 nnls         "--nnls "        i     (5, 50)              | localsearch %in% c(1,2,3)
 dlb          "--dlb "         c     (0, 1)               | localsearch %in% c(1,2,3)
 '
parameters <- readParameters(text=parameters.table)
str(parameters)

The above shows an advanced use of the switch for algorithm, various conditions and dependent bounds.
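To illustrate what the switch column is for, here is a minimal base-R sketch of how a sampled configuration could be turned into command-line arguments. The helper `build_command_args` and the convention of marking inactive conditional parameters with `NA` are assumptions made for this illustration, not irace's actual code.

```r
# Hypothetical helper: concatenate each active parameter's switch with its value.
build_command_args <- function(switches, values) {
  # Assumption: parameters disabled by a condition are stored as NA and skipped.
  active <- names(values)[!vapply(values, is.na, logical(1))]
  paste0(switches[active],
         vapply(values[active], as.character, character(1)),
         collapse = " ")
}

switches <- c(algorithm = "--", localsearch = "--localsearch ",
              alpha = "--alpha ", q0 = "--q0 ")
# q0 is inactive here because algorithm != "acs".
config <- list(algorithm = "as", localsearch = 2, alpha = 1.5, q0 = NA)

cmd <- build_command_args(switches, config)
print(cmd)
```

Because the switch for algorithm is just "--", the value itself becomes the flag (--as), which is the "advanced use" mentioned above.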

@mb706
Contributor

mb706 commented Nov 20, 2022

  1. Sampling with given probabilities: paradox does sampling using the Sampler-class (documented to some degree here), sampling with different probabilities should be easy to implement there by subclassing.
  2. Ordinal parameters: Maybe this is already working the way it should, e.g. ps(x = p_fct(1:3))? The factor levels are strings ("1", "2", "3"), but since R auto-casts values to strings whenever they are involved they behave similar to integers. E.g.
    p <- ps(x = p_fct(1:3))
    p$levels$x  # character-type
    #> [1] "1" "2" "3"
    2 %in% p$levels$x  # behaves as if it were integers here
    #> [1] TRUE
    But maybe I am not understanding your use case directly? E.g. do you need to be able to refer to categories both by name and by index/number?
  3. Dependent bounds: Our $deps only encode whether a component is "valid" (i.e. whether its value should influence the outcome at all) whenever another component has a certain value. What you are describing would currently be solved using the $trafo mechanism, which transforms parameters after sampling:
    p <- ps(param1 = p_dbl(0, 1), param2 = p_dbl(0, 1))
    p$trafo <- function(x, param_set) { x$param2 <- x$param2 * x$param1 * 2 ; x }
    sampled <- generate_design_random(p, 100)$transpose()
    plot(data.table::rbindlist(sampled))
    In this case the ParamSet describes how values are sampled, and the trafo then transforms them; note that both param1 and param2 range from 0 to 1 here. Whatever is being optimized here would, however, accept values of param2 that go up to 2, so it would have a slightly different ParamSet: p_domain <- ps(param1 = p_dbl(0, 1), param2 = p_dbl(0, 2)). In bbotk we therefore make the distinction between "search space" (p in this example) and "domain" of a function (p_domain here). The optimizer would only see the p ParamSet and would not need to worry about transformations or weird domain bounds; it can optimize in a Cartesian product space. This also covers cases such as sampling from a 2-dimensional manifold in 3D, since the $trafo can also create new parameter components. We could probably add hierarchical dependencies of parameter bounds in paradox, but maybe the trafo-mechanism would also work for you? Are the variable parameter-bounds actually used in practice?
  4. Meta-data about parameters: Would not be a problem to add these.

@mb706
Contributor

mb706 commented Nov 20, 2022

P.S. I would be very happy if we can integrate paradox with irace :-)

@MLopez-Ibanez
Author

@mb706 Thanks, this is very useful!

> 1. Sampling with given probabilities: paradox does sampling using the Sampler-class (documented to some degree here), sampling with different probabilities should be easy to implement there by subclassing.

Maybe it helps to explain a bit how irace generates new solutions. At each generation, the best (elite) configurations found so far are used to either define mean and std dev values for sampling numerical parameters from a truncated normal distribution or to define probabilities to sample values for categorical parameters.

I see that Sampler1DCateg already supports probabilities, so that problem is solved.

Maybe I can use SamplerHierarchical for implementing the irace sampling by pre-calculating all probabilities and mean/sd values and asking it to sample just 1 point, then repeating those steps for each elite configuration. Does that sound like a good idea?
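As a rough illustration of the sampling step described above, here is a base-R sketch that draws one new configuration around a single elite. It deliberately does not use the paradox Sampler classes; the function names, the layout of `model`, and the numbers in it are made up for this example.

```r
# Draw n values from a normal distribution truncated to [lower, upper]
# via inverse-CDF sampling.
rtnorm_sketch <- function(n, mean, sd, lower, upper) {
  u <- runif(n, pnorm(lower, mean, sd), pnorm(upper, mean, sd))
  qnorm(u, mean, sd)
}

# Hypothetical per-elite model: a mean/sd pair for a numerical parameter
# and a probability vector for a categorical one.
sample_around_elite <- function(model) {
  list(
    alpha = rtnorm_sketch(1, model$alpha_mean, model$alpha_sd, lower = 0, upper = 5),
    algorithm = sample(names(model$algorithm_prob), 1, prob = model$algorithm_prob)
  )
}

set.seed(1)
model <- list(
  alpha_mean = 1.2, alpha_sd = 0.5,
  algorithm_prob = c(as = 0.1, mmas = 0.6, eas = 0.1, ras = 0.1, acs = 0.1)
)
new_config <- sample_around_elite(model)
str(new_config)
```

Repeating `sample_around_elite()` once per elite, each with its own pre-computed `model`, corresponds to the "sample just 1 point, then repeat" idea.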

> 2. Ordinal parameters: Maybe this is already working the way it should, e.g. ps(x = p_fct(1:3))? The factor levels are strings ("1", "2", "3"), but since R auto-casts values to strings whenever they are involved they behave similar to integers. E.g.
> But maybe I am not understanding your use case directly? E.g. do you need to be able to refer to categories both by name and by index/number?

Let me explain a bit more. In irace, if you declare a parameter with values ("very-low", "low", "medium", "high", "very-high") of ordinal type, then the values are treated as 1,2,3,4,5 for sampling purposes, that is, we sample values from a normal distribution, then round to the nearest integer and map back to the corresponding string. The goal is that if you have a good configuration with value "high", then we should sample more of "very-high" than "very-low".
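To make the ordinal behaviour concrete, here is a small base-R sketch of the idea: sample on the index scale around the elite's index, round, and map back to the label. The function name `sample_ordinal` and the choice of sd are assumptions for illustration; the details of irace's real implementation may differ.

```r
# Sketch: sample ordinal labels around the index of the elite's value.
sample_ordinal <- function(n, levels, elite_value, sd) {
  k <- length(levels)
  center <- match(elite_value, levels)
  # Truncated normal on [0.5, k + 0.5] (inverse-CDF sampling), so that
  # rounding lands on an index in 1..k.
  u <- runif(n, pnorm(0.5, center, sd), pnorm(k + 0.5, center, sd))
  idx <- round(qnorm(u, center, sd))
  # Guard against rounding at the exact interval edges.
  idx <- pmin(pmax(idx, 1), k)
  levels[idx]
}

set.seed(123)
lv <- c("very-low", "low", "medium", "high", "very-high")
draws <- sample_ordinal(2000, lv, elite_value = "high", sd = 1)
print(table(factor(draws, levels = lv)))
```

With the elite at "high", neighbours such as "very-high" come up far more often than "very-low", which a plain categorical parameter with uniform probabilities would not give you.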

> 3. Dependent bounds: Our $deps only encode whether a component is "valid" (i.e. whether its value should influence the outcome at all) whenever another component has a certain value. What you are describing would currently be solved using the $trafo mechanism, which transforms parameters after sampling:

Ah, OK, we also have something equivalent to $trafo in irace, which is much more general than dependent-domains, so this is useful information.
However, using $trafo for dependent domains seems a bit error-prone. For example, to produce values within (1, 2*param1), as in my first example, your code would need to be

p <- ps(param1 = p_dbl(0, 1), param2 = p_dbl(0, 1))
# Map param2 from (0, 1) onto (1, 2 * param1).
p$trafo <- function(x, param_set) { x$param2 <- 1 + x$param2 * (x$param1 * 2 - 1) ; x }

Would the example param3 = p_dbl(1, min(param1, param2)) look like this?

p <- ps(param1 = p_dbl(0, 1), param2 = p_dbl(0, 1), param3 = p_dbl(0, 1))
# Map param3 from (0, 1) onto (1, min(param1, param2)).
p$trafo <- function(x, param_set) { x$param3 <- 1 + x$param3 * (min(x$param1, x$param2) - 1) ; x }

This seems cumbersome for the user to write but relatively easy for paradox to generate.

> We could probably add hierarchical dependencies of parameter bounds in paradox, but maybe the trafo-mechanism would also work for you? Are the variable parameter-bounds actually used in practice?

Yes, people use it (we have had some questions about this in the Google group, but I think the main real-world user of this in irace is the DEMIURGE project, who were the ones that implemented it in irace). However, our answer in the past was to ask them to write our equivalent of $trafo, which you can specify in our scenario.txt (they just need to write a single function). However, this is not easy for many people.

Remember that almost every user of irace uses it via the command-line by providing a textual representation of parameters without knowing any R. Asking them to write a $trafo function by hand is cumbersome if they can specify what they want directly in the textual representation.

Another example (ACOTSP) is shown here: https://mlopez-ibanez.github.io/irace/reference/readParameters.html#ref-examples
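For reference, the behaviour users get from a textual bound such as "min(ants, 10)" can be sketched in base R as follows. The helpers `resolve_bound` and `sample_dependent_int` are hypothetical names for this illustration; the sketch only assumes that the parameters a bound depends on have already been sampled.

```r
# A bound is either a number or a string holding an R expression that is
# evaluated against the values sampled so far.
resolve_bound <- function(bound, config) {
  if (is.character(bound)) eval(parse(text = bound), envir = config) else bound
}

# Sample one integer uniformly from the (possibly dependent) domain.
sample_dependent_int <- function(lower, upper, config) {
  lo <- resolve_bound(lower, config)
  hi <- resolve_bound(upper, config)
  stopifnot(lo <= hi)
  lo + sample.int(hi - lo + 1, 1) - 1
}

set.seed(42)
config <- list(ants = 7)  # the parent parameter must be sampled first
config$rasrank <- sample_dependent_int(1, "min(ants, 10)", config)
str(config)
```

The ordering requirement (ants before rasrank) is exactly the hierarchical dependency that the textual bound creates implicitly.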

> 4. Meta-data about parameters: Would not be a problem to add these.

Great! How would this be done?

@MLopez-Ibanez
Author

Hi, I'm still interested in this but I need some help to figure out how to implement the above using paradox.

@MLopez-Ibanez
Author

I have implemented a paradox-like interface to create parameters here: https://github.com/MLopez-Ibanez/irace/blob/parametersNew/R/parameters.R

With this interface, one can do the following:

digits <- 4L
x <- parametersNew(
       param_cat(name = "algorithm", values = c("as", "mmas", "eas", "ras", "acs"), switch = "--"),
       param_cat(name = "localsearch", values = c("0", "1", "2", "3"), switch = "--localsearch "),
       param_real(name = "alpha", lower = 0.0, upper=5.0, switch = "--alpha ", digits = digits),
       param_real(name = "beta", lower = 0.0, upper = 10.0, switch = "--beta ", digits = digits),
       param_real(name = "rho", lower = 0.01, upper = 1.00, switch = "--rho ", digits = digits),
       param_int(name = "ants", lower = 5, upper = 100, transf = "log", switch = "--ants "),
       param_real(name = "q0", switch = "--q0 ", lower=0.0, upper=1.0, condition = expression(algorithm == "acs")),
       param_int(name = "rasrank", switch = "--rasranks ", lower=1, upper=quote(min(ants, 10)), condition = 'algorithm == "ras"'),
       param_int(name = "elitistants", switch = "--elitistants ", lower=1, upper=expression(ants), condition = 'algorithm == "eas"'),
       param_int(name = "nnls", switch = "--nnls ", lower = 5, upper = 50, condition = expression(localsearch %in% c(1,2,3))),
       param_cat(name = "dlb",  switch = "--dlb ", values = c(0,1), condition = "localsearch %in% c(1,2,3)"))

Can the implementation of the above interface be completely replaced by paradox?
