[suggestion] Implement more algorithms #15

Closed
3 of 14 tasks
gaocegege opened this issue Apr 7, 2018 · 11 comments

Comments

@gaocegege
Member

gaocegege commented Apr 7, 2018

katib has an extensible architecture and three search algorithms, thanks to @YujiOshima:

  • vizier-suggestion-random
  • vizier-suggestion-grid
  • vizier-suggestion-hyperband

We could implement more algorithms on top of this architecture, which would help us support more scenarios.

ref https://github.com/tobegit3hub/advisor#algorithms

  • Random Search Algorithm
  • 2x Random Search Algorithm
  • Grid Search Algorithm
  • Bayesian Optimization
  • Gaussian Process Bandit
  • Batched Gaussian Process Bandits
  • SMAC Algorithm
  • CMA-ES Algorithm
  • No Early Stop Algorithm
  • Early Stop First Trial Algorithm
  • Early Stop Descending Algorithm
  • Performance Curve Stop Algorithm
  • Median Stop Algorithm
  • Latin hypercube sampling (LHS)

/cc @ddutta
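
To make the simplest entry in the list above concrete, here is a minimal, framework-independent random-search sketch over a mixed parameter space; it is only an illustration and does not use the actual Katib suggestion API.

import random

# Hypothetical search space mirroring the Katib parameter types:
# an integer range, a categorical list, a discrete list, and a double range.
space = {
    "param1": ("int", 1, 5),
    "param2": ("categorical", ["cat1", "cat2", "cat3"]),
    "param3": ("discrete", [3, 2, 6]),
    "param4": ("double", 1.0, 5.0),
}

def random_suggestion(space):
    # Random search: draw one independent sample per parameter.
    suggestion = {}
    for name, spec in space.items():
        kind = spec[0]
        if kind == "int":
            suggestion[name] = random.randint(spec[1], spec[2])
        elif kind in ("categorical", "discrete"):
            suggestion[name] = random.choice(spec[1])
        elif kind == "double":
            suggestion[name] = random.uniform(spec[1], spec[2])
    return suggestion

print(random_suggestion(space))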

@libbyandhelen
Contributor

I am implementing the Bayesian Optimization algorithm in Python, but I have run into a question. Since I use one-hot encoding for the categorical parameters and embed the integer and discrete parameters into a continuous space, the final suggestion differs from the values that actually need to be used for training. For example:
This is the study config:

configs = [
    api_pb2.ParameterConfig(
        name="param1",
        parameter_type=api_pb2.INT,
        feasible=api_pb2.FeasibleSpace(max="5", min="1", list=[]),
    ),
    api_pb2.ParameterConfig(
        name="param2",
        parameter_type=api_pb2.CATEGORICAL,
        feasible=api_pb2.FeasibleSpace(max=None, min=None, list=["cat1", "cat2", "cat3"]),
    ),
    api_pb2.ParameterConfig(
        name="param3",
        parameter_type=api_pb2.DISCRETE,
        feasible=api_pb2.FeasibleSpace(max=None, min=None, list=["3", "2", "6"]),
    ),
    api_pb2.ParameterConfig(
        name="param4",
        parameter_type=api_pb2.DOUBLE,
        feasible=api_pb2.FeasibleSpace(max="5", min="1", list=[]),
    ),
]

And this is the intermediate result generated by the algorithm, which needs to be used in the following iterations:

lowerbound [ 1.  0.  0.  0.  2.  1.]
upperbound [ 5.  1.  1.  1.  6.  5.]
[[ 2.    0.23  0.56  0.77  6.    4.5 ]]
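
For illustration, here is a minimal sketch of how such lower and upper bounds could be derived from the study config above under the one-hot/embedding scheme described; the helper name and dict layout are assumptions, not the actual implementation.

def make_bounds(param_configs):
    # Assumed encoding: one dimension per numeric parameter, one dimension
    # per category for categorical parameters (one-hot), and the min/max of
    # the value list for discrete parameters.
    lower, upper = [], []
    for p in param_configs:
        if p["type"] in ("INT", "DOUBLE"):
            lower.append(float(p["min"]))
            upper.append(float(p["max"]))
        elif p["type"] == "CATEGORICAL":
            lower.extend([0.0] * len(p["list"]))
            upper.extend([1.0] * len(p["list"]))
        elif p["type"] == "DISCRETE":
            values = [float(v) for v in p["list"]]
            lower.append(min(values))
            upper.append(max(values))
    return lower, upper

params = [
    {"name": "param1", "type": "INT", "min": "1", "max": "5"},
    {"name": "param2", "type": "CATEGORICAL", "list": ["cat1", "cat2", "cat3"]},
    {"name": "param3", "type": "DISCRETE", "list": ["3", "2", "6"]},
    {"name": "param4", "type": "DOUBLE", "min": "1", "max": "5"},
]
print(make_bounds(params))
# ([1.0, 0.0, 0.0, 0.0, 2.0, 1.0], [5.0, 1.0, 1.0, 1.0, 6.0, 5.0])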

This is the final result, which is generated from the intermediate result:

parameter_set {
  name: "param1"
  parameter_type: INT
  value: "3"
}
parameter_set {
  name: "param2"
  parameter_type: CATEGORICAL
  value: "cat1"
}
parameter_set {
  name: "param3"
  parameter_type: DISCRETE
  value: "3"
}
parameter_set {
  name: "param4"
  parameter_type: DOUBLE
  value: "3.0"
}
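
And a matching sketch of the decoding direction; the rounding and argmax rules here are only assumptions, so the exact values need not agree with the output above.

def decode_suggestion(x, param_configs):
    # x is the continuous vector produced by the optimizer. Assumed rules:
    # round INT, argmax over the one-hot block for CATEGORICAL, nearest
    # listed value for DISCRETE, pass DOUBLE through.
    out, i = {}, 0
    for p in param_configs:
        if p["type"] == "INT":
            out[p["name"]] = str(int(round(x[i])))
            i += 1
        elif p["type"] == "CATEGORICAL":
            block = x[i:i + len(p["list"])]
            out[p["name"]] = p["list"][block.index(max(block))]
            i += len(p["list"])
        elif p["type"] == "DISCRETE":
            out[p["name"]] = min(p["list"], key=lambda v: abs(float(v) - x[i]))
            i += 1
        elif p["type"] == "DOUBLE":
            out[p["name"]] = str(x[i])
            i += 1
    return out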

So my question is: how can we store this intermediate result? It would also be nice if someone could tell me the location of the scripts that call the suggestion services (like generate_trials), so that the workflow is clearer to me. @gaocegege @ddutta @YujiOshima

@YujiOshima
Contributor

YujiOshima commented Apr 11, 2018

@libbyandhelen Cool!
Ideally, the intermediate results should be saved in the DB.
But the Katib DB does not have such an interface.
So currently the suggestion services store the intermediate information in their own memory.
I understand this is a big problem because it makes the services stateful.
SuggestTrial is called from trialIteration in the manager: https://github.com/kubeflow/hp-tuning/blob/master/manager/main.go#L106
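
As a rough Python illustration of that in-memory state (all names here are hypothetical): each suggestion service keeps the observations it needs for the next iteration keyed by study, and the state is lost if the service restarts.

# Hypothetical per-study state held inside the suggestion service process.
# Because nothing is persisted to the Katib DB, restarting the service
# loses this state, which is what makes the service stateful.
study_state = {}

def record_trial(study_id, encoded_params, objective_value):
    state = study_state.setdefault(study_id, {"X": [], "y": []})
    state["X"].append(encoded_params)
    state["y"].append(objective_value)

def observations(study_id):
    state = study_state.get(study_id, {"X": [], "y": []})
    return state["X"], state["y"]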

@libbyandhelen
Contributor

libbyandhelen commented Apr 11, 2018

@YujiOshima Thank you!
Just as you said, I stored all the necessary information in the suggestion service's memory, so the service can now run and be tested independently. The testing script uses Franke's function as an example (sketched below, after the questions).
So here are some further questions:

  1. I saw that there are some MySQL interfaces in kubeflow/db/interface.go. Are we going to use these to store the data in the database?
  2. If I understand correctly, the generate_trials function in the suggestion service uses completed_trials to report the objective value to the service, right?
  3. Do you think it is time to make a pull request?
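
If "Franke's function" refers to the standard two-dimensional Franke test surface often used to exercise Gaussian-process models, a minimal definition would look like the following; this is only an illustrative guess at the test objective, not the code in the PR.

import math

def franke(x, y):
    # Standard 2-D Franke test function on [0, 1] x [0, 1].
    return (0.75 * math.exp(-((9 * x - 2) ** 2) / 4 - ((9 * y - 2) ** 2) / 4)
            + 0.75 * math.exp(-((9 * x + 1) ** 2) / 49 - (9 * y + 1) / 10)
            + 0.5 * math.exp(-((9 * x - 7) ** 2) / 4 - ((9 * y - 3) ** 2) / 4)
            - 0.2 * math.exp(-((9 * x - 4) ** 2) - ((9 * y - 7) ** 2)))

print(franke(0.5, 0.5))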

@gaocegege
Member Author

@libbyandhelen

Feel free to open a WIP PR (work-in-progress PR) and let us see your work, to make sure you are on the right track.

@YujiOshima
Contributor

@libbyandhelen

I agree with @gaocegege, feel free to open a PR.

For the other questions:

  1. Yes. If we want to develop in Python, we need a similar interface for MySQL.
  2. It's not recommended. I want the suggestion services to get information about Trials from the DB, not from completed_trials and running_trials.

@gaocegege
Member Author

FYI, I added some possible algorithms in #15 (comment)

@libbyandhelen
Contributor

I am now trying to add more kernels to the Gaussian process and more types of acquisition functions. In this case, the service itself needs some parameters, such as the kernel type and the acquisition function type. Does SetSuggestionParameters act as a way to set the parameters of the service itself?
@YujiOshima

@YujiOshima
Contributor

@libbyandhelen Yes, it is. The SuggestionParameters consist of key-value pairs (string Name : string Value).
You can parse the parameters like this: https://github.com/kubeflow/hp-tuning/blob/master/suggestion/hyperband_service.go#L92
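
For readers who do not want to follow the Go link, here is a rough Python analogue of that kind of parsing, assuming hypothetical keys such as kernel_type and acquisition_func (the actual key names are chosen by the service).

# Hypothetical defaults for a GP suggestion service; the key names below
# are assumptions, not the actual Katib parameter names.
DEFAULTS = {"kernel_type": "rbf", "acquisition_func": "ei", "burn_in": "10"}

def parse_suggestion_parameters(suggestion_parameters):
    # suggestion_parameters is a sequence of objects with .name and .value,
    # mirroring the string Name : string Value pairs described above.
    config = dict(DEFAULTS)
    for param in suggestion_parameters:
        if param.name in config:
            config[param.name] = param.value
    return config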

@libbyandhelen
Contributor

@YujiOshima Got it. I just pushed a commit to the pull request, with one more GP kernel and two more acquisition functions added. May I ask what the next step is, and how I should integrate it with the system?

@gaocegege added the help wanted and good first issue labels May 6, 2018
@Franky12

Hi guys, I am exploring katib and looking for a good first issue. Is there something I can start with?

@gaocegege
Member Author

Now I think we can close this issue since we have already implemented many algorithms. @Franky12 Thanks for your interest. If you want to contribute to katib, please have a look at our roadmap.
