-
Notifications
You must be signed in to change notification settings - Fork 443
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create Tune API in the Katib SDK #1951
Create Tune API in the Katib SDK #1951
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
/hold Hold until kubeflow/training-operator#1659 is merged. |
Thanks for implementing the new feature! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for awesome feature !!! users could generate katib experiment much easier now :)
I left some personal opinions.
# Import required packages. | ||
import time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand why this kind of logic is necesary, but we might find out more friendly way to specify packages by users.
However, as a minimal feature, this is ok if we could instruct users that they couldn't make a mistake to miss required packages in the objective function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, but I think Kubeflow Pipelines follows the same way: https://www.kubeflow.org/docs/components/pipelines/sdk/python-function-components/#building-python-function-based-components
@zijianjoy Do you know if Pipelines SDK verifies that all function imports are defined properly in Python Lightweight component?
@anencore94 Any ideas how to verify this?
cc @tenzen-y @gaocegege @johnugeorge
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may need to automatically validating this after this PR. How about making an issue for tracking this feature now ?
If we could make an idea for this validation in general, we may apply this for other kf components like pipelines and training-operators and so on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, that sound good @anencore94.
|
||
return result | ||
|
||
|
||
def double(min: float, max: float, step: float = None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about gather these hyperparameter helper function to another inner module?
For example, the users could call these functions by
import kubeflow.katib as katib
hp_search_space = {
"a": katib.hp_search_space.int(min=10, max=20),
"b": katib.hp_search_space.double(min=0.1, max=0.2)
}
katib_client.tune(
name=name,
objective=objective,
hp_search_space=hp_search_space,
objective_metric_name="result",
max_trial_count=12
)
IMO, importing by katib.int
makes me it's too board that we couldn't see it is for hyperparameter search space type immediately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was even thinking to change this API in the future to support various distributions: uniform
, loguniform
, categorical
, etc.
FYI, hyperopt uses hp
module, skop uses space
module, optuna uses distributions
module, Ray uses tune
module. I think we can learn something from it.
search
module (e.g. katib.search.int(1,5)
) looks good to me, but I am not sure what is the best UX here.
What others think @terrytangyuan @gaocegege @johnugeorge @c-bata @a9p @g-votte @tenzen-y ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ref #1951 (comment).
IMO it feels more natural if we could support built-in lists or common numpy arrays, etc. Is there a particular motivation for bringing a separate module in the Katib SDK just to do the same thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@terrytangyuan List and numpy arrays might not be enough.
How we can identify search space for uniform
or loguniform
in the future?
e.g. Hyperopt utilises some numpy APIs, but it still has it's own functions for the search space: https://github.com/hyperopt/hyperopt/blob/master/hyperopt/pyll/stochastic.py#L41-L43
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The functions in that module seem to be very thin wrappers. If we could completely avoid creating additional wrappers/layers, it would be great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @andreyvelich.
In this PR, we can start in the minimal feature. So, we can support other distributions in other PRs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@terrytangyuan Do you have any ideas how we can improve UX on this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong preference. I am ok as is if other projects are following the same convention.
algorithm_name: str = "random", | ||
objective_metric_name: str = None, | ||
additional_metric_names: List[str] = [], | ||
objective_type: str = "maximize", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we could use "maximize" and "minimize" as an Enum
class or Literal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also thinking about it.
@anencore94 Is there a way to automatically generate enums for our Python Models ?
In our APIs type
param is enum: https://github.com/kubeflow/katib/blob/master/pkg/apis/controller/common/v1beta1/common_types.go#L96
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By "automatically" means that automatically generate python Enum class from the golang file or openapi.json ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By "automatically" means that automatically generate python Enum class from the golang file or openapi.json ?
Exactly. I am not sure if kube-openapi
supports enum generation. Check this: kubernetes/enhancements#2887.
Maybe we should update kube-openapi
version for this feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no many experience with it sadly. How about make an issue with this feature, and merge this PR with hardcoded for now ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good @anencore94 !
Modify packages_to_install doc Create validate objective function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andreyvelich Thanks for your contributions!
I left a few comments.
Change k8s version package
Thanks everyone for review! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andreyvelich Thanks for updating this!
Co-authored-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
Thanks for addressing my comments! |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: andreyvelich, johnugeorge The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Thanks @andreyvelich for this great feature |
Thanks everyone for your feedback and review! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great!
/lgtm
Fixes: #1585.
Inspired by KFP
create_component_from_func
.This API will allow user to create Katib HP Tuning Experiment without building the Experiment YAML.
This is the first small step to simplify our Kubeflow SDKs and to avoid Kubernetes complexity.
Later we can extend this functionality (support more Katib features in a simple way), give more spec options via APIs.
Also, we can provide API (e.g.
katib.list_base_images
) to show list of supported images (e.g.BASE_IMAGE_TENSORFLOW
,BASE_IMAGE_MXNET
) with their description.cc @kubeflow/wg-training-leads @tenzen-y @anencore94 Please give your feedback on the API design.