KatibClient().list_experiments/list_trials fails with RuntimeError #2110
Comments
/cc @andreyvelich
It seems that if there are kubernetes.client classes used in any experiment/trial the deserialization fails. This rather quick fix tries to get classes from `kubernetes.client` as well for deserialization. Addresses: kubeflow#2110
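A minimal sketch of the fallback lookup described above, assuming the generated API client resolves a model class by name with `getattr` on a models module. `resolve_model_class` and the stand-in namespaces are hypothetical names for illustration; a real fix would presumably search `kubeflow.katib.models` first and `kubernetes.client` second.

```python
from types import SimpleNamespace


def resolve_model_class(name, namespaces):
    """Return the first attribute called `name` found in `namespaces`, searched in order."""
    for namespace in namespaces:
        klass = getattr(namespace, name, None)
        if klass is not None:
            return klass
    raise LookupError(f"no model class named {name!r} in any namespace")


# Stand-ins so the sketch runs without either SDK installed (assumption:
# the real namespaces would be kubeflow.katib.models and kubernetes.client).
class V1beta1Experiment:
    pass


class V1Container:
    pass


katib_models = SimpleNamespace(V1beta1Experiment=V1beta1Experiment)
k8s_client = SimpleNamespace(V1Container=V1Container)

# V1Container is missing from the first namespace, so the second one is used.
found = resolve_model_class("V1Container", [katib_models, k8s_client])
```

The search order matters: listing the Katib models first keeps their classes authoritative if a name ever exists in both namespaces.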
I guess the issue is that I have a Not sure if this is common enough to need to be supported.
@votti Can you please show your Experiment? How can it contain
We can try to import all Kubernetes modules into the Katib SDK, similar to the Training Operator: https://github.com/kubeflow/training-operator/blob/master/sdk/python/kubeflow/training/models/__init__.py#L16-L17.
I'm not sure we must import all K8s modules into the Katib SDK, since our CI works well.
Maybe an improvement would be to emit a warning if some experiments/trials cannot be loaded (for whatever reason) while still returning the experiments that are Having an invalid/strange experiment/trial in a namespace breaking the
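The warn-and-continue behaviour suggested here could look roughly like the sketch below. It is not the Katib SDK's implementation: `load_all` and `fake_deserialize` are hypothetical helpers, and the dictionaries stand in for raw experiment/trial objects.

```python
import warnings


def load_all(raw_items, deserialize):
    """Deserialize each item; warn about and skip the ones that fail."""
    loaded = []
    for index, raw in enumerate(raw_items):
        try:
            loaded.append(deserialize(raw))
        except Exception as exc:  # broad on purpose: any failure is skipped
            warnings.warn(f"skipping item {index}: {exc!r}", RuntimeWarning)
    return loaded


def fake_deserialize(raw):
    # Hypothetical stand-in that rejects items flagged as broken.
    if raw.get("broken"):
        raise RuntimeError("cannot deserialize")
    return raw["name"]


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    names = load_all(
        [{"name": "a"}, {"broken": True}, {"name": "b"}],
        fake_deserialize,
    )
```

With this shape, one unreadable object in a namespace costs a warning rather than the whole listing.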
I think this is an offending Experiment: I suspect this is because I am using an
I see. It might be better to add E2E for Argo workflow integration.
No, the Trial spec is just an object: https://github.com/kubeflow/katib/blob/master/sdk/python/v1beta1/kubeflow/katib/models/v1beta1_trial_template.py#L43 It fails because the Custom Collector uses Kubernetes Container APIs.
I think that once we import all Kubernetes modules, we can always deserialise the Experiment. WDYT @tenzen-y @votti?
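One way to picture "importing all Kubernetes modules" is to make every public name of one module visible on another, so that a name-based lookup on the models namespace also finds the Kubernetes classes. The sketch below uses stand-in modules; `expose_public_names` is a hypothetical helper, not part of either SDK, and a wildcard import in the models package would be the more conventional way to get the same effect.

```python
from types import ModuleType


def expose_public_names(source, target):
    """Copy public attributes of `source` onto `target`, keeping existing ones."""
    copied = []
    for name in dir(source):
        if name.startswith("_") or hasattr(target, name):
            continue
        setattr(target, name, getattr(source, name))
        copied.append(name)
    return copied


# Stand-in modules (assumption: the real ones would be kubernetes.client
# as the source and kubeflow.katib.models as the target).
fake_k8s = ModuleType("fake_k8s_client")
fake_k8s.V1Container = type("V1Container", (), {})
fake_k8s.V1EnvVar = type("V1EnvVar", (), {})

fake_models = ModuleType("fake_katib_models")
fake_models.V1beta1Experiment = type("V1beta1Experiment", (), {})

copied = expose_public_names(fake_k8s, fake_models)
```

Skipping names that already exist on the target keeps the SDK's own classes from being shadowed.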
Also a good option. But in the end both seem fine.
@votti I guess the problem is that We can track a better solution in this issue: kubeflow/training-operator#1723. WDYT?
/kind bug
What steps did you take and what happened:
I tried to list the experiments/trials using the Katib Client:
raises the error
What did you expect to happen:
No error.
Anything else you would like to add:
I was able to temporarily fix this by patching the `kubeflow.katib.models` namespace:
Environment:
Impacted by this bug? Give it a 👍 We prioritize the issues with the most 👍