[bug] Pipelines generated from kfp 2.10 ignore accelerator #11374
Comments
Thanks Trevor. This is now resolved, and we'll do a patch release on 2.10 to pull in these changes.
Could it be that the same issue also applies to CPU requests and limits?
Yep. Good catch. 83dcf1a Any interest in submitting a fix? You can use https://github.com/kubeflow/pipelines/pull/11373/files as an example.
I took a brief look at this; unfortunately, it does not seem to be as simple as bringing back the old fields.
I opened #11390 as a follow-up for the CPU/memory requests/limits. I'll see if I can work on a fix very soon.
@vanHavel thanks for trying! I appreciate it 😄
When executing or compiling a pipeline using the KFP SDK 2.10 with the following configuration:
The pipeline server ignores the GPU option, and the step is scheduled without the GPU in its resource configuration.
This appears to be a breaking change introduced in 2.10.
Environment
How do you deploy Kubeflow Pipelines (KFP)?
Red Hat OpenShift AI
KFP version:
KFP SDK version:
Steps to reproduce
acc-test.py
The pods created for the step will not include the nvidia.com/gpu in the pod spec resources, and the pod will get scheduled on a non-GPU node.
Expected result
The pod should include the resources definition for the GPUs and the pod should be scheduled on a GPU node.
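One way to check the expected result is to look at the pod manifest for the step and see whether any container declares the GPU resource. The following is a minimal sketch, assuming the manifest has already been loaded into a Python dict (for example from `kubectl get pod <name> -o json`). The function name `pod_requests_gpu` and both sample manifests are hypothetical and only illustrate the standard `spec.containers[].resources` layout of a Kubernetes pod.

```python
from typing import Any, Dict

GPU_RESOURCE = "nvidia.com/gpu"  # resource name mentioned in this issue


def pod_requests_gpu(pod: Dict[str, Any], resource: str = GPU_RESOURCE) -> bool:
    """Return True if any container lists `resource` under limits or requests."""
    containers = pod.get("spec", {}).get("containers", [])
    for container in containers:
        resources = container.get("resources") or {}
        for section in ("limits", "requests"):
            if resource in (resources.get(section) or {}):
                return True
    return False


# Hypothetical manifests, not taken from a real cluster.
affected_pod = {"spec": {"containers": [{"name": "main", "resources": {}}]}}
expected_pod = {
    "spec": {
        "containers": [
            {"name": "main", "resources": {"limits": {"nvidia.com/gpu": "1"}}}
        ]
    }
}

print(pod_requests_gpu(affected_pod))
print(pod_requests_gpu(expected_pod))
```

A pod like `affected_pod` matches the behaviour reported here, while `expected_pod` is what the scheduler needs in order to place the step on a GPU node.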
Materials and reference
The bug was likely introduced in:
#11097
When compiling the pipeline with 2.10 it renders the following:
With an older version such as 2.9, it renders the following:
Fix in progress:
#11373
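To compare what two SDK versions compile, it can help to pull the accelerator block out of each executor and diff the field names. The sketch below assumes the compiled pipeline spec has been parsed into a dict and that resources live under `deploymentSpec.executors.<name>.container.resources.accelerator`. The helper names and the two sample specs are hypothetical; in particular, the field names in the samples (`type`/`count` and `resourceType`/`resourceCount`) are assumptions for illustration and are not the output quoted in this issue.

```python
from typing import Any, Dict, List


def accelerator_blocks(spec: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Map executor name -> accelerator block, for executors that define one."""
    executors = spec.get("deploymentSpec", {}).get("executors", {})
    found: Dict[str, Dict[str, Any]] = {}
    for name, executor in executors.items():
        resources = executor.get("container", {}).get("resources", {})
        accelerator = resources.get("accelerator")
        if accelerator:
            found[name] = accelerator
    return found


def diff_accelerator_keys(
    old: Dict[str, Any], new: Dict[str, Any]
) -> Dict[str, Dict[str, List[str]]]:
    """Report, per executor, accelerator field names present in only one spec."""
    old_blocks, new_blocks = accelerator_blocks(old), accelerator_blocks(new)
    report: Dict[str, Dict[str, List[str]]] = {}
    for name in sorted(set(old_blocks) | set(new_blocks)):
        old_keys = set(old_blocks.get(name, {}))
        new_keys = set(new_blocks.get(name, {}))
        if old_keys != new_keys:
            report[name] = {
                "only_old": sorted(old_keys - new_keys),
                "only_new": sorted(new_keys - old_keys),
            }
    return report


def make_spec(accelerator: Dict[str, Any]) -> Dict[str, Any]:
    """Build a tiny hypothetical spec with one executor named exec-train."""
    return {
        "deploymentSpec": {
            "executors": {
                "exec-train": {"container": {"resources": {"accelerator": accelerator}}}
            }
        }
    }


# Assumed field names, for illustration only.
old_spec = make_spec({"type": "nvidia.com/gpu", "count": "1"})
new_spec = make_spec({"resourceType": "nvidia.com/gpu", "resourceCount": "1"})

print(diff_accelerator_keys(old_spec, new_spec))
```

If a backend only reads the field names listed under `only_old`, a spec that carries only the `only_new` names would be scheduled without the accelerator, which is consistent with the symptom described above.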
Impacted by this bug? Give it a 👍.