[bug] Kubeflow VolumeOp (PVC) not delete after Pipeline execution. #6649

Closed
peterhaddad3121 opened this issue Sep 30, 2021 · 7 comments
Labels
kind/bug lifecycle/stale The issue / pull request is stale, any activities remove this label.


@peterhaddad3121

What steps did you take

The first step of the Pipeline creates a VolumeOp whose volume is attached to a couple of KF components in the Pipeline.
After the pipeline finishes and all Component pods are Completed and then deleted, the PVC it created is not deleted.

What happened:

I edited the deployment of ml-pipeline-persistenceagent to include:

    - name: TTL_SECONDS_AFTER_WORKFLOW_FINISH
      value: "60"

What did you expect to happen:

The PVC created by the VolumeOp should be deleted. All pods are deleted after the pipeline completes, but the PVC still exists and has no deletion timestamp.

Environment:

  • How do you deploy Kubeflow Pipelines (KFP)?
  • KFP version:
  • KFP SDK version:

Anything else you would like to add:

I saw all the corresponding issues, but they are related to using ArgoCD. This one uses the Python SDK to create and run Pipelines.


Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

@peterhaddad3121 peterhaddad3121 changed the title [bug] <Bug Name> [bug] Kubeflow VolumeOp (PVC) not delete after Pipeline execution. Sep 30, 2021
@zijianjoy
Collaborator

Possibly related: #3938

@elikatsis
Member

Hi @peterhaddad3121,

To GC the resources created by ResourceOps, you should set the set_owner_reference argument to True. It's this one:

set_owner_reference: bool = None,

This essentially ties the CR to the workflow, so it gets deleted once the agent GCs the workflow.
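
To make "ties the CR to the workflow" concrete, here is a minimal sketch of the Kubernetes mechanism involved: an entry in `metadata.ownerReferences` on the PVC that points at the Workflow. The helper name `with_workflow_owner`, the PVC name, and the UID are hypothetical, and the exact set of fields Argo writes is an assumption; only `ownerReferences` and `blockOwnerDeletion` themselves are standard Kubernetes metadata fields.

```python
import copy


def with_workflow_owner(manifest, workflow_name, workflow_uid):
    """Return a copy of `manifest` that lists the Workflow as its owner.

    Hypothetical helper for illustration; not part of the KFP SDK.
    """
    owned = copy.deepcopy(manifest)
    refs = owned.setdefault("metadata", {}).setdefault("ownerReferences", [])
    refs.append({
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "name": workflow_name,
        "uid": workflow_uid,  # placeholder; the API server assigns real UIDs
        "blockOwnerDeletion": True,  # assumed, based on the error text in this thread
    })
    return owned


# Illustrative PVC manifest; the name is made up.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "my-pipeline-abc12-pvc"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "1Gi"}},
    },
}

owned_pvc = with_workflow_owner(pvc, "my-pipeline-abc12", "hypothetical-uid")
```

Without such an entry, deleting the Workflow gives the Kubernetes garbage collector no reason to touch the PVC, which matches the behaviour reported above.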

@juliusvonkohout
Member

juliusvonkohout commented Oct 13, 2021

> Hi @peterhaddad3121,
>
> To GC the resources created by ResourceOps, you should set the set_owner_reference argument to True. It's this one:
>
> set_owner_reference: bool = None,
>
> This essentially ties the CR to the workflow, so it gets deleted once the agent GCs the workflow.

There is something wrong with RBAC:

cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on

@juliusvonkohout
Member

@elikatsis @zijianjoy

set_owner_reference needs #6622 so that the finalizers required for blockOwnerDeletion can be set:

    import kfp

    # Call this inside a @kfp.dsl.pipeline function.
    resource_component = kfp.dsl.VolumeOp(
        name="create-pvc",
        resource_name="pvc",  # PVC name becomes "{{workflow.name}}-%s" % resource_name
        modes=kfp.dsl.VOLUME_MODE_RWO,
        size="1Gi",
        # storage_class="ultra-high",
        # Lets the PVC be garbage-collected with the workflow, see
        # https://github.com/kubeflow/pipelines/issues/6649#issuecomment-938509228
        set_owner_reference=True,
    )

Otherwise you get an error like this:

Error from server (Forbidden): error when creating \"/tmp/manifest.yaml\": persistentvolumeclaims \"big-data-pipeline-vm2ps-artefacts\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: ,
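
For readers wondering where this error comes from: Kubernetes only lets a client set `blockOwnerDeletion` on an owner reference if that client may `update` the `finalizers` subresource of the owner. The sketch below checks a list of RBAC rules for that permission. The function name `grants` and both rule lists are hypothetical, and the wildcard handling is deliberately simplified; the actual rules changed by #6622 are not shown in this thread, so treat the "after" list as an assumption about the kind of rule that is needed.

```python
def grants(rules, api_group, resource, verb):
    """Return True if any RBAC rule allows `verb` on `resource` in `api_group`.

    Simplified matcher: handles exact names and a bare "*" only.
    """
    for rule in rules:
        groups = rule.get("apiGroups", [])
        resources = rule.get("resources", [])
        verbs = rule.get("verbs", [])
        if ((api_group in groups or "*" in groups)
                and (resource in resources or "*" in resources)
                and (verb in verbs or "*" in verbs)):
            return True
    return False


# Hypothetical role that can manage workflows but not their finalizers.
rules_before = [
    {
        "apiGroups": ["argoproj.io"],
        "resources": ["workflows"],
        "verbs": ["get", "list", "watch", "create", "update", "patch", "delete"],
    },
]

# Same role with an assumed extra rule for the finalizers subresource.
rules_after = rules_before + [
    {
        "apiGroups": ["argoproj.io"],
        "resources": ["workflows/finalizers"],
        "verbs": ["update"],
    },
]

can_block_before = grants(rules_before, "argoproj.io", "workflows/finalizers", "update")
can_block_after = grants(rules_after, "argoproj.io", "workflows/finalizers", "update")
```

With the first rule list the check fails, which corresponds to the Forbidden error above; with the extra rule it passes.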

google-oss-robot pushed a commit that referenced this issue Nov 2, 2021
… (#6622)

* Update view-edit-cluster-roles.yaml

* Update view-edit-cluster-roles.yaml
@juliusvonkohout
Member

Please close, since with #6622 merged you can set the owner references properly.

@stale

stale bot commented Mar 2, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale The issue / pull request is stale, any activities remove this label. label Mar 2, 2022
abaland pushed a commit to abaland/pipelines that referenced this issue May 29, 2022
…eflow#6649 (kubeflow#6622)

* Update view-edit-cluster-roles.yaml

* Update view-edit-cluster-roles.yaml

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.


4 participants