Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow multiple PVCs in combination with affinity assistant #4503

Closed
flo-02-mu opened this issue Jan 21, 2022 · 9 comments
Closed

Allow multiple PVCs in combination with affinity assistant #4503

flo-02-mu opened this issue Jan 21, 2022 · 9 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@flo-02-mu
Copy link

Feature request

It would be nice to be able to use more than one PVC in a pipeline and still use the affinity assistant to ensure all pods are scheduled to the same node.

Use case

In our pipelines we have three or more tasks running at the beginning of the pipeline in parallel that are making use of a workspace, e.g.:

  • Checkout the workspace
  • Download cache from storage account
  • Download deployment repo

With the current restriction of only having one PVC (we are using different subPaths) per pipelinerun (with affinity assistant enabled) we are seeing the following behavior:
In the above example 4 pods (the three from the task + the affinity assistant) are sending the volume attach request. Depending on the timing, this either works fine or it results in all 3 task pods being stuck in the init phase. The pipeline recovers, but this results in task execution times of up to 2min for a simple git checkout task, while the best case is at ~20s.

Although this sounds more like a k8s issue (we are running on Azure Kubernetes Service 1.21 with new CSI driver), the behavior looks better when using one PVC per workspace (we tried with affinity assistant disabled). Running the pipelines with the affinity assistant disabled is not an option, since the time loss for possible volume re-attachements due to node switching is too high.

For the above reason we wonder if the restriction of not having more than 1 PVC together with affinity assistant can either be removed or be disabled by a feature flag.
Additional question: Couldn't the behavior of attaching the volume to the affinity assistant be removed as well? The first task using a volume triggers the attachment to the node and since all subsequent pods are scheduled to the same node, they should be fine?

@flo-02-mu flo-02-mu added the kind/feature Categorizes issue or PR as related to a new feature. label Jan 21, 2022
@flo-02-mu flo-02-mu changed the title Allow multiple PVCs in combination with affinity assistang Allow multiple PVCs in combination with affinity assistant Jan 21, 2022
@jlpettersson
Copy link
Member

jlpettersson commented Jan 21, 2022

Thanks for the feature request @flo-02-mu.

In your use case with e.g. "code checkout" and "download cache", it sounds like you want to have a Task in your pipeline with access to both workspaces, if I understand your case correct?

If those Tasks would use two different PVCs with different Affinity Assistans, they will likely end up on different Nodes (unless you run a single node cluster) - and in such case, the pipeline would be deadlocked.

Example Pipeline with 3 Tasks:

(code checkout) ----\
                      (use both workspaces)
(download cache) ---/

The Task (use both workspaces) can only run if the PVCs that it is using are mounted on the same Node.

@flo-02-mu
Copy link
Author

Yes, the situation is as you describe it (one task using both volumes from the previous tasks). My assumption was that there would be just one affinity assistant with both PVCs because they belong to the same pipelinerun? Or is it TaskRun based and the first task that uses the workspace instantiates the affinity assistant?

@jlpettersson
Copy link
Member

My assumption was that there would be just one affinity assistant with both PVCs because they belong to the same pipelinerun?

Ah, now I understand. Yes, that is a fair proposal.

A feature request with similar goals, but different suggestion would be #3440

Additional question: Couldn't the behavior of attaching the volume to the affinity assistant be removed as well?

That was how it was first implemented, but it did not work when storage class had volumeBindingMode: Immediate and the cluster had multiple AZs, the volume and the Task pod could end up on different AZs and be deadlocked. See e.g. #2630 (comment)

@tekton-robot
Copy link
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 25, 2022
@tekton-robot
Copy link
Collaborator

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 25, 2022
@dibyom dibyom removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jun 17, 2022
@tekton-robot
Copy link
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 15, 2022
@tekton-robot
Copy link
Collaborator

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 15, 2022
@pritidesai pritidesai added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Nov 1, 2022
@lbernick
Copy link
Member

@flo-02-mu we've implemented a new alpha feature (per-pipelinerun affinity assistant) that I think should address your use case. Please see these docs for more detail. I'm going to close out this issue so that feedback on this work is only tracked in one place, but we'd still love your feedback! Please feel free to weigh in on #6990, or reopen this issue if your use case is not addressed.

@nanoq66
Copy link

nanoq66 commented Sep 18, 2023

I'm a colleague of @flo-02-mu and I'll try the new feature in the coming days and will report back! THX!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.
Projects
Status: Done
Development

No branches or pull requests

7 participants