Custom Workspace Bindings #3435

Closed
ghost opened this issue Oct 22, 2020 · 8 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@ghost

ghost commented Oct 22, 2020

Feature request

Allow flexible, configurable support for Workspace Bindings with currently unsupported volume types, provisioning styles, and storage that doesn't manifest as k8s Volumes such as cloud storage buckets.

Use case

As a Tekton operator I want to utilize a kind of storage that Tekton doesn't currently support so that I can integrate with whatever my organization already uses for storage.

Concrete examples include:

  • Use separate cloud storage buckets for each Run that executes in my cluster
  • Use a single cloud storage bucket per-Team or per-Organization
  • Use Kubernetes volume types that Tekton doesn't bake in support for today (e.g. flexVolume (requested))
  • Utilize storage provisioning techniques that neither the platform nor Tekton currently provide such as Persistent Volume / Volume Claim pooling (see original issue)
  • Use proprietary, in-house storage solutions

This list is non-exhaustive. The idea is to present an interface via which an organization can support essentially any storage mechanism they want.

Caveats

Whatever this feature ends up as should be configurable by platform owners. If a platform wants to only support a subset of custom bindings (or doesn't want to support the feature at all) then they should be able to configure it as such.

Pseudo-code and hand-waving for what this might look like

Random Directory on a Bucket

Imagine you've got a GCS bucket that your org shares across multiple teams for their CI/CD workloads to drop data onto. The bucket is configured with a 10-day retention period and each CI/CD workload gets a random directory in the bucket. In the following example a PipelineRun author configures a workspace binding to use a randomized directory on that GCS bucket:

workspaces:
- name: output-workspace
  plugin:
    class: GCSBucketRandomDirectory
    params:
    - name: random-seed
      value: context.taskRun.uid

What happens next? There are a lot of possible approaches. Here's one option:

Before running the Pipeline, Tekton looks up the class GCSBucketRandomDirectory in its registry of Custom Workspace Providers. The Tekton controller sends a request to the HTTP server registered for that class, and the server responds with Volume configuration and Steps to inject into all of the Pipeline's TaskRuns. The Volume config is an emptyDir, so each TaskRun Pod effectively starts with an extra empty volume. gcs-download and gcs-upload Steps are injected before and after each TaskRun's own Steps: the first populates the emptyDir from the random bucket directory, and the second uploads the emptyDir's contents back to that directory.
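To make that flow a little more concrete, here is a rough Python sketch of the payload such a provider's HTTP server might return. Nothing here is an existing Tekton API: the function name, the response keys (volume, stepsBefore, stepsAfter), the mount path, and the placeholder image are all assumptions made up for illustration, and the gsutil invocations are only one plausible way to do the copy.

```python
import hashlib
import json


def build_binding_response(bucket: str, random_seed: str,
                           mount_path: str = "/workspace/output") -> dict:
    """Build a hypothetical provider response for one workspace binding."""
    # Derive a stable directory name from the seed so that the download
    # and upload Steps of the same Run agree on where the data lives.
    directory = hashlib.sha256(random_seed.encode("utf-8")).hexdigest()[:16]
    remote = f"gs://{bucket}/{directory}"
    return {
        # Every TaskRun Pod gets an extra, initially empty volume.
        "volume": {"name": "custom-workspace", "emptyDir": {}},
        # Injected before the Task's own Steps (hypothetical response key).
        "stepsBefore": [{
            "name": "gcs-download",
            "image": "example.invalid/gcs-tools",  # placeholder image
            "command": ["gsutil", "-m", "rsync", "-r", remote, mount_path],
        }],
        # Injected after the Task's own Steps (hypothetical response key).
        "stepsAfter": [{
            "name": "gcs-upload",
            "image": "example.invalid/gcs-tools",  # placeholder image
            "command": ["gsutil", "-m", "rsync", "-r", mount_path, remote],
        }],
    }


if __name__ == "__main__":
    # "shared-ci" and the seed value are made-up example inputs.
    print(json.dumps(build_binding_response("shared-ci", "example-uid"), indent=2))
```

Hashing the seed keeps the directory name deterministic for a given Run while still looking random to other teams sharing the bucket. A real provider would also need to handle the first download, when the directory does not exist yet.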

In-House Storage Solution

Your company uses an in-house storage solution and exposes an API that teams use to request and release chunks of persistent storage. Your platform team is responsible for CI/CD and needs to use the in-house storage API to make storage available to your various application teams in their pipelines. You write an HTTP server that talks to the in-house storage API to provision and tear down storage for Pipelines and you expose that functionality via a Custom Workspace Binding plugin. Teams then use it like this in their PipelineRuns:

workspaces:
- name: output-workspace
  plugin:
    class: CompanyXStorageProvider
    params:
    - name: region
      value: xxxxx
    - name: performance-class
      value: ssd
    # ... etc ... more params here

These params are then bundled up into an HTTP request and sent by the Tekton Controller to your server (registered against the CompanyXStorageProvider class) when the PipelineRun starts up. Your server in turn coordinates with the in-house storage API to figure out the nuts and bolts. Your server responds with whatever Volumes, injected Steps, Sidecar configuration, ConfigMaps, Secrets, etc. are needed for the Tasks in the Pipeline to correctly access the storage.

One further wrinkle: your HTTP server needs to know when the PipelineRun is done with the piece of storage you exposed so you can tear it down. As part of your initial response payload you return a notification webhook endpoint that should be hit when the PipelineRun is complete, along with a token for the piece of storage that's been claimed. The PipelineRun reconciler hits that endpoint with the token to notify your server that it can now safely release that portion back to the in-house storage API.
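The provision-then-release handshake described above could look roughly like the following Python sketch. It keeps claims in memory and leaves out the HTTP layer entirely. The class and function names, the request and response shapes, the release URL, and the PVC-style volume in the response are all hypothetical; the issue does not define a wire format.

```python
import secrets


def build_request(plugin_class: str, params: list) -> dict:
    """Flatten the binding's name/value params into a request body."""
    return {
        "class": plugin_class,
        "params": {p["name"]: p["value"] for p in params},
    }


class InHouseStorageProvider:
    """Hypothetical provider that tracks claimed storage by token."""

    def __init__(self, release_url: str):
        self._release_url = release_url
        self._claims = {}  # token -> params the storage was claimed with

    def provision(self, params: dict) -> dict:
        # A real server would call the in-house storage API here.
        token = secrets.token_hex(16)
        self._claims[token] = dict(params)
        return {
            # Assumed shape: a volume the TaskRun Pods should mount.
            "volume": {
                "name": "custom-workspace",
                "persistentVolumeClaim": {"claimName": f"claim-{token[:8]}"},
            },
            # Where and how the reconciler reports completion.
            "release": {"url": self._release_url, "token": token},
        }

    def release(self, token: str) -> bool:
        # Returns False for unknown or already released tokens, so a
        # repeated notification from the reconciler is harmless.
        return self._claims.pop(token, None) is not None


if __name__ == "__main__":
    provider = InHouseStorageProvider("https://storage-plugin.example.invalid/release")
    request = build_request("CompanyXStorageProvider", [
        {"name": "region", "value": "xxxxx"},
        {"name": "performance-class", "value": "ssd"},
    ])
    response = provider.provision(request["params"])
    print(provider.release(response["release"]["token"]))
```

Making release idempotent matters here because a reconciler may retry the notification; the second call simply reports that nothing was left to release.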

Related Work

@ghost ghost added the kind/feature Categorizes issue or PR as related to a new feature. label Oct 22, 2020
@bioball
Contributor

bioball commented Oct 22, 2020

Another really cool use-case that I think this addresses is to express artifact publishing as a workspace. For instance: "write your library to this workspace, and it will be published to Artifactory when your build is complete"

This is really useful for platform teams that might want to control where and how publishing happens. For instance, they might not want to expose credentials or signing keys to the users of the platform.

@ghost
Author

ghost commented Nov 4, 2020

In #3467 a user is requesting NFS support in Tekton Workspace Bindings. I've tested this locally against Filestore. It works, but it's more labor-intensive than it ideally should be because a new PersistentVolume has to be created for every single TaskRun. NFS volumes do not support dynamic provisioning out of the box with Kubernetes.
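For a sense of the per-TaskRun overhead, here is a small Python sketch that generates the statically provisioned PersistentVolume and matching PersistentVolumeClaim manifests for one run. The helper name, the naming scheme, and the example server and path are assumptions; the manifest fields themselves are standard Kubernetes ones.

```python
import json


def nfs_volume_manifests(run_name: str, server: str, path: str,
                         size: str = "1Gi") -> tuple:
    """Return (PersistentVolume, PersistentVolumeClaim) dicts for one run."""
    name = f"{run_name}-nfs"  # hypothetical naming scheme
    pv = {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteMany"],
            "persistentVolumeReclaimPolicy": "Retain",
            # An empty class keeps any dynamic provisioner out of the way.
            "storageClassName": "",
            "nfs": {"server": server, "path": path},
        },
    }
    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "storageClassName": "",
            # Bind to the PV created above instead of any matching volume.
            "volumeName": name,
            "resources": {"requests": {"storage": size}},
        },
    }
    return pv, pvc


if __name__ == "__main__":
    # The run name, server address, and export path are made-up examples.
    for manifest in nfs_volume_manifests("taskrun-abc", "10.0.0.2", "/share"):
        print(json.dumps(manifest, indent=2))
```

Both objects have to be created before each TaskRun and cleaned up afterwards, which is the manual work a dynamic provisioner would otherwise take care of.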

There are instructions available for using a third-party provisioner, which requires installing Helm and deploying a new service. A potential use case for an NFS Custom Workspace Binding could be to create an NFS PV for each Run, but I wonder if this is basically just replicating the existing third-party provisioner and whether this issue should be left as a problem for the more general k8s ecosystem.

@ghost
Author

ghost commented Dec 14, 2020

Further to my previous comment, a lot of the use cases that Custom Workspace Bindings might end up providing are already covered by the CSI spec: https://github.com/container-storage-interface/spec/blob/master/spec.md

It doesn't seem to me like reinventing or duplicating some of the CSI spec in Tekton is a great way to go here.

@tekton-robot
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 14, 2021
@ghost
Author

ghost commented Mar 15, 2021

I captured the problem statement from this issue in TEP-0038 but have punted on proposing an implementation or solution at the moment.

/remove-lifecycle stale

@tekton-robot tekton-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 15, 2021
@tekton-robot
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 13, 2021
@tekton-robot
Collaborator

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 13, 2021
@ghost
Author

ghost commented Jul 13, 2021

I think the core of this might be implementable via the pipeline-in-a-pod feature that's under development. For example: composing a test task with a gcs-upload task in a pipeline could be rendered to steps in a single pod with an emptyDir between them.
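As a loose illustration of that idea, the Python sketch below flattens the Steps of several tasks into the containers of one pod-like structure that shares a single emptyDir. This is not how the pipeline-in-a-pod feature is implemented; the function name, the input shape, the mount path, and the placeholder images are assumptions, and Step ordering (which Tekton enforces separately) is ignored.

```python
def render_single_pod(tasks: list, volume_name: str = "shared-workspace") -> dict:
    """Flatten the steps of several tasks into one pod-like dict."""
    containers = []
    for task in tasks:
        for step in task["steps"]:
            containers.append({
                # Prefix with the task name so step names stay unique.
                "name": f'{task["name"]}-{step["name"]}',
                "image": step["image"],
                # Every step sees the same emptyDir at the same path.
                "volumeMounts": [
                    {"name": volume_name, "mountPath": "/workspace"},
                ],
            })
    return {
        "volumes": [{"name": volume_name, "emptyDir": {}}],
        "containers": containers,
    }


if __name__ == "__main__":
    # Task names and images are placeholders for illustration only.
    pod = render_single_pod([
        {"name": "test", "steps": [
            {"name": "run", "image": "example.invalid/tests"},
        ]},
        {"name": "gcs-upload", "steps": [
            {"name": "upload", "image": "example.invalid/gcs-tools"},
        ]},
    ])
    print([c["name"] for c in pod["containers"]])
```

Because the upload step reads from the same emptyDir the test step wrote to, no bucket-aware workspace binding is needed between them.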

In conjunction with the CSI spec, existing persistent volume claim support, and custom tasks / controllers, I think there are a lot of ways already to manage custom persistence stories. I'm going to close this issue for the time being.

@ghost ghost closed this as completed Jul 13, 2021