Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow to use variable replacement when defining resource limits and requests #4080

Open
goldmann opened this issue Jul 8, 2021 · 34 comments
Labels
area/api Indicates an issue or PR that deals with the API. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence.

Comments

@goldmann
Copy link

goldmann commented Jul 8, 2021

Feature request

Consider this example Task:

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: resource-example
spec:
  description: Sample Task to show issues with passing parameters to resources.
  params:
    - name: RESOURCE_MEMORY_REQUEST
      type: string
      default: 1Gi
    - name: RESOURCE_MEMORY_LIMIT
      type: string
      default: 2Gi
  steps:
    - name: noop
      image: registry.access.redhat.com/ubi8/ubi-minimal:8.2
      resources:
        requests:
          memory: $(params.RESOURCE_MEMORY_REQUEST)
        limits:
          memory: $(params.RESOURCE_MEMORY_LIMIT)

When creating such a resource it fails with:

Error from server (BadRequest): error when creating "minimal-resource.yaml": admission webhook "webhook.pipeline.tekton.dev" denied the request: mutation failed: cannot decode incoming new object: quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$'

This suggests that webhook is not able to decipher what is the value. And since the value is correct, it should be allowed to do use it this way.

Use case

Main use case is to dynamically change requests and resources based on the runtime requirements. I consider this as a workaround for general lack of a nice way to do it otherwise. But even the workaround doesn't work currently :)

@goldmann goldmann added the kind/feature Categorizes issue or PR as related to a new feature. label Jul 8, 2021
@bobcatfish bobcatfish changed the title Allow to use expressions when defining resource limits and requests Allow to use variable replacement when defining resource limits and requests Jul 8, 2021
@bobcatfish bobcatfish added good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. area/api Indicates an issue or PR that deals with the API. priority/backlog Higher priority than priority/awaiting-more-evidence. labels Jul 8, 2021
@jcatana
Copy link

jcatana commented Jul 22, 2021

This is very useful for buildah or other image building tasks. Clusters with namespace resource quotas MUST have resource limits and requests defined.
Having to define resources statically on the task level can be annoying, I currently need to modify the task templates, and create buildah-small-1GB-ram, buildah-large-30GB-ram type tasks.
Some image builds can be very trivial and small, while others may include large scale code compilation and need much more resources.
It would be ideal to have this functionality.

@vinamra28
Copy link
Member

/assign

@Duologic
Copy link

keywords for search: variable substitution

@wzhanw
Copy link

wzhanw commented Aug 27, 2021

The container.ResourceRequirements.Limits and Requests were defined as map[ResourceName]resource.Quantity, I think it is not easy to set an invalid string value to it and substitute later without changing the api.

@vinamra28
Copy link
Member

The container.ResourceRequirements.Limits and Requests were defined as map[ResourceName]resource.Quantity, I think it is not easy to set an invalid string value to it and substitute later without changing the api.

Yeah I also figured this out. The resource.Quantity has a regex validator ^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$ which doesn't allows invalid string.

@michaelsauter
Copy link

I came across this as I naively hoped to set the resource constraints via parameters. I think this feature is crucial as the place to decide the resource requirements is more likely the pipeline than the task.

I guess the regex validator would need to allow a parameter reference then?

michaelsauter added a commit to opendevstack/ods-pipeline that referenced this issue Aug 30, 2021
WIP as the docs are out of date, and they cannot be updated because
unmarshalling fails Tekton validation as noted in
tektoncd/pipeline#4080.

[ci skip]
@rporres
Copy link

rporres commented Sep 1, 2021

Looks like a duplicate of #1530, which got closed without resolution

@michaelsauter
Copy link

Good point. Seems like #1548 is about the same problem as well. This is close to being a blocker for reusable tasks ...

@bobcatfish Do you think this could be moved up in priority? Also - how can one help here?

@rporres
Copy link

rporres commented Sep 2, 2021

I've had to write automation to create tasks that only differ in the steps resources, so this would be very welcomed. I guess the problem here is that you have to move somewhere else the resources format check as it cannot be anymore in the Task admission webhook.

@concaf
Copy link
Contributor

concaf commented Sep 13, 2021

Also, this produces a UX bug for tkn as well. I am unable to list valid tasks if an invalid task exists in the cluster.

$ kubectl get tasks
NAME               AGE
echo-hello         9m50s
echo-hello-world   11m

$ tkn task ls
Error: failed to list Tasks from namespace default: quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$'

CC: @piyush-garg WDYT?

@lbernick
Copy link
Member

/assign lbernick

@lbernick
Copy link
Member

lbernick commented Nov 3, 2021

To summarize what previous commenters have said, and add a little more detail:
The reason variable replacement currently works on most string fields is because there is no validation on the strings, so "$(params.foo)" is a valid value for many Task fields. However, "$(params.foo)" isn't a valid resource quantity, and fails on the ParseQuantity method when the Task JSON is unmarshalled into a Go struct.

I'm working on a PR to address this by replacing Step.Resources with a struct where Requests and Limits are maps of ResourceName -> string rather than ResourceName -> Quantity (thanks @bobcatfish for the suggestion!). Unfortunately this will require updates to the Tekton CLI and maybe dashboard, to update references to Task.Container.

Other ways of doing this could be special casing param names (e.g. param "resources.requests.memory" is used to replace memory requests), the "patches" syntax suggested by @skaegi in #1530, or support specifying resource requests in TaskRuns, as requested in #4326.

lbernick added a commit to lbernick/community that referenced this issue Nov 8, 2021
This TEP addresses tektoncd/pipeline#4080 and tektoncd/pipeline#4326 by
proposing new configuration to TaskRuns and PipelineTaskRuns that can
override any Step resource requirements specified in a Task or PipelineTask.
lbernick added a commit to lbernick/community that referenced this issue Nov 8, 2021
This TEP addresses tektoncd/pipeline#4080 and tektoncd/pipeline#4326 by
proposing new configuration to TaskRuns and PipelineTaskRuns that can
override any Step resource requirements specified in a Task or PipelineTask.
lbernick added a commit to lbernick/community that referenced this issue Nov 8, 2021
This TEP addresses tektoncd/pipeline#4080 and tektoncd/pipeline#4326 by
proposing new configuration to TaskRuns and PipelineTaskRuns that can
override any Step resource requirements specified in a Task or PipelineTask.
lbernick added a commit to lbernick/community that referenced this issue Nov 8, 2021
This TEP addresses tektoncd/pipeline#4080 and tektoncd/pipeline#4326 by
proposing new configuration to TaskRuns and PipelineTaskRuns that can
override any Step resource requirements specified in a Task or PipelineTask.
@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 9, 2022
@rporres
Copy link

rporres commented Jun 14, 2022

/remove-lifecycle stale

See comment #4080 (comment)

@dibyom dibyom removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 15, 2022
@ywluogg
Copy link
Contributor

ywluogg commented Jul 6, 2022

cc @ywluogg

@lbernick
Copy link
Member

lbernick commented Jul 6, 2022

However our two main use cases are still unaddressed:

  1. we want to modify resources from pipelines, not only from pipelineruns and taskruns - in order to store pipeline definition in a Git repo and create pipeline runs connected to that pipeline

@michaelsauter have you checked out the remote pipelines feature? this was added in v0.35.0 and sounds a lot like what you're describing.

  1. we want to specify sidecars in full from pipelines, not just their resources (see Allow sidecars to be specified in Pipeline/PipelineRun #4235) - in order to e.g. add database containers as sidecars to integration test tasks

Is it fair to say that pipeline-level sidecars might be better to track in #4235? I'm not sure I understand how this relates to parameterizing resource requirements.

An update since I last commented on this issue: we've updated our compatibility policy to clarify that we don't consider a change to the Go libraries a "breaking change" unless it also affects the yaml a user can specify. This means this issue can be tackled at any time, i.e. not tied to our v1 release.

@michaelsauter
Copy link

@michaelsauter have you checked out the remote pipelines feature? this was added in v0.35.0 and sounds a lot like what you're describing.

I have seen it! Unfortunately it doesn't quite fit what we do right now. We currently have a custom file in each repo which defines the pipeline, but also has other configuration. The pipeline is not described fully to avoid boilerplate - the user just references some tasks instead of providing a full Pipeline resource. We then read the config file and automatically inject a task in the beginning and one in finally. See https://github.com/opendevstack/ods-pipeline/blob/master/docs/ods-configuration.adoc#pipeline.

Is it fair to say that pipeline-level sidecars might be better to track in #4235? I'm not sure I understand how this relates to parameterizing resource requirements.

Totally, #4235 is the right place to track sidecars. You are right that it is unrelated from a tech/feature perspective to the issue here. However I believe it is related on a conceptual level. I believe both sidecars and resource requirements should be defined by the Pipeline(Run) author, not by the Task author. How I am thinking about it, ideally, I would be able to express this:

tasks:
  - name: build
    taskRef:
      kind: Task
      name: build-go
    sidecars:
      - image: busybox
        name: hello-sidecar
        script: |
          echo 'Hello from sidecar!'
    resources:
      limits:
         cpu: 100m
         memory: 128Mi
      requests:
         cpu: 100m
         memory: 128Mi
    workspaces:
    - name: source
      workspace: shared-workspace

@lbernick
Copy link
Member

lbernick commented Jul 7, 2022

The pipeline is not described fully to avoid boilerplate

Just FYI about the new propagated parameters feature and upcoming propagated workspaces feature which are designed to reduce verbosity.

How I am thinking about it, ideally, I would be able to express this:

tasks:
  - name: build
    taskRef:
      kind: Task
      name: build-go
    sidecars:
      - image: busybox
        name: hello-sidecar
        script: |
          echo 'Hello from sidecar!'
    resources:
      limits:
         cpu: 100m
         memory: 128Mi
      requests:
         cpu: 100m
         memory: 128Mi
    workspaces:
    - name: source
      workspace: shared-workspace

Thanks, this example is really helpful. In Tekton we have a concept of "authoring time" (Task/Pipeline) vs "runtime" (TaskRun/PipelineRun) and I think compute resources are something that should be specified at runtime; however, there is some precedent for specifying "runtime" related fields on Pipeline Tasks, e.g. timeout so this could make sense.

Right now, we're implementing task-level resource requirements (example), and as part of the future work for this proposal one thing we are going to consider is deprecating task.spec.steps.resources, task.spec.stepTemplate.resources, and taskRun.spec.stepOverrides. (I didn't mention it in the TEP but task.spec.sidecars.resources also makes less sense to me than taskRun.spec.sidecarOverrides.resources.)

Here's what I'm picturing for our eventual compute resources api (note that none of this is set in stone, it's just my personal opinion about how things should look and will depend on feedback on task-level compute resources; also I'm hoping we can standardize on "computeResources" rather than "resources"):

kind: TaskRun
spec:
  taskRef:
    name: "my-task"
  computeResources:
     requests:
       cpu: 100m
  sidecarOverrides:
    name: my-sidecar
    computeResources:
      memory: 500Mi

or on a PipelineRun:

spec:
  pipelineRef
    name: my-pipeline
  taskRunSpecs:
  - name: my-task
     computeResources:
        cpu: 100m
     sidecarOverrides:
     - name: my-sidecar
        computeResources:
          memory: 500Mi

It could be totally reasonable to have computeResources under Tasks instead of taskRunSpecs, as you suggest. However that seems like more of a syntax difference than new functionality? To clarify, pipelineRun.spec.taskRunSpecs.stepOverrides and pipelineRun.spec.taskRunSpecs.sidecarOverrides exist already-- the addition is pipelineRun.spec.taskRunSpecs.computeResources.

If we do choose to remove task.spec.steps.resources, task.spec.stepTemplate.resources, and taskRun.spec.stepOverrides, I think there's limited value in implementing parameterization of compute resources. I think the best course of action for right now is to continue working on implementation for taskRun.spec.computeResources, get feedback, and then consider our compute resources API as a whole. WDYT? @dibyom @vdemeester curious to get your thoughts as well.

@michaelsauter
Copy link

In Tekton we have a concept of "authoring time" (Task/Pipeline) vs "runtime" (TaskRun/PipelineRun) and I think compute resources are something that should be specified at runtime

While I generally agree with the distinction, I think that Task and Pipeline could also be seen as being in two different "times". Tasks are the reusable building blocks, therefore they are authored "earlier" then the pipelines, and the pipelines may want to customise the tasks they are using. For me the customisation of compute resources and sidecars is comparable to passing parameters ...

We have been thinking to simply create PipelineRuns instead of Pipelines from our config file, in order to move into the "runtime", but the downside of that is that e.g. grouping of runs in the UI is lost (we're using the OpenShift console).

To clarify, pipelineRun.spec.taskRunSpecs.stepOverrides and pipelineRun.spec.taskRunSpecs.sidecarOverrides exist already-- the addition is pipelineRun.spec.taskRunSpecs.computeResources.

My understanding though is that sidecarOverrides only allows to override the compute resources of a sidecar, while our goal is to define completely new sidecars (e.g. defining image, env vars etc.). The biggest use case for us is running integration tests against a test database. The task is pretty generic and just invokes the test framework, but each repo may use a slightly different database setup - so ideally it could define the sidecar. And similarly, it would be great to define compute resources because the tests may consume a different amount of e.g. memory, depending on the repo.

@lbernick
Copy link
Member

lbernick commented Jul 8, 2022

Thanks Michael for continuing to help me understand what you're looking for here! I think what you're requesting is actually a bit different from what the original author of this issue was requesting, so I opened #5110 to track your FR. PTAL and feel free to add any details I missed. That along with #4235 should cover your FRs IMO, so I'm going to close this issue, but feel free to reopen if you're specifically looking for parameterization of resource requirements.

@lbernick lbernick closed this as completed Jul 8, 2022
Repository owner moved this from In Progress to Done in Pipelines V1 Jul 8, 2022
@JeromeJu JeromeJu removed their assignment Jul 11, 2022
@leoffj
Copy link

leoffj commented Aug 2, 2023

I would actually love to see this issue reopened.
Quick description of our use-case:
We have general purpose pipelines used by multiple teams on different repositories: building images from docker files, building .net apps, building frontends with npm, ...
Depending on the repository the compile/build step can have very different resource requirements.
We already allow teams to write some other configuration for their project into a file in the repository. If we could override the resource requirements and limits via substitution from a different task's result, we could enable teams to specify individual resource requirements per project and still run the same pipeline.

@mike-sirs
Copy link

I would actually love to see this issue reopened. Quick description of our use-case: We have general purpose pipelines used by multiple teams on different repositories: building images from docker files, building .net apps, building frontends with npm, ... Depending on the repository the compile/build step can have very different resource requirements. We already allow teams to write some other configuration for their project into a file in the repository. If we could override the resource requirements and limits via substitution from a different task's result, we could enable teams to specify individual resource requirements per project and still run the same pipeline.

I guess you can just use stepOverride

@lbernick
Copy link
Member

lbernick commented Aug 2, 2023

@leoffj Could you let me know if taskRun.spec.stepSpecs.computeResources or taskRun.spec.computeResources address your use case? #5489 and #5490 have been created to collect feedback on these features.

@leoffj
Copy link

leoffj commented Aug 7, 2023

I am not really sure how that works.. At which point do you specify the stepOverride and when is it evaluated?
Currently we simply create the pipeline run with a trigger template. If I define the taskRun.spec.computeResources there as a variable, doesn't the substitution happen immediately? My values are only available after cloning the repository and reading them from the file.

Or were you thinking along the lines of just starting a task run from the trigger template, which then fetches the file from our git server and then creates a pipeline run with the proper parameters set?

@lbernick
Copy link
Member

lbernick commented Aug 7, 2023

I was thinking you could specify the pipelinerun in your triggertemplate like this:

apiVersion: tekton.dev/v1
kind: PipelineRun
spec:
  taskRunSpecs:  # taskRunTemplate could also be used here
  - pipelineTaskName: expensive-build-step
     stepSpecs:
     - name: build
        computeResources:
           ...

To clarify, do you store compute resource configuration by itself in the repo, and want your PipelineRun to read that info in one TaskRun and spawn subsequent TaskRuns with the right compute resources? Or are you interested in specifying compute resource configuration on Tekton resources stored in a repo?

I'm not sure how we'd support the former feature but it's not currently supported. (There may be some hacks like having your taskrun call the k8s api to create other taskruns, but this obviously isn't a great ux.) For the latter feature, you might be interested in remote resolution in Triggers (tektoncd/triggers#1539).

@leoffj
Copy link

leoffj commented Aug 8, 2023

We have the first use case. We set it up such that pipelines and tasks are in one repository, and lots of different projects use these pipelines. These projects are e.g. dotnet backends, and have their own repository. We already allow the project teams to place a configuration file in their repositories, which we then evaluate during the pipelinerun (with a task simply reading values from this file, which was previously fetched via git clone) and store them as results of the task. Follow up tasks then reference these results e.g. for the image that should be used, or to determine if certain other steps should be taken.
The idea was, to allow teams to write the compute resources as another parameter to this file. Sadly this does not work out as resource limits and requests do not allow replacement.
Essentially, what I would need is something like

apiVersion: tekton.dev/v1
kind: PipelineRun
spec:
  taskRunSpecs:  # taskRunTemplate could also be used here
  - pipelineTaskName: expensive-build-step
     stepSpecs:
     - name: build
        computeResources:
           requests:
             memory: $(tasks.get-params.results.requests_memory)
             cpu: $(tasks.get-params.results.requests_cpu)
           limits:
             memory: $(tasks.get-params.results.limits_memory)
             cpu: $(tasks.get-params.results.limits_cpu)

but (without having tested it) I think this wont work as those results will only be around after the get-params task was executed.

@lbernick
Copy link
Member

lbernick commented Aug 8, 2023

@leoffj that's correct. It sounds like this is a use case for parameterizing resource requirements that's not met by existing/substitute features, so I'll reopen this issue. Thanks!

@tekton-robot
Copy link
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 6, 2023
@vdemeester
Copy link
Member

/lifecycle frozen

@tekton-robot tekton-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 7, 2023
@alanmoment
Copy link

Really, Really need this function, please.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/api Indicates an issue or PR that deals with the API. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence.
Projects
Status: Done
Development

No branches or pull requests