Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Question] Setting 'suspend' parameter in Argo Cron workflow based on overlay #3449

Closed
Step2Web opened this issue Jan 13, 2021 · 8 comments
Closed
Assignees
Labels
area/api issues for api module kind/bug Categorizes issue or PR as related to a bug. kind/support Categorizes issue or PR as a support question. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@Step2Web
Copy link

Hey guys,

We're using Argo Cron Workflows for our daily load pipeline into our data warehouse. We have several environments which are configured via overlays and we have environments like prod, dev, uat, etc.

Argo Cron Workflows have a field called suspend, which determines if a workflow should be started based on provided cron schedule. I would like to suspend the cron workflow based on the environemnt (only have it execute in prod). Currently, we just disabled the deployment but I think the suspend paramter is a better solution.

What I was thinking is to overwrite the suspend (true / false) parameter in the overlay like this:

Argo Cron Workflow

apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  name: cron-core-load-workflow
spec:
  schedule: "45 2 * * *"
  timezone: "Europe/Vienna"
  concurrencyPolicy: Forbid
  suspend: $(SUSPENDED)
  workflowMetadata:
    labels:
      workflowName: core-load-workflow
  workflowSpec:
    workflowTemplateRef:
      name: core-load-pipeline
    arguments:
      parameters:
        - name: dbt_target
          value: $(DBT_TARGET)`
....

My kustomization file looks like this:

kind: Kustomization

resources:
 - ....

configurations:
  - kustomization-config.yaml

configMapGenerator:
  - name: kustomize-vars
    envs:
      - vars.env

vars:
  - name: DBT_TARGET
    objref: &config-map-ref
      kind: ConfigMap
      name: kustomize-vars
      apiVersion: v1
    fieldref:
      fieldpath: data.DBT_TARGET
...
  - name: SUSPENDED
    objref: *config-map-ref
    fieldref:
      fieldpath: data.SUSPENDED

kustomization-config.yaml looks like this

nameReference:
  - kind: ConfigMap
    version: v1
    fieldSpecs:
      - kind: CronWorkflow
        version: v1alpha1
        path: spec/workflowSpec/arguments/parameters/value

varReference:
  - path: spec/workflowSpec/arguments/parameters/value
    kind: CronWorkflow
    apiVersion: argoproj.io/v1alpha1

  - path: spec
    kind: CronWorkflow
    apiVersion: argoproj.io/v1alpha1

in overlays/dev/vars.env I have:

DBT_TARGET=dev
SUSPENDED=True

However, when I execute kustomize build overlays/dev (output below), the True parameter is wrapped in quotes, which breaks parsing of the workflow definition in argo:

kustomize build overlays/dev 

apiVersion: v1
data:
  DBT_TARGET: dev
  SUSPENDED: true
kind: ConfigMap
metadata:
  labels:
    app: core-load-workflow-dev
  name: kustomize-vars-mhfh92dk78
---
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  labels:
    app: core-load-workflow-dev
  name: cron-core-load-workflow
spec:
  concurrencyPolicy: Forbid
  schedule: 45 2 * * *
  suspend: "false"
  timezone: Europe/Vienna
  workflowMetadata:
    labels:
      workflowName: core-load-workflow
  workflowSpec:
    arguments:
      parameters:
      - name: dbt_target
        value: dev

When I then try to use kubectl to apply the kustomized template, argo shows the following error:

argo cron delete clean-test-databases-cron-workflow
FATA[0004] v1alpha1.CronWorkflow.Spec: v1alpha1.CronWorkflowSpec.Suspend: ReadBool: expect t or f, but found ", error found in #10 byte of ...|suspend":"true","tim|..., bigger context ...|y":"Forbid","schedule":"00 08 * * 1-5","suspend":"true","timezone":"Europe/Vienna","workflowMetadata|...

Without the suspend parameter, it works fine.

I'd appreciate any ideas how I could solve this. I could probably just overwrite the cron schedule to something that will never execute but I feel like this is a very hacky solution.

Thanks!

@Shell32-Natsu
Copy link
Contributor

We have some quote issues in 3.9.x version #3412. Can you try 3.8.x version and see if the issue still happens?

@Shell32-Natsu Shell32-Natsu added area/api issues for api module kind/support Categorizes issue or PR as a support question. triage/needs-information Indicates an issue needs more information in order to work on it. labels Jan 13, 2021
@Step2Web
Copy link
Author

Thanks for your quick response! I actually tried it with two different versions:

kustomize version
{Version:kustomize/v3.8.4 GitCommit:8285af8cf11c0b202be533e02b88e114ad61c1a9 BuildDate:2020-09-19T15:39:21Z GoOs:linux GoArch:amd64}


kustomize version
{Version:kustomize/v3.9.1 GitCommit:7439f1809e5ccd4677ed52be7f98f2ad75122a93 BuildDate:2020-12-29T19:25:35Z GoOs:linux GoArch:amd64}

Same result in both versions.

@Shell32-Natsu Shell32-Natsu added kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed triage/needs-information Indicates an issue needs more information in order to work on it. labels Jan 13, 2021
@monopole monopole self-assigned this Jan 16, 2021
monopole added a commit that referenced this issue Jan 17, 2021
@monopole
Copy link
Contributor

monopole commented Jan 17, 2021

Still open. See test in #3472 for a possible workaround (source the var from something other than a configmap, as that can only hold strings).

A code fix could be, when doing a var replacement, see if the origin was a configmap, and if so, be skeptical of the string type.

If the target of the replacement is not a string type, then attempt to parse the replacement value as that type (e.g. if the target field has a bool tag, then try to parse the incoming string ("True", "Yes", etc.) as a true.

The trouble is that the target field is probably tagged as a string, even if it should be a boolean, because it holds, temporarily, a string with a dollar variable. In the case of the test in #3472, the field is

spec:
  suspend: $(SUSPENDED)

This has the tag str, not bool, by the time the var replacer is making its pass.

Vars (dollar variables) have always been a problem-causing hack in kustomize and they need to be deprecated (#2052).

We need a ReplacementTransformer (started here) that replaces a bool for a bool, an int for an int, etc., not needing these string targets sprinkled in the yaml.

@Step2Web
Copy link
Author

Thank you for the explanation! Yeah, I agree, just guessing the type of the value isn't ideal either and will likely cause other issues. I'm not very familiar with kustomize but my naive solution would have been making types explicit in the configmap, something like this:

  - name: SUSPENDED
    type: bool
    objref: *config-map-ref
    fieldref:
      fieldpath: data.SUSPENDED

No idea if this causes other issues or if this is a major change in how ConfigMaps work.

But your suggestion of a ReplacementTransformer looks interesting too, I will have a look into it!

I also haven't had time to try your suggested workaround but I will report back as soon as I get to it. Thank you again for your help here!

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 21, 2021
@fejta-bot
Copy link

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 21, 2021
@fejta-bot
Copy link

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

@k8s-ci-robot
Copy link
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/api issues for api module kind/bug Categorizes issue or PR as related to a bug. kind/support Categorizes issue or PR as a support question. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

No branches or pull requests

5 participants