workflow-controller panic when Stop\Terminate a workflow using a plugin #9587

Closed
5 tasks done
xlgao-zju opened this issue Sep 14, 2022 · 2 comments · Fixed by #9690

Checklist

  • Double-checked my configuration.
  • Tested using :latest images.
  • Attached the smallest workflow that reproduces the issue.
  • Attached logs from the workflow controller.
  • Attached logs from the wait container.

Summary

What happened/what you expected to happen?
The workflow-controller panics when I stop or terminate a workflow that uses an executor plugin.

What version are you running?
v3.3.9

Diagnostics

Paste the smallest workflow that reproduces the bug. We must be able to run the workflow.
The WorkflowTemplate YAML looks like this:

apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: hello-template
  namespace: argo
spec:
  serviceAccountName: chaosmesh-executor-plugin
  automountServiceAccountToken: true
  entrypoint: main
  templates:
    - name: main
      executor:
        serviceAccountName: chaosmesh-executor-plugin
      plugin:
        chaosMesh:
          order: 1
          duration: 60000
          taskUid: '{{workflow.parameters.task-uid}}'
          pods:
            - name: nginx
              namespace: argo
          parentUid: '{{workflow.parameters.parent-uid}}'
          chaosKind: StressChaos
          chaosBody:
            kind: StressChaos
            apiVersion: chaos-mesh.org/v1alpha1
            metadata:
              namespace: argo
              name: '{{workflow.parameters.task-uid}}'
            spec:
              selector:
                namespaces:
                  - argo
                labelSelectors:
                  run: nginx
              mode: all
              stressors:
                cpu:
                  workers: 1
                  load: 50

The Workflow YAML looks like this:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-
  namespace: argo
spec:
  serviceAccountName: chaosmesh-executor-plugin
  automountServiceAccountToken: true
  entrypoint: main
  arguments:
    parameters:
      - name: task-uid
        value: foo
      - name: parent-uid
        value: bar
  templates:
    - name: main
      steps:
      - - name: main
          templateRef:
            name: test666
            template: main

# Logs from the workflow controller:
kubectl logs -n argo deploy/workflow-controller | grep ${workflow}

time="2022-09-14T06:55:52.267Z" level=info msg="Processing workflow" namespace=argo workflow=test
time="2022-09-14T06:55:52.267Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=0 workflow=test
time="2022-09-14T06:55:52.267Z" level=info msg=updateAgentPodStatus namespace=argo workflow=test
time="2022-09-14T06:55:52.267Z" level=info msg=assessAgentPodStatus namespace=argo podName=test-1340600742-agent
time="2022-09-14T06:55:52.267Z" level=info msg="Terminating pod as part of workflow shutdown" namespace=argo podName=test-1340600742-agent shutdownStrategy=Terminate workflow=test
time="2022-09-14T06:55:52.268Z" level=debug msg="Evaluating node test: template: *v1alpha1.WorkflowStep (main), boundaryID: " namespace=argo workflow=test
time="2022-09-14T06:55:52.268Z" level=debug msg="Resolving the template" base="*v1alpha1.Workflow (namespace=argo,name=test)" depth=0 tmpl="*v1alpha1.WorkflowStep (main)"
time="2022-09-14T06:55:52.268Z" level=debug msg="Getting the template" base="*v1alpha1.Workflow (namespace=argo,name=test)" depth=0 tmpl="*v1alpha1.WorkflowStep (main)"
time="2022-09-14T06:55:52.268Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo,name=test)" depth=0 tmpl="*v1alpha1.WorkflowStep (main)"
time="2022-09-14T06:55:52.268Z" level=debug msg="Executing node test of Steps is Running" namespace=argo workflow=test
time="2022-09-14T06:55:52.268Z" level=debug msg="Evaluating node test[0].main: template: *v1alpha1.WorkflowStep (test666/main#false), boundaryID: test" namespace=argo workflow=test
time="2022-09-14T06:55:52.268Z" level=debug msg="Resolving the template" base="*v1alpha1.Workflow (namespace=argo,name=test)" depth=0 tmpl="*v1alpha1.WorkflowStep (test666/main#false)"
time="2022-09-14T06:55:52.268Z" level=debug msg="Found stored template" base="*v1alpha1.Workflow (namespace=argo,name=test)" depth=0 tmpl="*v1alpha1.WorkflowStep (test666/main#false)"
panic: workflow 'test' node '' uninitialized when marking as Failed: [workflow shutdown with strategy: Terminate]

goroutine 312 [running]:
github.com/argoproj/argo-workflows/v3/workflow/controller.(*wfOperationCtx).markNodePhase(0xc0008ec160, {0x0, 0x0}, {0x214a9a6, 0x6}, {0xc000be6650?, 0x1, 0x1})
/go/src/github.com/argoproj/argo-workflows/workflow/controller/operator.go:2246 +0xaba
github.com/argoproj/argo-workflows/v3/workflow/controller.(*wfOperationCtx).handleExecutionControlError(0xc0008ec160, {0xc000918310, 0xf}, 0xc0000e2240, {0xc0001561e0, 0x2b})
/go/src/github.com/argoproj/argo-workflows/workflow/controller/exec_control.go:78 +0x187
github.com/argoproj/argo-workflows/v3/workflow/controller.(*wfOperationCtx).applyExecutionControl(0xc0008ec160, 0xc0007a2000, 0x0?)
/go/src/github.com/argoproj/argo-workflows/workflow/controller/exec_control.go:44 +0x612
github.com/argoproj/argo-workflows/v3/workflow/controller.(*wfOperationCtx).podReconciliation.func2(0x0?)
/go/src/github.com/argoproj/argo-workflows/workflow/controller/operator.go:1033 +0x97
created by github.com/argoproj/argo-workflows/v3/workflow/controller.(*wfOperationCtx).podReconciliation
/go/src/github.com/argoproj/argo-workflows/workflow/controller/operator.go:1030 +0x215
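
Reading the trace: markNodePhase receives an empty node name (the `{0x0, 0x0}` argument, matching `node ''` in the panic message) while the controller is terminating the agent pod test-1340600742-agent. A plausible reading, not confirmed here, is that the agent pod does not map to any workflow node, so there is nothing to mark as Failed. The sketch below illustrates that failure mode and one possible guard. It is a simplified model, not the Argo Workflows code and not necessarily the change merged in #9690; `nodeStore`, `failOnShutdown` and `errUninitialized` are hypothetical names, and the sketch returns an error where the controller panics.

```go
package main

import (
	"errors"
	"fmt"
)

// errUninitialized mirrors the condition named in the panic message.
var errUninitialized = errors.New("uninitialized")

// nodeStore is a hypothetical stand-in for the workflow's node status map:
// node name -> phase.
type nodeStore map[string]string

// markNodePhase updates the phase of an existing node. It returns an error
// for an unknown node, where the controller in the trace panics instead.
func (s nodeStore) markNodePhase(name, phase string) error {
	if _, ok := s[name]; !ok {
		return fmt.Errorf("node %q %w when marking as %s", name, errUninitialized, phase)
	}
	s[name] = phase
	return nil
}

// failOnShutdown marks the node behind a pod as Failed during shutdown.
// An empty nodeName models a pod with no node of its own (assumed here to
// be the agent pod); such a pod is skipped rather than marked.
func (s nodeStore) failOnShutdown(nodeName string) (bool, error) {
	if nodeName == "" {
		return false, nil
	}
	if err := s.markNodePhase(nodeName, "Failed"); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	nodes := nodeStore{"test[0].main": "Running"}

	// The unguarded call with an empty name hits the error condition.
	err := nodes.markNodePhase("", "Failed")
	fmt.Println(errors.Is(err, errUninitialized), err)

	// Guarded calls: the agent-like pod is skipped, the real node is failed.
	fmt.Println(nodes.failOnShutdown(""))
	fmt.Println(nodes.failOnShutdown("test[0].main"))
	fmt.Println(nodes["test[0].main"])
}
```

The guard sits in the caller rather than in markNodePhase, so that marking an unknown node still surfaces as an error elsewhere instead of being silently ignored.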

# Logs from your workflow's wait container, something like:
kubectl logs -c wait -l workflows.argoproj.io/workflow=${workflow},workflow.argoproj.io/phase!=Succeeded

Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

@chenyangxueHDU

+1

chenyangxueHDU pushed a commit to chenyangxueHDU/argo that referenced this issue Sep 29, 2022
…oproj#9587

Signed-off-by: yangxue.chen <chenyangxuehdu@126.com>
alexec pushed a commit that referenced this issue Sep 29, 2022
… (#9690)

Signed-off-by: yangxue.chen <chenyangxuehdu@126.com>

Signed-off-by: yangxue.chen <chenyangxuehdu@126.com>
Co-authored-by: yangxue.chen <chenyangxuehdu@126.com>
juchaosong pushed a commit to juchaosong/argo-workflows that referenced this issue Nov 3, 2022
…oproj#9587 (argoproj#9690)

Signed-off-by: yangxue.chen <chenyangxuehdu@126.com>

Signed-off-by: yangxue.chen <chenyangxuehdu@126.com>
Co-authored-by: yangxue.chen <chenyangxuehdu@126.com>
Signed-off-by: juchao <juchao@coscene.io>