
Cannot view logs in UI for multi-template workflows #6865

Closed
sampierson opened this issue Oct 5, 2021 · 7 comments · Fixed by #6925
@sampierson

Summary

When attempting to view the log for a workflow with multiple templates, you are presented with a blank log.

What happened/what you expected to happen?

If I run the "Quick Start" workflow at https://raw.githubusercontent.com/argoproj/argo-workflows/master/examples/hello-world.yaml, I can view the container log through the Argo UI.

However if I use a workflow with multiple templates, like the first example from https://argoproj.github.io/argo-workflows/workflow-templates/, when I attempt to view the log for the "hello" step, I get a blank log.

Digging into the browser console, I see that for workflow steps-grsvt, node hello, the UI attempts to GET the log from https://<hostname>/api/v1/workflows/argo/steps-grsvt/log?logOptions.container=main&grep=&logOptions.follow=true&podName=steps-grsvt-1882255119. Experimenting with curl on the command line, I can retrieve the log if I change the podName in the URL from steps-grsvt-1882255119 to steps-grsvt-whalesay-1882255119:

curl -H "Authorization: "`argo auth token` "https://$HOSTNAME/api/v1/workflows/argo/steps-grsvt/log?logOptions.container=main&grep=&logOptions.follow=true&podName=steps-grsvt-whalesay-1882255119"
{"result":{"content":" ________ ","podName":"steps-grsvt-whalesay-1882255119"}}
{"result":{"content":"\u003c hello1 \u003e","podName":"steps-grsvt-whalesay-1882255119"}}
{"result":{"content":" -------- ","podName":"steps-grsvt-whalesay-1882255119"}}
{"result":{"content":"    \\","podName":"steps-grsvt-whalesay-1882255119"}}
{"result":{"content":"     \\","podName":"steps-grsvt-whalesay-1882255119"}}
{"result":{"content":"      \\     ","podName":"steps-grsvt-whalesay-1882255119"}}
{"result":{"content":"                    ##        .            ","podName":"steps-grsvt-whalesay-1882255119"}}
{"result":{"content":"              ## ## ##       ==            ","podName":"steps-grsvt-whalesay-1882255119"}}
{"result":{"content":"           ## ## ## ##      ===            ","podName":"steps-grsvt-whalesay-1882255119"}}
{"result":{"content":"       /\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"___/ ===        ","podName":"steps-grsvt-whalesay-1882255119"}}
{"result":{"content":"  ~~~ {~~ ~~~~ ~~~ ~~~~ ~~ ~ /  ===- ~~~   ","podName":"steps-grsvt-whalesay-1882255119"}}
{"result":{"content":"       \\______ o          __/            ","podName":"steps-grsvt-whalesay-1882255119"}}
{"result":{"content":"        \\    \\        __/             ","podName":"steps-grsvt-whalesay-1882255119"}}
{"result":{"content":"          \\____\\______/   ","podName":"steps-grsvt-whalesay-1882255119"}}

What version of Argo Workflows are you running?

3.1.12

Diagnostics

Either a workflow that reproduces the bug, or paste your whole workflow YAML, including status, e.g.:

kubectl get workflow steps-grsvt -o yaml

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  creationTimestamp: "2021-10-04T23:10:12Z"
  generateName: steps-
  generation: 3
  labels:
    workflows.argoproj.io/completed: "true"
    workflows.argoproj.io/phase: Succeeded
  name: steps-grsvt
  namespace: argo
  resourceVersion: "147693045"
  selfLink: /apis/argoproj.io/v1alpha1/namespaces/argo/workflows/steps-grsvt
  uid: a701b4e3-2f8a-455d-a5a6-a2d1419afbcd
spec:
  arguments: {}
  entrypoint: hello
  templates:
  - inputs: {}
    metadata: {}
    name: hello
    outputs: {}
    steps:
    - - arguments:
          parameters:
          - name: message
            value: hello1
        name: hello
        template: whalesay
  - container:
      args:
      - '{{inputs.parameters.message}}'
      command:
      - cowsay
      image: docker/whalesay
      name: ""
      resources: {}
    inputs:
      parameters:
      - name: message
    metadata: {}
    name: whalesay
    outputs: {}
status:
  artifactRepositoryRef:
    artifactRepository: {}
    default: true
  conditions:
  - status: "False"
    type: PodRunning
  - status: "True"
    type: Completed
  finishedAt: "2021-10-04T23:10:22Z"
  nodes:
    steps-grsvt:
      children:
      - steps-grsvt-511517045
      displayName: steps-grsvt
      finishedAt: "2021-10-04T23:10:22Z"
      id: steps-grsvt
      name: steps-grsvt
      outboundNodes:
      - steps-grsvt-1882255119
      phase: Succeeded
      progress: 1/1
      resourcesDuration:
        cpu: 2
        memory: 2
      startedAt: "2021-10-04T23:10:12Z"
      templateName: hello
      templateScope: local/steps-grsvt
      type: Steps
    steps-grsvt-511517045:
      boundaryID: steps-grsvt
      children:
      - steps-grsvt-1882255119
      displayName: '[0]'
      finishedAt: "2021-10-04T23:10:22Z"
      id: steps-grsvt-511517045
      name: steps-grsvt[0]
      phase: Succeeded
      progress: 1/1
      resourcesDuration:
        cpu: 2
        memory: 2
      startedAt: "2021-10-04T23:10:12Z"
      templateScope: local/steps-grsvt
      type: StepGroup
    steps-grsvt-1882255119:
      boundaryID: steps-grsvt
      displayName: hello
      finishedAt: "2021-10-04T23:10:15Z"
      hostNodeName: gke-cloud-training-dev-system-nodes-e86038e5-sgkn
      id: steps-grsvt-1882255119
      inputs:
        parameters:
        - name: message
          value: hello1
      name: steps-grsvt[0].hello
      outputs:
        exitCode: "0"
      phase: Succeeded
      progress: 1/1
      resourcesDuration:
        cpu: 2
        memory: 2
      startedAt: "2021-10-04T23:10:12Z"
      templateName: whalesay
      templateScope: local/steps-grsvt
      type: Pod
  phase: Succeeded
  progress: 1/1
  resourcesDuration:
    cpu: 2
    memory: 2
  startedAt: "2021-10-04T23:10:12Z"

What Kubernetes provider are you using?
GKE v1.19.13-gke.701

What executor are you running? Docker/K8SAPI/Kubelet/PNS/Emissary:
I'm not sure. Whatever the default is.

# Logs from the workflow controller:
kubectl logs -n argo deploy/workflow-controller | grep steps-grsvt
time="2021-10-04T23:10:12.201Z" level=info msg="Processing workflow" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:12.205Z" level=info msg="Updated phase  -> Running" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:12.206Z" level=info msg="Steps node steps-grsvt initialized Running" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:12.206Z" level=info msg="StepGroup node steps-grsvt-511517045 initialized Running" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:12.207Z" level=info msg="Pod node steps-grsvt-1882255119 initialized Pending" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:12.239Z" level=info msg="Created pod: steps-grsvt[0].hello (steps-grsvt-whalesay-1882255119)" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:12.239Z" level=info msg="Workflow step group node steps-grsvt-511517045 not yet completed" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:12.239Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:12.239Z" level=info msg=reconcileAgentPod namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:12.255Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=147692963 workflow=steps-grsvt
time="2021-10-04T23:10:22.245Z" level=info msg="Processing workflow" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:22.245Z" level=info msg="Updating node steps-grsvt-1882255119 exit code 0" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:22.245Z" level=info msg="Updating node steps-grsvt-1882255119 status Pending -> Succeeded" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:22.246Z" level=info msg="Step group node steps-grsvt-511517045 successful" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:22.246Z" level=info msg="node steps-grsvt-511517045 phase Running -> Succeeded" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:22.246Z" level=info msg="node steps-grsvt-511517045 finished: 2021-10-04 23:10:22.246827949 +0000 UTC" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:22.247Z" level=info msg="Outbound nodes of steps-grsvt-1882255119 is [steps-grsvt-1882255119]" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:22.247Z" level=info msg="Outbound nodes of steps-grsvt is [steps-grsvt-1882255119]" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:22.247Z" level=info msg="node steps-grsvt phase Running -> Succeeded" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:22.247Z" level=info msg="node steps-grsvt finished: 2021-10-04 23:10:22.247047835 +0000 UTC" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:22.247Z" level=info msg="Checking daemoned children of steps-grsvt" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:22.247Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:22.247Z" level=info msg=reconcileAgentPod namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:22.247Z" level=info msg="Updated phase Running -> Succeeded" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:22.247Z" level=info msg="Marking workflow completed" namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:22.247Z" level=info msg="Checking daemoned children of " namespace=argo workflow=steps-grsvt
time="2021-10-04T23:10:22.260Z" level=info msg="Workflow update successful" namespace=argo phase=Succeeded resourceVersion=147693045 workflow=steps-grsvt
time="2021-10-04T23:10:22.266Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/steps-grsvt-whalesay-1882255119/labelPodCompleted

# The workflow's pods that are problematic:
kubectl logs -c wait -l workflows.argoproj.io/workflow=steps-grsvt,workflow.argoproj.io/phase!=Succeeded
time="2021-10-04T23:10:15.469Z" level=info msg="listed containers" containers="map[main:{6cf6dbcc7ba905ee50f28c251c16ad587d21bf92c1ea8a3b67eb6c08c13d0c7f Exited {0 63768985813 <nil>}} wait:{129bf08ba92e5f1ae5185a4fb9f9d56788848d8163b67d484cddd9c2ff35a59c Up {0 63768985813 <nil>}}]"
time="2021-10-04T23:10:15.469Z" level=info msg="Main container completed"
time="2021-10-04T23:10:15.469Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2021-10-04T23:10:15.469Z" level=info msg="No output parameters"
time="2021-10-04T23:10:15.469Z" level=info msg="No output artifacts"
time="2021-10-04T23:10:15.469Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=argo --filter=label=io.kubernetes.pod.name=steps-grsvt-whalesay-1882255119"
time="2021-10-04T23:10:15.477Z" level=info msg="docker ps --all --no-trunc --format={{.Status}}|{{.Label \"io.kubernetes.container.name\"}}|{{.ID}}|{{.CreatedAt}} --filter=label=io.kubernetes.pod.namespace=argo --filter=label=io.kubernetes.pod.name=steps-grsvt-whalesay-1882255119"
time="2021-10-04T23:10:15.504Z" level=info msg="listed containers" containers="map[main:{6cf6dbcc7ba905ee50f28c251c16ad587d21bf92c1ea8a3b67eb6c08c13d0c7f Exited {0 63768985813 <nil>}} wait:{129bf08ba92e5f1ae5185a4fb9f9d56788848d8163b67d484cddd9c2ff35a59c Up {0 63768985813 <nil>}}]"
time="2021-10-04T23:10:15.504Z" level=info msg="Killing sidecars []"
time="2021-10-04T23:10:15.504Z" level=info msg="Alloc=4163 TotalAlloc=9366 Sys=72785 NumGC=3 Goroutines=9"


Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

@alexec alexec added this to the v3.2 milestone Oct 5, 2021
@alexec
Contributor

alexec commented Oct 5, 2021

The bug is here:

const [podName, setPodName] = useState(nodeId || '');

It assumes that the pod's name is the same as the node ID, but this is no longer true in v3.2. I think we need to replicate this logic:

func PodName(workflowName, nodeName, templateName, nodeID string) string {
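A hypothetical sketch of that logic, inferred only from the names observed in this issue (node ID steps-grsvt-1882255119 vs. pod name steps-grsvt-whalesay-1882255119) — not the actual Argo source. It assumes the node ID has the form <workflowName>-<hashSuffix> and that the pod name splices the template name in between; the constants and helper names here are illustrative:

```go
package main

import (
	"fmt"
	"strings"
)

const (
	// Assumed limits: Kubernetes resource names are capped at 253
	// characters, and we reserve room for the trailing hash suffix.
	maxK8sResourceNameLength = 253
	k8sNamingHashLength      = 10
)

// podName derives a v3.2-style pod name from the workflow name,
// template name, and node ID by reusing the node ID's trailing hash
// as the pod-name suffix. Purely a sketch for discussion.
func podName(workflowName, templateName, nodeID string) string {
	suffix := strings.TrimPrefix(nodeID, workflowName+"-")
	prefix := workflowName + "-" + templateName
	if max := maxK8sResourceNameLength - k8sNamingHashLength - 1; len(prefix) > max {
		prefix = prefix[:max]
	}
	return prefix + "-" + suffix
}

func main() {
	// Reproduces the pod name observed in this issue.
	fmt.Println(podName("steps-grsvt", "whalesay", "steps-grsvt-1882255119"))
	// → steps-grsvt-whalesay-1882255119
}
```

If the UI computed the pod name this way instead of passing the node ID straight through, the log request in the report above would hit the correct pod.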

Would you be interested in providing a fix? Maybe @JPZ13 can help

@JPZ13
Member

JPZ13 commented Oct 5, 2021

@sampierson happy to help or take on the issue once I get back from my trip tomorrow. Let me know!

@alexec alexec removed the triage label Oct 6, 2021
@alexec
Contributor

alexec commented Oct 6, 2021

I have assigned it to you. @sarabala1979 this will be a critical issue for most users, so we should look to get it into v3.2.0 GA.

@sampierson
Author

Workaround:
Using v3.1.13, I edited namespace-install.yaml and changed the Docker image versions referenced in the Deployments from latest to v3.1.13. This fixed the problem.

Using latest for Docker image versions in versioned product releases is not a best practice: installing a versioned artifact should produce repeatable results. The namespace-install.yaml for v3.1.13 should reference the v3.1.13 Docker images. Had that been the case, I wouldn't even have noticed this issue.

@sampierson
Author

@JPZ13 no hurry! I have an easy workaround now.

@alexec
Contributor

alexec commented Oct 12, 2021

@sarabala1979 can I ask you or @whynowy to fix this issue as I think that @JPZ13 is too busy.

@JPZ13
Member

JPZ13 commented Oct 12, 2021

Sorry @alexec. I got tied up with some of the Kubecon travel
