
Unable to view log if pod gets killed due to activeDeadlineSeconds #7081

Closed · tczhao opened this issue Oct 27, 2021 · 6 comments · Fixed by #7093
@tczhao
Member

tczhao commented Oct 27, 2021

Summary

What happened/what you expected to happen?

With emissary:
If a pod exits with an error because the workflow's activeDeadlineSeconds is exceeded, the log is saved.
However, because the output artifact does not appear in the node's entry in workflow.status.nodes, users are not able to view the log from the UI ("artifact not found").

Viewing the log works if a pod exits with an error because of a program error: the container log is saved and the outputs field is added to the node's entry in workflow.status.nodes.
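
For reference, the UI resolves the main container log through the node's outputs. Below is a minimal sketch of the outputs entry that is missing in this case, using the v1alpha1 Go types as I recall them (the type and field names are assumptions for illustration; the artifact name and key are taken from the controller and wait-container logs in this issue):

package main

import (
	"encoding/json"
	"fmt"

	wfv1 "github.com/argoproj/argo-workflows/v3/pkg/apis/workflow/v1alpha1"
)

func main() {
	// Sketch of the outputs entry the controller would normally record on the
	// node (values taken from the wait-container log below). When the pod is
	// killed by activeDeadlineSeconds, workflow.status.nodes[...].outputs stays
	// empty, so the UI cannot resolve "main-logs" and reports "artifact not found".
	outputs := wfv1.Outputs{
		Artifacts: []wfv1.Artifact{{
			Name: "main-logs",
			ArtifactLocation: wfv1.ArtifactLocation{
				GCS: &wfv1.GCSArtifact{
					Key: "pipeline/log/omniscient-bear2/omniscient-bear2/main.log",
				},
			},
		}},
	}
	b, _ := json.Marshal(outputs)
	fmt.Println(string(b))
}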

What version of Argo Workflows are you running?

~v3.2.1

Diagnostics

Workflow

metadata:
  name: omniscient-bear2
  namespace: default
  labels:
    example: 'true'
spec:
  activeDeadlineSeconds: 100
  arguments:
    parameters:
      - name: message
        value: hello argo
  entrypoint: argosay
  templates:
    - name: argosay
      inputs:
        parameters:
          - name: message
            value: '{{workflow.parameters.message}}'
      container:
        name: main
        image: 'argoproj/argosay:v2'
        command:
          - bash
          - -c
        args:
          - 'echo 123; sleep 10000'

Wait container log

time="2021-10-27T10:46:01.788Z" level=info msg="Starting Workflow Executor" executorType=emissary version=latest+bd4e68a.dirty
time="2021-10-27T10:46:01.794Z" level=info msg="Creating a emissary executor"
time="2021-10-27T10:46:01.794Z" level=info msg="Executor initialized" deadline="2021-10-27 10:47:39 +0000 UTC" includeScriptOutput=false namespace=default podName=omniscient-bear2 template="{\"n
ame\":\"argosay\",\"inputs\":{\"parameters\":[{\"name\":\"message\",\"value\":\"hello argo\"}]},\"outputs\":{},\"metadata\":{},\"container\":{\"name\":\"main\",\"image\":\"argoproj/argosay:v2\",
\"command\":[\"bash\",\"-c\"],\"args\":[\"sleep 10000\"],\"resources\":{}},\"archiveLocation\":{\"archiveLogs\":true,\"gcs\":{\"bucket\":\"wx-57a8a73b-9e03-4f1d-9349-bf67bf6d9c2c\",\"serviceAcco
untKeySecret\":{\"name\":\"service-ac-key\",\"key\":\"service-ac-key.json\"},\"key\":\"pipeline/log/omniscient-bear2/omniscient-bear2\"}}}" version="&Version{Version:latest+bd4e68a.dirty,BuildDa
te:2021-10-24T23:52:23Z,GitCommit:bd4e68a5e2013987ceadc089269f7d0cf61f2cb6,GitTag:untagged,GitTreeState:dirty,GoVersion:go1.16.9,Compiler:gc,Platform:linux/amd64,}"
time="2021-10-27T10:46:01.794Z" level=info msg="Starting deadline monitor"
time="2021-10-27T10:47:39.000Z" level=info msg="Step exceeded its deadline"
time="2021-10-27T10:47:39.035Z" level=info msg="Killing containers"
time="2021-10-27T10:47:39.047Z" level=info msg="Get pods 200"
time="2021-10-27T10:47:39.827Z" level=info msg="Main container completed"
time="2021-10-27T10:47:39.827Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2021-10-27T10:47:39.827Z" level=info msg="Saving logs"
time="2021-10-27T10:47:39.828Z" level=info msg="GCS Save path: /tmp/argo/outputs/logs/main.log, key: pipeline/log/omniscient-bear2/omniscient-bear2/main.log"
time="2021-10-27T10:47:40.081Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/logs/main.log
time="2021-10-27T10:47:40.081Z" level=info msg="Successfully saved file: /tmp/argo/outputs/logs/main.log"
time="2021-10-27T10:47:40.081Z" level=info msg="No output parameters"
time="2021-10-27T10:47:40.081Z" level=info msg="No output artifacts"
time="2021-10-27T10:47:40.081Z" level=info msg="Annotating pod with output"
time="2021-10-27T10:47:40.096Z" level=info msg="Patch pods 200"
time="2021-10-27T10:47:40.097Z" level=info msg="Killing sidecars []"
time="2021-10-27T10:47:40.097Z" level=info msg="Alloc=26922 TotalAlloc=33488 Sys=74321 NumGC=5 Goroutines=8"

Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

@sarabala1979
Member

@tczhao Can you try v3.2.3, which has the pod name fix?

@tczhao
Member Author

tczhao commented Oct 29, 2021

It doesn't work with the current master; I don't think it's related to the pod name.
I will put up my investigation later.

@tczhao
Member Author

tczhao commented Nov 16, 2021

image

I hope the above diagram is detailed enough to explain what happened.

Because of network delays and queueing, the scenario that commonly happens with emissary can also happen with pns/k8sapi.

@alexec
Contributor

alexec commented Nov 22, 2021

@tczhao we struggled to understand the problem, so let me play it back:

When a pod exceeds its deadline, the main.log is not saved in the artefact repository.

Why does this happen please?

@tczhao
Member Author

tczhao commented Nov 25, 2021

Hi @alexec
Sorry, the diagram was not quite right.
Here's what's happening:
image

  • in pns (with the "k8sapi/pns executor should be able to save log when deadlineExceeded" #7092 fix), the executor runs faster: at the deadlineExceeded event, it annotates the pod before the controller starts assessNodeStatus
  • in emissary, the controller starts assessNodeStatus before the executor annotates the pod with the log output
  • because the pod status is Failed, the controller then goes on to delete the pod, so the informer won't receive further pod update events (e.g. after the executor finishes the process)

The proposed solution in the PR (for the emissary part) is to:

  • in case of deadlineExceeded, ignore the pod update event (when the pod first enters the deadlineExceeded state) and process a later pod update event instead (after the wait container finishes); see the sketch below
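
To make that concrete, here is a minimal sketch of the idea (this is not the actual #7093 change, and the function names are hypothetical): a filter in the controller's pod informer update handler that skips the deadline-exceeded event until the wait container has terminated, i.e. until the executor has saved the log and annotated the pod, so that assessNodeStatus sees the log artifact instead of empty outputs.

package controller

import (
	corev1 "k8s.io/api/core/v1"
)

// waitContainerFinished reports whether the "wait" container has terminated,
// i.e. the executor has had a chance to save main.log and annotate the pod.
func waitContainerFinished(pod *corev1.Pod) bool {
	for _, cs := range pod.Status.ContainerStatuses {
		if cs.Name == "wait" {
			return cs.State.Terminated != nil
		}
	}
	return false
}

// shouldProcessPodUpdate is a hypothetical informer-update filter: when the pod
// failed because it exceeded activeDeadlineSeconds (kubelet sets reason
// "DeadlineExceeded"), defer processing until the wait container has finished,
// so the controller's assessNodeStatus picks up the outputs annotation.
func shouldProcessPodUpdate(pod *corev1.Pod) bool {
	deadlineExceeded := pod.Status.Phase == corev1.PodFailed && pod.Status.Reason == "DeadlineExceeded"
	if deadlineExceeded && !waitContainerFinished(pod) {
		// Ignore this event; a later update (sent once the wait container
		// terminates) will carry the log artifact in the pod annotations.
		return false
	}
	return true
}

The intent is that the controller only assesses the failed pod after the executor's annotation is present, matching the pns behaviour shown in the second log block below.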

logs after deadlineExceeded

Emissary
       logs | omniscient-bear5 wait time="2021-11-25T00:46:03.853Z" level=info msg="Step terminated"
       logs | omniscient-bear5 wait time="2021-11-25T00:46:03.853Z" level=info msg="Killing containers"
argo-server | time="2021-11-25T00:46:03.856Z" level=debug msg="Log line" content="Error: signal: terminated" namespace=argo podName=omniscient-bear5 timestamp="2021-11-25 00:46:03.855864043 +0000 UTC" workflow=omniscient-bear5
       logs | omniscient-bear5 main Error: signal: terminated
 controller | time="2021-11-25T00:46:03.867Z" level=info msg=---------------Informer-Update
argo-server | time="2021-11-25T00:46:03.868Z" level=debug msg="Pod event" eventType=MODIFIED namespace=argo phase=Failed podName=omniscient-bear5 workflow=omniscient-bear5
argo-server | time="2021-11-25T00:46:03.868Z" level=debug msg="Ensuring pod logs stream" alreadyStreaming=true namespace=argo podName=omniscient-bear5 podPhase=Failed workflow=omniscient-bear5
       logs | omniscient-bear5 wait time="2021-11-25T00:46:03.883Z" level=info msg="Get pods 200"
 controller | time="2021-11-25T00:46:03.968Z" level=info msg="Processing workflow" namespace=argo workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:03.969Z" level=info msg=---------------assessNodeStatus
 controller | time="2021-11-25T00:46:03.969Z" level=info msg=Failed
 controller | time="2021-11-25T00:46:03.969Z" level=info msg="Pod failed: Pod was active on the node longer than the specified deadline" displayName=omniscient-bear5 namespace=argo pod=omniscient-bear5 templateName=argosay workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:03.969Z" level=info msg="Setting node omniscient-bear5 outputs: " namespace=argo workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:03.969Z" level=info msg="Updating node omniscient-bear5 status Error -> Failed" namespace=argo workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:03.969Z" level=info msg="Updating node omniscient-bear5 message: Pod was active on the node longer than the specified deadline" namespace=argo workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:03.969Z" level=info msg="pod progress" namespace=argo progress=1/1 workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:03.969Z" level=debug msg="Evaluating node omniscient-bear5: template: *v1alpha1.WorkflowStep (argosay), boundaryID: " namespace=argo workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:03.969Z" level=debug msg="Resolving the template" base="*v1alpha1.Workflow (namespace=argo,name=omniscient-bear5)" depth=0 tmpl="*v1alpha1.WorkflowStep (argosay)"
 controller | time="2021-11-25T00:46:03.969Z" level=debug msg="Getting the template" base="*v1alpha1.Workflow (namespace=argo,name=omniscient-bear5)" depth=0 tmpl="*v1alpha1.WorkflowStep (argosay)"
 controller | time="2021-11-25T00:46:03.969Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo,name=omniscient-bear5)" depth=0 tmpl="*v1alpha1.WorkflowStep (argosay)"
 controller | time="2021-11-25T00:46:03.969Z" level=debug msg="Node omniscient-bear5 already completed" namespace=argo workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:03.969Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:03.969Z" level=info msg=reconcileAgentPod namespace=argo workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:03.969Z" level=info msg="Updated phase Running -> Failed" namespace=argo workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:03.969Z" level=info msg="Updated message  -> Pod was active on the node longer than the specified deadline" namespace=argo workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:03.969Z" level=info msg="Marking workflow completed" namespace=argo workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:03.969Z" level=info msg="Marking workflow as pending archiving" namespace=argo workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:03.969Z" level=info msg="Checking daemoned children of " namespace=argo workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:03.969Z" level=debug msg="Event(v1.ObjectReference{Kind:\"Workflow\", Namespace:\"argo\", Name:\"omniscient-bear5\", UID:\"8c5eca61-96bf-4db8-9a78-c6b70bcead9c\", APIVersion:\"argoproj.io/v1alpha1\", ResourceVersion:\"2933\", FieldPath:\"\"}): type: 'Warning' reason: 'WorkflowFailed' Pod was active on the node longer than the specified deadline"
 controller | time="2021-11-25T00:46:03.969Z" level=debug msg="Log changes patch: {\"metadata\":{\"labels\":{\"workflows.argoproj.io/completed\":\"true\",\"workflows.argoproj.io/phase\":\"Failed\",\"workflows.argoproj.io/workflow-archiving-status\":\"Pending\"}},\"status\":{\"conditions\":[{\"status\":\"False\",\"type\":\"PodRunning\"},{\"status\":\"True\",\"type\":\"Completed\"}],\"finishedAt\":\"2021-11-25T00:46:03Z\",\"message\":\"Pod was active on the node longer than the specified deadline\",\"nodes\":{\"omniscient-bear5\":{\"finishedAt\":\"2021-11-25T00:46:03Z\",\"message\":\"Pod was active on the node longer than the specified deadline\",\"outputs\":{},\"phase\":\"Failed\",\"progress\":\"1/1\",\"resourcesDuration\":{\"cpu\":0,\"memory\":0}}},\"phase\":\"Failed\"}}"
 controller | time="2021-11-25T00:46:03.973Z" level=info msg="Create events 201"
 controller | time="2021-11-25T00:46:03.975Z" level=info msg="Update workflows 200"
 controller | time="2021-11-25T00:46:03.975Z" level=info msg="Workflow update successful" namespace=argo phase=Failed resourceVersion=2950 workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:03.975Z" level=debug msg="Event(v1.ObjectReference{Kind:\"Workflow\", Namespace:\"argo\", Name:\"omniscient-bear5\", UID:\"8c5eca61-96bf-4db8-9a78-c6b70bcead9c\", APIVersion:\"argoproj.io/v1alpha1\", ResourceVersion:\"2950\", FieldPath:\"\"}): type: 'Warning' reason: 'WorkflowNodeFailed' Failed node omniscient-bear5: Pod was active on the node longer than the specified deadline"
argo-server | time="2021-11-25T00:46:03.976Z" level=debug msg="Workflow event" completed=true eventType=MODIFIED namespace=argo workflow=omniscient-bear5
argo-server | time="2021-11-25T00:46:03.976Z" level=debug msg="Done watching workflow events" namespace=argo workflow=omniscient-bear5
argo-server | time="2021-11-25T00:46:03.976Z" level=debug msg="Done watching pod events" namespace=argo workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:03.976Z" level=info msg="archiving workflow" namespace=argo uid=8c5eca61-96bf-4db8-9a78-c6b70bcead9c workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:03.976Z" level=debug msg="Archiving workflow" labels="map[example:true workflows.argoproj.io/completed:true workflows.argoproj.io/phase:Failed workflows.argoproj.io/workflow-archiving-status:Pending]" uid=8c5eca61-96bf-4db8-9a78-c6b70bcead9c
argo-server | time="2021-11-25T00:46:03.976Z" level=debug msg="Received workflow event"
argo-server | time="2021-11-25T00:46:03.976Z" level=debug msg="Sending workflow event" phase=Failed type=MODIFIED workflow=omniscient-bear5
argo-server | time="2021-11-25T00:46:03.976Z" level=debug msg="Received workflow event"
argo-server | time="2021-11-25T00:46:03.976Z" level=debug msg="Sending workflow event" phase=Failed type=MODIFIED workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:03.978Z" level=info msg="Create events 201"
 controller | time="2021-11-25T00:46:03.980Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/omniscient-bear5/labelPodCompleted
 controller | time="2021-11-25T00:46:03.987Z" level=info msg="Patch pods 200"
 controller | time="2021-11-25T00:46:03.989Z" level=info msg=---------------Informer-Delete
 controller | time="2021-11-25T00:46:04.007Z" level=info msg="Patch workflows 200"
 controller | time="2021-11-25T00:46:04.008Z" level=info msg="archiving workflow" namespace=argo uid=8c5eca61-96bf-4db8-9a78-c6b70bcead9c workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:04.008Z" level=debug msg="Archiving workflow" labels="map[example:true workflows.argoproj.io/completed:true workflows.argoproj.io/phase:Failed workflows.argoproj.io/workflow-archiving-status:Pending]" uid=8c5eca61-96bf-4db8-9a78-c6b70bcead9c
argo-server | time="2021-11-25T00:46:04.008Z" level=debug msg="Received workflow event"
argo-server | time="2021-11-25T00:46:04.008Z" level=debug msg="Sending workflow event" phase=Failed type=MODIFIED workflow=omniscient-bear5
argo-server | time="2021-11-25T00:46:04.008Z" level=debug msg="Received workflow event"
 controller | time="2021-11-25T00:46:04.008Z" level=info msg="Queueing Failed workflow argo/omniscient-bear5 for delete in 9m59s"
 controller | time="2021-11-25T00:46:04.008Z" level=info msg="Queueing Failed workflow argo/omniscient-bear5 for delete in 9m59s"
argo-server | time="2021-11-25T00:46:04.009Z" level=debug msg="Sending workflow event" phase=Failed type=MODIFIED workflow=omniscient-bear5
 controller | time="2021-11-25T00:46:04.037Z" level=info msg="Patch workflows 200"
argo-server | time="2021-11-25T00:46:04.496Z" level=debug msg="Sending entry" content="Error: signal: terminated" namespace=argo timestamp="2021-11-25 00:46:03.855864043 +0000 UTC" workflow=omniscient-bear5
       logs | omniscient-bear5 wait time="2021-11-25T00:46:04.758Z" level=info msg="Main container completed"
       logs | omniscient-bear5 wait time="2021-11-25T00:46:04.758Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
       logs | omniscient-bear5 wait time="2021-11-25T00:46:04.758Z" level=info msg="Saving logs"
       logs | omniscient-bear5 wait time="2021-11-25T00:46:04.759Z" level=info msg="S3 Save path: /tmp/argo/outputs/logs/main.log, key: omniscient-bear5/omniscient-bear5/main.log"
       logs | omniscient-bear5 wait time="2021-11-25T00:46:04.759Z" level=info msg="Creating minio client using static credentials" endpoint="minio:9000"
       logs | omniscient-bear5 wait time="2021-11-25T00:46:04.759Z" level=info msg="Saving file to s3" bucket=my-bucket endpoint="minio:9000" key=omniscient-bear5/omniscient-bear5/main.log path=/tmp/argo/outputs/logs/main.log
       logs | omniscient-bear5 wait time="2021-11-25T00:46:04.800Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/logs/main.log
       logs | omniscient-bear5 wait time="2021-11-25T00:46:04.800Z" level=info msg="Successfully saved file: /tmp/argo/outputs/logs/main.log"
       logs | omniscient-bear5 wait time="2021-11-25T00:46:04.800Z" level=info msg="No output parameters"
       logs | omniscient-bear5 wait time="2021-11-25T00:46:04.800Z" level=info msg="No output artifacts"
       logs | omniscient-bear5 wait time="2021-11-25T00:46:04.800Z" level=info msg="Annotating pod with output"
       logs | omniscient-bear5 wait time="2021-11-25T00:46:04.807Z" level=info msg=make: *** [Makefile:443: start] Error 130
PNS with #7092 fix
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.853Z" level=info msg="Step terminated"
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.854Z" level=info msg="Killing containers"
argo-server | time="2021-11-25T00:55:07.862Z" level=debug msg="Pod event" eventType=MODIFIED namespace=argo phase=Failed podName=omniscient-bear2 workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.862Z" level=debug msg="Ensuring pod logs stream" alreadyStreaming=true namespace=argo podName=omniscient-bear2 podPhase=Failed workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.862Z" level=info msg=---------------Informer-Update
argo-server | time="2021-11-25T00:55:07.862Z" level=debug msg="Pod event" eventType=MODIFIED namespace=argo phase=Failed podName=omniscient-bear2 workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.862Z" level=debug msg="Ensuring pod logs stream" alreadyStreaming=true namespace=argo podName=omniscient-bear2 podPhase=Failed workflow=omniscient-bear2
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.862Z" level=info msg="Main container completed"
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.862Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.862Z" level=info msg="Saving logs"
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.863Z" level=info msg="Getting output of main"
argo-server | time="2021-11-25T00:55:07.863Z" level=debug msg="Log line" content="time=\"2021-11-25T00:55:07.862Z\" level=info msg=\"Main container completed\"" namespace=argo podName=omniscient-bear2 timestamp="2021-11-25 00:55:07.862854516 +0000 UTC" workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.863Z" level=debug msg="Log line" content="time=\"2021-11-25T00:55:07.862Z\" level=info msg=\"No Script output reference in workflow. Capturing script output ignored\"" namespace=argo podName=omniscient-bear2 timestamp="2021-11-25 00:55:07.862889132 +0000 UTC" workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.863Z" level=debug msg="Log line" content="time=\"2021-11-25T00:55:07.862Z\" level=info msg=\"Saving logs\"" namespace=argo podName=omniscient-bear2 timestamp="2021-11-25 00:55:07.862897521 +0000 UTC" workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.863Z" level=debug msg="Log line" content="time=\"2021-11-25T00:55:07.863Z\" level=info msg=\"Getting output of main\"" namespace=argo podName=omniscient-bear2 timestamp="2021-11-25 00:55:07.863331624 +0000 UTC" workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.871Z" level=debug msg="Log line" content="time=\"2021-11-25T00:55:07.871Z\" level=info msg=\"List pods/log 200\"" namespace=argo podName=omniscient-bear2 timestamp="2021-11-25 00:55:07.871495568 +0000 UTC" workflow=omniscient-bear2
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.871Z" level=info msg="List pods/log 200"
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.872Z" level=info msg="S3 Save path: /tmp/argo/outputs/logs/main.log, key: omniscient-bear2/omniscient-bear2/main.log"
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.872Z" level=info msg="Creating minio client using static credentials" endpoint="minio:9000"
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.872Z" level=info msg="Saving file to s3" bucket=my-bucket endpoint="minio:9000" key=omniscient-bear2/omniscient-bear2/main.log path=/tmp/argo/outputs/logs/main.log
argo-server | time="2021-11-25T00:55:07.872Z" level=debug msg="Log line" content="time=\"2021-11-25T00:55:07.872Z\" level=info msg=\"S3 Save path: /tmp/argo/outputs/logs/main.log, key: omniscient-bear2/omniscient-bear2/main.log\"" namespace=argo podName=omniscient-bear2 timestamp="2021-11-25 00:55:07.872080418 +0000 UTC" workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.872Z" level=debug msg="Log line" content="time=\"2021-11-25T00:55:07.872Z\" level=info msg=\"Creating minio client using static credentials\" endpoint=\"minio:9000\"" namespace=argo podName=omniscient-bear2 timestamp="2021-11-25 00:55:07.872152637 +0000 UTC" workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.872Z" level=debug msg="Log line" content="time=\"2021-11-25T00:55:07.872Z\" level=info msg=\"Saving file to s3\" bucket=my-bucket endpoint=\"minio:9000\" key=omniscient-bear2/omniscient-bear2/main.log path=/tmp/argo/outputs/logs/main.log" namespace=argo podName=omniscient-bear2 timestamp="2021-11-25 00:55:07.872210244 +0000 UTC" workflow=omniscient-bear2
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.875Z" level=info msg="Get pods 200"
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.876Z" level=info msg="Sending SIGTERM to pid 26"
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.876Z" level=warning msg="Failed to SIGTERM pid 26: os: process already finished"
argo-server | time="2021-11-25T00:55:07.878Z" level=debug msg="Log line" content="time=\"2021-11-25T00:55:07.875Z\" level=info msg=\"Get pods 200\"" namespace=argo podName=omniscient-bear2 timestamp="2021-11-25 00:55:07.875645261 +0000 UTC" workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.878Z" level=debug msg="Log line" content="time=\"2021-11-25T00:55:07.876Z\" level=info msg=\"Sending SIGTERM to pid 26\"" namespace=argo podName=omniscient-bear2 timestamp="2021-11-25 00:55:07.876371064 +0000 UTC" workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.878Z" level=debug msg="Log line" content="time=\"2021-11-25T00:55:07.876Z\" level=warning msg=\"Failed to SIGTERM pid 26: os: process already finished\"" namespace=argo podName=omniscient-bear2 timestamp="2021-11-25 00:55:07.876392213 +0000 UTC" workflow=omniscient-bear2
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.911Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/logs/main.log
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.911Z" level=info msg="Successfully saved file: /tmp/argo/outputs/logs/main.log"
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.911Z" level=info msg="No output parameters"
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.911Z" level=info msg="No output artifacts"
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.911Z" level=info msg="Annotating pod with output"
argo-server | time="2021-11-25T00:55:07.912Z" level=debug msg="Log line" content="time=\"2021-11-25T00:55:07.911Z\" level=info msg=\"not deleting local artifact\" localArtPath=/tmp/argo/outputs/logs/main.log" namespace=argo podName=omniscient-bear2 timestamp="2021-11-25 00:55:07.911598088 +0000 UTC" workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.912Z" level=debug msg="Log line" content="time=\"2021-11-25T00:55:07.911Z\" level=info msg=\"Successfully saved file: /tmp/argo/outputs/logs/main.log\"" namespace=argo podName=omniscient-bear2 timestamp="2021-11-25 00:55:07.911642703 +0000 UTC" workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.912Z" level=debug msg="Log line" content="time=\"2021-11-25T00:55:07.911Z\" level=info msg=\"No output parameters\"" namespace=argo podName=omniscient-bear2 timestamp="2021-11-25 00:55:07.911649827 +0000 UTC" workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.912Z" level=debug msg="Log line" content="time=\"2021-11-25T00:55:07.911Z\" level=info msg=\"No output artifacts\"" namespace=argo podName=omniscient-bear2 timestamp="2021-11-25 00:55:07.911653686 +0000 UTC" workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.912Z" level=debug msg="Log line" content="time=\"2021-11-25T00:55:07.911Z\" level=info msg=\"Annotating pod with output\"" namespace=argo podName=omniscient-bear2 timestamp="2021-11-25 00:55:07.911657055 +0000 UTC" workflow=omniscient-bear2
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.917Z" level=info msg="Patch pods 200"
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.918Z" level=info msg="Killing sidecars []"
       logs | omniscient-bear2 wait time="2021-11-25T00:55:07.918Z" level=info msg="Alloc=5905 TotalAlloc=11763 Sys=19154 NumGC=4 Goroutines=14"
argo-server | time="2021-11-25T00:55:07.919Z" level=debug msg="Log line" content="time=\"2021-11-25T00:55:07.917Z\" level=info msg=\"Patch pods 200\"" namespace=argo podName=omniscient-bear2 timestamp="2021-11-25 00:55:07.917520244 +0000 UTC" workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.919Z" level=debug msg="Log line" content="time=\"2021-11-25T00:55:07.918Z\" level=info msg=\"Killing sidecars []\"" namespace=argo podName=omniscient-bear2 timestamp="2021-11-25 00:55:07.918217021 +0000 UTC" workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.920Z" level=debug msg="Pod event" eventType=MODIFIED namespace=argo phase=Failed podName=omniscient-bear2 workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.920Z" level=debug msg="Pod event" eventType=MODIFIED namespace=argo phase=Failed podName=omniscient-bear2 workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.920Z" level=debug msg="Ensuring pod logs stream" alreadyStreaming=true namespace=argo podName=omniscient-bear2 podPhase=Failed workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.920Z" level=debug msg="Ensuring pod logs stream" alreadyStreaming=true namespace=argo podName=omniscient-bear2 podPhase=Failed workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.920Z" level=info msg=---------------Informer-Update
argo-server | time="2021-11-25T00:55:07.920Z" level=debug msg="Log line" content="time=\"2021-11-25T00:55:07.918Z\" level=info msg=\"Alloc=5905 TotalAlloc=11763 Sys=19154 NumGC=4 Goroutines=14\"" namespace=argo podName=omniscient-bear2 timestamp="2021-11-25 00:55:07.918405909 +0000 UTC" workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.962Z" level=info msg="Processing workflow" namespace=argo workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.963Z" level=info msg=---------------assessNodeStatus
 controller | time="2021-11-25T00:55:07.963Z" level=info msg=Failed
 controller | time="2021-11-25T00:55:07.963Z" level=info msg="Pod failed: Pod was active on the node longer than the specified deadline" displayName=omniscient-bear2 namespace=argo pod=omniscient-bear2 templateName=argosay workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.963Z" level=info msg="Setting node omniscient-bear2 outputs: {\"artifacts\":[{\"name\":\"main-logs\",\"s3\":{\"key\":\"omniscient-bear2/omniscient-bear2/main.log\"}}]}" namespace=argo workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.963Z" level=info msg="Updating node omniscient-bear2 status Running -> Failed" namespace=argo workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.963Z" level=info msg="Updating node omniscient-bear2 message: Pod was active on the node longer than the specified deadline" namespace=argo workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.963Z" level=info msg="pod progress" namespace=argo progress=1/1 workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.963Z" level=debug msg="Evaluating node omniscient-bear2: template: *v1alpha1.WorkflowStep (argosay), boundaryID: " namespace=argo workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.963Z" level=debug msg="Resolving the template" base="*v1alpha1.Workflow (namespace=argo,name=omniscient-bear2)" depth=0 tmpl="*v1alpha1.WorkflowStep (argosay)"
 controller | time="2021-11-25T00:55:07.963Z" level=debug msg="Getting the template" base="*v1alpha1.Workflow (namespace=argo,name=omniscient-bear2)" depth=0 tmpl="*v1alpha1.WorkflowStep (argosay)"
 controller | time="2021-11-25T00:55:07.963Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo,name=omniscient-bear2)" depth=0 tmpl="*v1alpha1.WorkflowStep (argosay)"
 controller | time="2021-11-25T00:55:07.963Z" level=debug msg="Node omniscient-bear2 already completed" namespace=argo workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.963Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.963Z" level=info msg=reconcileAgentPod namespace=argo workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.963Z" level=info msg="Updated phase Running -> Failed" namespace=argo workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.963Z" level=info msg="Updated message  -> Pod was active on the node longer than the specified deadline" namespace=argo workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.963Z" level=info msg="Marking workflow completed" namespace=argo workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.963Z" level=info msg="Marking workflow as pending archiving" namespace=argo workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.963Z" level=info msg="Checking daemoned children of " namespace=argo workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.963Z" level=debug msg="Event(v1.ObjectReference{Kind:\"Workflow\", Namespace:\"argo\", Name:\"omniscient-bear2\", UID:\"ba310220-ee46-4249-ae64-a334259f581d\", APIVersion:\"argoproj.io/v1alpha1\", ResourceVersion:\"3383\", FieldPath:\"\"}): type: 'Warning' reason: 'WorkflowFailed' Pod was active on the node longer than the specified deadline"
 controller | time="2021-11-25T00:55:07.964Z" level=debug msg="Log changes patch: {\"metadata\":{\"labels\":{\"workflows.argoproj.io/completed\":\"true\",\"workflows.argoproj.io/phase\":\"Failed\",\"workflows.argoproj.io/workflow-archiving-status\":\"Pending\"}},\"status\":{\"conditions\":[{\"status\":\"False\",\"type\":\"PodRunning\"},{\"status\":\"True\",\"type\":\"Completed\"}],\"finishedAt\":\"2021-11-25T00:55:07Z\",\"message\":\"Pod was active on the node longer than the specified deadline\",\"nodes\":{\"omniscient-bear2\":{\"finishedAt\":\"2021-11-25T00:55:07Z\",\"message\":\"Pod was active on the node longer than the specified deadline\",\"outputs\":{\"artifacts\":[{\"name\":\"main-logs\",\"s3\":{\"key\":\"omniscient-bear2/omniscient-bear2/main.log\"}}]},\"phase\":\"Failed\",\"progress\":\"1/1\",\"resourcesDuration\":{\"cpu\":0,\"memory\":0}}},\"phase\":\"Failed\"}}"
 controller | time="2021-11-25T00:55:07.967Z" level=info msg="Create events 201"
 controller | time="2021-11-25T00:55:07.969Z" level=info msg="Update workflows 200"
 controller | time="2021-11-25T00:55:07.970Z" level=info msg="Workflow update successful" namespace=argo phase=Failed resourceVersion=3402 workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.970Z" level=debug msg="Event(v1.ObjectReference{Kind:\"Workflow\", Namespace:\"argo\", Name:\"omniscient-bear2\", UID:\"ba310220-ee46-4249-ae64-a334259f581d\", APIVersion:\"argoproj.io/v1alpha1\", ResourceVersion:\"3402\", FieldPath:\"\"}): type: 'Warning' reason: 'WorkflowNodeFailed' Failed node omniscient-bear2: Pod was active on the node longer than the specified deadline"
argo-server | time="2021-11-25T00:55:07.970Z" level=debug msg="Received workflow event"
argo-server | time="2021-11-25T00:55:07.970Z" level=debug msg="Sending workflow event" phase=Failed type=MODIFIED workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.971Z" level=debug msg="Received workflow event"
argo-server | time="2021-11-25T00:55:07.971Z" level=debug msg="Workflow event" completed=true eventType=MODIFIED namespace=argo workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.971Z" level=debug msg="Done watching workflow events" namespace=argo workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.971Z" level=debug msg="Done watching pod events" namespace=argo workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.971Z" level=debug msg="Workflow event" completed=true eventType=MODIFIED namespace=argo workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.971Z" level=debug msg="Done watching workflow events" namespace=argo workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.971Z" level=debug msg="Done watching pod events" namespace=argo workflow=omniscient-bear2
argo-server | time="2021-11-25T00:55:07.971Z" level=debug msg="Sending workflow event" phase=Failed type=MODIFIED workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.972Z" level=info msg="archiving workflow" namespace=argo uid=ba310220-ee46-4249-ae64-a334259f581d workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:07.972Z" level=debug msg="Archiving workflow" labels="map[example:true workflows.argoproj.io/completed:true workflows.argoproj.io/container-runtime-executor:pns workflows.argoproj.io/phase:Failed workflows.argoproj.io/workflow-archiving-status:Pending]" uid=ba310220-ee46-4249-ae64-a334259f581d
 controller | time="2021-11-25T00:55:07.973Z" level=info msg="Create events 201"
 controller | time="2021-11-25T00:55:07.977Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/omniscient-bear2/labelPodCompleted
 controller | time="2021-11-25T00:55:07.983Z" level=info msg="Patch pods 200"
 controller | time="2021-11-25T00:55:07.985Z" level=info msg=---------------Informer-Delete
 controller | time="2021-11-25T00:55:08.009Z" level=info msg="Patch workflows 200"
 controller | time="2021-11-25T00:55:08.010Z" level=info msg="archiving workflow" namespace=argo uid=ba310220-ee46-4249-ae64-a334259f581d workflow=omniscient-bear2
 controller | time="2021-11-25T00:55:08.010Z" level=debug msg="Archiving workflow" labels="map[example:true workflows.argoproj.io/completed:true workflows.argoproj.io/container-runtime-executor:pns workflows.argoproj.io/phase:Failed workflows.argoproj.io/workflow-archiving-status:Pending]" uid=ba310220-ee46-4249-ae64-a334259f581d

@no-response

no-response bot commented Dec 6, 2021

This issue has been automatically closed because there has been no response to our request for more information from the original author. With only the information that is currently in the issue, we don't have enough information to take action. Please reach out if you have or find the answers we need so that we can investigate further.

no-response bot closed this as completed Dec 6, 2021
alexec pushed a commit that referenced this issue Mar 2, 2022
#7081 (#7093)

Signed-off-by: Tianchu Zhao <evantczhao@gmail.com>