
logSnippet error reading termination message from logs #15878

Closed
andrewklau opened this issue Aug 21, 2017 · 11 comments · Fixed by #16912

Comments

@andrewklau
Contributor

None of my failed builds ever seem to get this new logSnippet; it always returns:

  logSnippet: >-
    Error on reading termination message from logs: failed to
    ...209-06579dce22e7/sti-build_0.log: no such file or directory
Version

3.6
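
For reference, the snippet above is the build's logSnippet field; assuming it is surfaced under status as in later OpenShift releases, it can be pulled directly with something like (build name is a placeholder):

$ oc get build <build-name> -o jsonpath='{.status.logSnippet}'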

@bparees
Contributor

bparees commented Aug 22, 2017

Can you share one of your build JSON objects?
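
For example (build name is a placeholder):

$ oc get build <build-name> -o json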

@bparees
Contributor

bparees commented Aug 22, 2017

I tested this with "oc cluster up --version=v3.6.0" and created a build with a non-existent repository name, and it seems to work OK:

$ oc describe build cakephp-ex-3
Name:		cakephp-ex-3
Namespace:	myproject
Created:	38 seconds ago
Labels:		app=cakephp-ex
		buildconfig=cakephp-ex
		openshift.io/build-config.name=cakephp-ex
		openshift.io/build.start-policy=Serial
Annotations:	build.openshift.io/accepted=8d780e50-da5e-4fb5-a956-b83d1be22fd5
		openshift.io/build-config.name=cakephp-ex
		openshift.io/build.number=3
		openshift.io/build.pod-name=cakephp-ex-3-build

Status:		Failed (Failed to fetch the input source.)
Started:	Tue, 22 Aug 2017 18:17:53 EDT
Duration:	5s

Build Config:	cakephp-ex
Build Pod:	cakephp-ex-3-build

Strategy:	Source
URL:		https://github.com/openshift/cakphp-ex
From Image:	DockerImage centos/php-70-centos7@sha256:b66daf9a1d08079d608055caefbf39775a85523b0990262424f8fa0b6ba4f2e1
Output to:	ImageStreamTag cakephp-ex:latest
Push Secret:	builder-dockercfg-r19lv

Build trigger cause:	Manually triggered

Log Tail:	Cloning "https://github.com/openshift/cakphp-ex" ...
		error: build error: failed to fetch requested repository "...//github.com/openshift/cakphp-ex" with provided credentials

So I'd like to know your exact cluster version and the full build YAML, so I can see why your build failed, in case it failed in a way that prevents us from getting a log snippet.
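
Both can be gathered with something like (build name is a placeholder):

$ oc version
$ oc get build <build-name> -o yaml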

@smarterclayton
Contributor

I've seen this happen as well; it could be a race condition between pod shutdown and container log cleanup. Or the container exits so fast that no logs are reported.

@bparees
Contributor

bparees commented Aug 23, 2017

> I've seen this happen as well; it could be a race condition between pod shutdown and container log cleanup.

If it's the former, it sounds like something that should be an upstream k8s issue, since it means the requested termination logs API isn't really being respected, no?

> Or the container exits so fast that no logs are reported.

@andrewklau Can you tell us if you're able to manually retrieve logs for the build or build pod when this happens?
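
For example (names are placeholders):

$ oc logs build/<build-name>
$ oc logs pod/<build-name>-build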

@andrewklau
Contributor Author

Yeah, I was able to get the logs of the failed build, just not from the snippet.

I'll get you the build YAML when I'm back.

@andrewklau
Contributor Author

I think this one failed due to OOM, but it's hard to tell. Increasing the memory limit did fix it, though (see the note after the output below).

Name:		app-9
Namespace:	app
Created:	7 hours ago
Labels:		app=app
		buildconfig=app
		openshift.io/build-config.name=app
		openshift.io/build.start-policy=Serial
Annotations:	openshift.io/build-config.name=app
		openshift.io/build.number=9
		openshift.io/build.pod-name=app-9-build

Status:			Failed (Assemble script failed.)
Started:		Wed, 23 Aug 2017 00:10:09 UTC
Duration:		1m42s
  FetchInputs:		  27s
  PullImages:		  0s
  RetrieveArtifacts:	  47s

Build Config:	app
Build Pod:	app-9-build

Strategy:		Source
URL:			x
Ref:			master
Source Secret:		x
Commit:			x
Author/Committer:	x
From Image:		DockerImage openshift/php-70-centos7@sha256:335cd5a4514a3e8efed4bb4883e5d8a0e2292976e6ed6f12217700647664ae7c
Incremental Build:	yes
Output to:		ImageStreamTag app:latest
Push Secret:		builder-dockercfg-9zq3z

Build trigger cause:	Manually triggered

Log Tail:	Error on reading termination message from logs: failed to ...209-06579dce22e7/sti-build_0.log: no such file or directory
Events:		<none>
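
On "increasing the memory limit": a minimal sketch of raising the build's memory limit on the BuildConfig (the name and value here are examples, not the actual settings used):

$ oc patch bc/app -p '{"spec":{"resources":{"limits":{"memory":"1Gi"}}}}'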

@bparees
Contributor

bparees commented Aug 23, 2017

> I think this one failed due to OOM, but it's hard to tell. Increasing the memory limit did fix it, though.

But you were able to get the build/pod logs for it?

Can you share the build pod JSON? In particular, I'm interested in pod.Status.ContainerStatuses[container].State.Terminated.Message.

That is the field we are retrieving. If it's not set correctly, then it seems like a k8s bug.
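
For example, using the pod name from the describe output above:

$ oc get pod app-9-build -o jsonpath='{.status.containerStatuses[0].state.terminated.message}'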

@andrewklau
Contributor Author

Yeah, the error is in the pod JSON too. I was able to get the logs at the time of the error; however, the logs are now gone since the host has already cleaned up the container.

    state:
      terminated:
        containerID: docker://247f65da3468df11b70d7ff68ab4c83e8efdb6ea0daefa31cc25baa58993f46f
        exitCode: 255
        finishedAt: 2017-08-23T00:10:52Z
        message: 'Error on reading termination message from logs: failed to open log
          file "/var/log/pods/48d39c32-8797-11e7-9209-06579dce22e7/sti-build_0.log":
          open /var/log/pods/48d39c32-8797-11e7-9209-06579dce22e7/sti-build_0.log:
          no such file or directory'
        reason: Error
        startedAt: 2017-08-23T00:09:16Z

@andrewklau
Contributor Author

On the node, /var/log/pods is empty, although the folders exist:

$ du -h /var/log/pods/
0	/var/log/pods/931b4cd9-7f5e-11e7-b321-06579dce22e7
0	/var/log/pods/932bb8e4-7f5e-11e7-b321-06579dce22e7
0	/var/log/pods/931c48c9-7f5e-11e7-b321-06579dce22e7
0	/var/log/pods/ed378a85-7f67-11e7-b321-06579dce22e7
0	/var/log/pods/9f9597c6-7f69-11e7-b321-06579dce22e7
0	/var/log/pods/9f92b24d-7f69-11e7-b321-06579dce22e7
0	/var/log/pods/9fe801af-7f69-11e7-b321-06579dce22e7
0	/var/log/pods/9fbe05bd-7f69-11e7-b321-06579dce22e7
0	/var/log/pods/a0060127-7f69-11e7-b321-06579dce22e7
0	/var/log/pods/9fc0670b-7f69-11e7-b321-06579dce22e7
0	/var/log/pods/c3dae249-7f6a-11e7-b321-06579dce22e7
0	/var/log/pods/b3c66488-813c-11e7-9209-06579dce22e7
0	/var/log/pods/b3e485c0-813c-11e7-9209-06579dce22e7
4.0K	/var/log/pods/
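
Empty /var/log/pods directories are consistent with docker running a non-json-file log driver such as journald (see the upstream fix below); a quick check on the node:

$ docker info --format '{{.LoggingDriver}}'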

@bparees
Contributor

bparees commented Aug 24, 2017

OK, given that this appears in the pod itself, this is a k8s issue with the termination log feature and needs to be opened as an upstream issue.

bparees closed this as completed Aug 24, 2017
@joelsmith
Contributor

Upstream issue filed here:
kubernetes/kubernetes#52502
Possible upstream fix posted here:
kubernetes/kubernetes#52503

openshift-merge-robot added a commit that referenced this issue Oct 20, 2017
Automatic merge from submit-queue.

UPSTREAM: 52503: Get fallback termination msg from docker when using journald log driver

xref kubernetes/kubernetes#52503

From the commit message:

> When using the legacy docker container runtime and when a container has
> terminationMessagePolicy=FallbackToLogsOnError and when docker is
> configured with a log driver other than json-log (such as journald),
> the kubelet should not try to get the container's log from the
> json log file (since it's not there) but should instead ask docker for
> the logs.

fixes #15878
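
As a rough illustration of the behavior change (using the path and container ID reported in this issue; the --tail count is arbitrary): before the fix, the kubelet only tried the json log file, which doesn't exist under journald; after the fix, it falls back to asking docker directly:

$ cat /var/log/pods/48d39c32-8797-11e7-9209-06579dce22e7/sti-build_0.log
cat: /var/log/pods/48d39c32-8797-11e7-9209-06579dce22e7/sti-build_0.log: No such file or directory
$ docker logs --tail 80 247f65da3468   # logs are still retrievable from docker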