Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server: Logs are not always output as json when configured to do so #11232

Closed
josegonzalez opened this issue Sep 28, 2021 · 6 comments · Fixed by #11291
Closed

server: Logs are not always output as json when configured to do so #11232

josegonzalez opened this issue Sep 28, 2021 · 6 comments · Fixed by #11291

Comments

@josegonzalez
Copy link
Contributor

Nomad version

Nomad v1.1.5 (117a23d2cdf26a1837b21c84f30c8c0f3441e927)

Operating system and Environment details

# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.6 LTS"

Issue

Setting log_json = true still results in certain log messages being sent in non-json format.

Reproduction steps

  1. Set log_json = true in config
  2. Use go-discover format for retry_join in config. I used "provider=mdns service=nomad v6=false"
  3. Deploy a job that uses consul-template

Expected Result

Json logging format for everything.

Actual Result

Some messages are not in json:

# I believe this is hashicorp/go-plugin
Sep 28 03:02:10 dc2 nomad[2335]: 2021-09-28T03:02:10.790Z [DEBUG] plugin.stdio: received EOF, stopping recv loop: err="rpc error: code = Unavailable desc = transport is closing"
Sep 28 03:02:11 dc2 nomad[2335]: 2021-09-28T03:02:11.153Z [DEBUG] plugin: reattached plugin process exited
Sep 28 03:02:11 dc2 nomad[2335]: 2021-09-28T03:02:11.154Z [DEBUG] plugin: plugin exited

# mdns discovery is in use, so hashicorp/go-discover?
Sep 28 03:01:59 dc2 nomad[2335]: 2021/09/28 03:01:59.447082 [INFO] mdns: Closing client {0xc0010a8550 0xc0010a8558 0xc0010a8560 <nil> 1 0xc000c81920}

# consul-template rendering
Sep 28 03:02:02 dc2 nomad[2335]: 2021/09/28 03:02:02.694079 [INFO] (runner) rendered "(dynamic)" => "/var/lib/nomad/alloc/48f6a15c-1a2f-798a-4557-7dd7a3500a78/caddy/local/Caddyfile"

Job file (if appropriate)

N/A

Nomad Server logs (if appropriate)

Startup logs:

Sep 28 03:01:52 dc2 nomad[2335]: 2021-09-28T03:01:52.146Z [TRACE] plugin.stdio: waiting for stdio data
Sep 28 03:01:52 dc2 nomad[2335]: 2021-09-28T03:01:52.162Z [TRACE] plugin.stdio: waiting for stdio data
Sep 28 03:01:52 dc2 nomad[2335]: 2021-09-28T03:01:52.182Z [TRACE] plugin.stdio: waiting for stdio data
Sep 28 03:01:52 dc2 nomad[2335]: 2021-09-28T03:01:52.265Z [TRACE] plugin.stdio: waiting for stdio data
Sep 28 03:01:54 dc2 nomad[2335]: 2021-09-28T03:01:54.390Z [TRACE] plugin.stdio: waiting for stdio data
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"Loaded configuration from /etc/nomad.d/containerd.hcl, /etc/nomad.d/nomad-client.hcl, /etc/nomad.d/nomad-server.hcl, /etc/nomad.d/nomad.hcl","@module":"agent","@timestamp":"2021-09-28T03:01:35.370393Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"Starting Nomad agent...","@module":"agent","@timestamp":"2021-09-28T03:01:35.371498Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"detected plugin","@module":"agent","@timestamp":"2021-09-28T03:01:35.888752Z","name":"nvidia-gpu","plugin_version":"0.1.0","type":"device"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"detected plugin","@module":"agent","@timestamp":"2021-09-28T03:01:35.888880Z","name":"raw_exec","plugin_version":"0.1.0","type":"driver"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"detected plugin","@module":"agent","@timestamp":"2021-09-28T03:01:35.888923Z","name":"exec","plugin_version":"0.1.0","type":"driver"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"detected plugin","@module":"agent","@timestamp":"2021-09-28T03:01:35.888964Z","name":"qemu","plugin_version":"0.1.0","type":"driver"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"detected plugin","@module":"agent","@timestamp":"2021-09-28T03:01:35.889004Z","name":"java","plugin_version":"0.1.0","type":"driver"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"detected plugin","@module":"agent","@timestamp":"2021-09-28T03:01:35.889047Z","name":"docker","plugin_version":"0.1.0","type":"driver"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"detected plugin","@module":"agent","@timestamp":"2021-09-28T03:01:35.889086Z","name":"containerd-driver","plugin_version":"v0.9.2","type":"driver"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"restored from snapshot","@module":"nomad.raft","@timestamp":"2021-09-28T03:01:36.878921Z","id":"1265-166700-1632690202465"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"initial configuration","@module":"nomad.raft","@timestamp":"2021-09-28T03:01:38.951014Z","index":169623,"servers":"[{Suffrage:Voter ID:e8a0f82e-c49f-183b-9872-9bdd985591ed Address:192.168.1.22:4647} {Suffrage:Voter ID:39e922d7-06aa-c586-4bd3-61e6a7959d50 Address:192.168.1.23:4647} {Suffrage:Voter ID:7ace520d-51fb-9c97-e791-5368f94a79cf Address:192.168.1.24:4647}]"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"entering follower state","@module":"nomad.raft","@timestamp":"2021-09-28T03:01:38.951436Z","follower":{},"leader":""}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"serf: EventMemberJoin: dc2.global 192.168.1.22","@module":"nomad","@timestamp":"2021-09-28T03:01:39.034293Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"serf: Attempting re-join to previously known node: dc3.global: 192.168.1.23:4648","@module":"nomad","@timestamp":"2021-09-28T03:01:39.035054Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"starting scheduling worker(s)","@module":"nomad","@timestamp":"2021-09-28T03:01:39.035401Z","num_workers":2,"schedulers":["service","batch","system","_core"]}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"using state directory","@module":"client","@timestamp":"2021-09-28T03:01:39.036102Z","state_dir":"/var/lib/nomad/client"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"using alloc directory","@module":"client","@timestamp":"2021-09-28T03:01:39.036470Z","alloc_dir":"/var/lib/nomad/alloc"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"adding server","@module":"nomad","@timestamp":"2021-09-28T03:01:39.037005Z","server":{"Name":"dc2.global","ID":"e8a0f82e-c49f-183b-9872-9bdd985591ed","Region":"global","Datacenter":"datacenter","Port":4647,"Bootstrap":false,"Expect":2,"MajorVersion":1,"MinorVersion":1,"Build":{},"RaftVersion":3,"Addr":{"IP":"192.168.1.22","Port":4647,"Zone":""},"RPCAddr":{"IP":"192.168.1.22","Port":4647,"Zone":""},"Status":1,"NonVoter":false}}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"initialized cpuset cgroup manager","@module":"client.cpuset_manager","@timestamp":"2021-09-28T03:01:39.039238Z","cpuset":"0-1","parent":"/nomad"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"serf: EventMemberJoin: dc4.global 192.168.1.24","@module":"nomad","@timestamp":"2021-09-28T03:01:39.045694Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"warn","@message":"memberlist: Refuting a suspect message (from: dc2.global)","@module":"nomad","@timestamp":"2021-09-28T03:01:39.045862Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"serf: EventMemberJoin: dc3.global 192.168.1.23","@module":"nomad","@timestamp":"2021-09-28T03:01:39.050576Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"serf: Re-joined to previously known node: dc3.global: 192.168.1.23:4648","@module":"nomad","@timestamp":"2021-09-28T03:01:39.051615Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"adding server","@module":"nomad","@timestamp":"2021-09-28T03:01:39.051944Z","server":{"Name":"dc4.global","ID":"7ace520d-51fb-9c97-e791-5368f94a79cf","Region":"global","Datacenter":"datacenter","Port":4647,"Bootstrap":false,"Expect":2,"MajorVersion":1,"MinorVersion":1,"Build":{},"RaftVersion":3,"Addr":{"IP":"192.168.1.24","Port":4647,"Zone":""},"RPCAddr":{"IP":"192.168.1.24","Port":4647,"Zone":""},"Status":1,"NonVoter":false}}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"adding server","@module":"nomad","@timestamp":"2021-09-28T03:01:39.052122Z","server":{"Name":"dc3.global","ID":"39e922d7-06aa-c586-4bd3-61e6a7959d50","Region":"global","Datacenter":"datacenter","Port":4647,"Bootstrap":false,"Expect":2,"MajorVersion":1,"MinorVersion":1,"Build":{},"RaftVersion":3,"Addr":{"IP":"192.168.1.23","Port":4647,"Zone":""},"RPCAddr":{"IP":"192.168.1.23","Port":4647,"Zone":""},"Status":1,"NonVoter":false}}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"cgroups are available","@module":"client.fingerprint_mgr.cgroup","@timestamp":"2021-09-28T03:01:39.503235Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"consul agent is available","@module":"client.fingerprint_mgr.consul","@timestamp":"2021-09-28T03:01:39.782723Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"warn","@message":"unable to parse speed","@module":"client.fingerprint_mgr.network","@timestamp":"2021-09-28T03:01:40.268638Z","device":"lo","path":"/sbin/ethtool"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"warn","@message":"error calling ethtool","@module":"client.fingerprint_mgr.network","@timestamp":"2021-09-28T03:01:40.292465Z","device":"dummy0","error":"exit status 75","path":"/sbin/ethtool"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"warn","@message":"unable to parse speed","@module":"client.fingerprint_mgr.network","@timestamp":"2021-09-28T03:01:40.316037Z","device":"docker0","path":"/sbin/ethtool"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"Vault is available","@module":"client.fingerprint_mgr.vault","@timestamp":"2021-09-28T03:01:40.453262Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"starting plugin manager","@module":"client.plugin","@timestamp":"2021-09-28T03:01:48.472072Z","plugin-type":"csi"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"starting plugin manager","@module":"client.plugin","@timestamp":"2021-09-28T03:01:48.472241Z","plugin-type":"driver"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"starting plugin manager","@module":"client.plugin","@timestamp":"2021-09-28T03:01:48.472707Z","plugin-type":"device"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"node registration complete","@module":"client","@timestamp":"2021-09-28T03:01:54.140636Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"started client","@module":"client","@timestamp":"2021-09-28T03:01:54.375250Z","node_id":"5f797733-ee77-52c6-559c-a828f3749d18"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"Nomad agent configuration:\n","@module":"agent","@timestamp":"2021-09-28T03:01:54.402568Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"   Advertise Addrs: HTTP: 192.168.1.22:4646; RPC: 192.168.1.22:4647; Serf: 192.168.1.22:4648","@module":"agent","@timestamp":"2021-09-28T03:01:54.402642Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"        Bind Addrs: HTTP: 0.0.0.0:4646; RPC: 0.0.0.0:4647; Serf: 0.0.0.0:4648","@module":"agent","@timestamp":"2021-09-28T03:01:54.402681Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"            Client: true","@module":"agent","@timestamp":"2021-09-28T03:01:54.402718Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"         Log Level: INFO","@module":"agent","@timestamp":"2021-09-28T03:01:54.402753Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"            Region: global (DC: datacenter)","@module":"agent","@timestamp":"2021-09-28T03:01:54.402794Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"            Server: true","@module":"agent","@timestamp":"2021-09-28T03:01:54.402831Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"           Version: 1.1.5","@module":"agent","@timestamp":"2021-09-28T03:01:54.402866Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"","@module":"agent","@timestamp":"2021-09-28T03:01:54.402899Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"Nomad agent started! Log data will stream in below:\n","@module":"agent","@timestamp":"2021-09-28T03:01:54.402928Z"}
Sep 28 03:01:54 dc2 nomad[2335]: {"@level":"info","@message":"starting retry join","@module":"agent.joiner","@timestamp":"2021-09-28T03:01:54.445159Z","servers":"provider=mdns service=nomad v6=false"}
Sep 28 03:01:54 dc2 nomad[2335]: 2021/09/28 03:01:54.445997 [ERR] mdns: Failed to bind to udp6 port: listen udp6 [ff02::fb]:5353: setsockopt: no such device
Sep 28 03:01:54 dc2 nomad[2335]: 2021/09/28 03:01:54.496919 [INFO] (runner) creating new runner (dry: false, once: false)
Sep 28 03:01:54 dc2 nomad[2335]: 2021/09/28 03:01:54.499453 [INFO] (runner) creating watcher
Sep 28 03:01:54 dc2 nomad[2335]: 2021/09/28 03:01:54.503440 [INFO] (runner) creating new runner (dry: false, once: false)
Sep 28 03:01:54 dc2 nomad[2335]: 2021/09/28 03:01:54.506417 [INFO] (runner) creating watcher
Sep 28 03:01:54 dc2 nomad[2335]: 2021/09/28 03:01:54.507195 [INFO] (runner) starting
Sep 28 03:01:54 dc2 nomad[2335]: 2021/09/28 03:01:54.509692 [INFO] (runner) starting
Sep 28 03:01:55 dc2 nomad[2335]: 2021/09/28 03:01:55.836461 [INFO] (runner) rendered "(dynamic)" => "/var/lib/nomad/alloc/48f6a15c-1a2f-798a-4557-7dd7a3500a78/caddy/local/Caddyfile"

Nomad Client logs (if appropriate)

N/A

@lgfa29
Copy link
Contributor

lgfa29 commented Sep 28, 2021

Thanks for flagging this us @josegonzalez.

I tracked down the lines you sent to here https://github.com/hashicorp/mdns/blob/master/client.go#L120 and here https://github.com/hashicorp/consul-template/blob/master/manager/runner.go#L176.

As you can notice, they both use log.Printf directly, which means the output goes straight to stdout.

The Consul team has been working on hcat which is the next generation for consul-template that is more suited to be used as a library. From a quick look, it doesn't seem like they are using the log package anymore.

Finding and refactoring other dependencies may be tricky though. I wonder if there's a way we could capture everything the Nomad process would try to write to stdout and filter out non-JSON lines 🤔

@josegonzalez
Copy link
Contributor Author

I think just starting with changing what is in this issue would be a good start, and if there are other locations found in the future, changing those then would be fine?

@lgfa29 lgfa29 self-assigned this Sep 28, 2021
@lgfa29
Copy link
Contributor

lgfa29 commented Oct 2, 2021

I think just starting with changing what is in this issue would be a good start

The problem is that these changes are not in Nomad, but in external libraries, and changing how libraries log messages usually affects most of the code base. Some of this work is already being done, such as in the case of hcat.

What Nomad could do is what I mentioned, which is to filter out, or wrap non-JSON messages. But this could have a performance impact since it would touch every log message, so we will need to scope and investigate this further.

@josegonzalez
Copy link
Contributor Author

That makes sense. Is there some way I can help push this issue forward?

@lgfa29
Copy link
Contributor

lgfa29 commented Oct 8, 2021

It turns out that hclog already provides this functionality, we were just using it wrong 😬

#11291 changes the way it's used so that messages are properly wrapped.

Thanks for reporting this!

@github-actions
Copy link

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants