Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

agent: add option to discard health output #3562

Merged
merged 7 commits into from
Oct 11, 2017
Merged

Conversation

magiconair
Copy link
Contributor

In high volatile environments consul will have checks with "noisy"
output which changes every time even though the status does not change.
Since the output is stored in the raft log every health check update
unblocks a blocking call on health checks since the raft index has
changed even though the status of the health checks may not have changed
at all. By discarding the output of the health checks the users can
choose a different tradeoff. Less visibility on why a check failed in
exchange for a reduced change rate on the raft log.

See fabiolb/fabio#368

@magiconair
Copy link
Contributor Author

This still needs a test and testing :)

@magiconair magiconair force-pushed the discard-health-output branch from 47d9105 to a744596 Compare October 10, 2017 01:59
@slackpad slackpad added this to the 1.0 milestone Oct 10, 2017
@magiconair
Copy link
Contributor Author

tests are done.

@magiconair magiconair requested a review from slackpad October 10, 2017 16:54
@magiconair magiconair self-assigned this Oct 10, 2017
@manos
Copy link

manos commented Oct 10, 2017

@magiconair thank you! 🥇

magiconair and others added 7 commits October 10, 2017 17:04
In high volatile environments consul will have checks with "noisy"
output which changes every time even though the status does not change.
Since the output is stored in the raft log every health check update
unblocks a blocking call on health checks since the raft index has
changed even though the status of the health checks may not have changed
at all. By discarding the output of the health checks the users can
choose a different tradeoff. Less visibility on why a check failed in
exchange for a reduced change rate on the raft log.
@slackpad slackpad force-pushed the discard-health-output branch from 4f1f7c0 to 7894536 Compare October 11, 2017 00:04
@slackpad slackpad merged commit 94f5819 into master Oct 11, 2017
@slackpad slackpad deleted the discard-health-output branch October 11, 2017 00:04
@123BLiN
Copy link

123BLiN commented Oct 17, 2017

@magiconair Thank for this fix! We've been waiting for it since the version 0.6.3 :)
However one question - will the health check output be visible on the agent in order to query/log it for troubleshooting?

Thanks!
Roman

@magiconair
Copy link
Contributor Author

@123BLiN No. That's the tradeoff.

@123BLiN
Copy link

123BLiN commented Oct 20, 2017

@magiconair
I may confirm thta you fix is working as expected, however I see health checks output on the agent itself with the consul monitor -log-level-DEBUG :) and this is awesome :)

@magiconair
Copy link
Contributor Author

Oh this is cool. That didn't even occur to me.

michaelw pushed a commit to michaelw/consul that referenced this pull request Jan 11, 2018
Version 1.0.0

* tag 'v1.0.0': (455 commits)
  Release v1.0.0
  Puts the tree in 1.0 final release mode.
  Fixes an XSS issue with unescaped node names. (hashicorp#3578)
  Adds a note about the Raft protocol not being the same as the Consul protocol.
  Adds a 1.0 section to the upgrade guide and cleans up the change log.
  Update sentinel.html.markdown.erb
  Update dns forwarding documentation (hashicorp#3574)
  Adds a brief wait and poll period to update check status after a timeout. (hashicorp#3573)
  Cleans up some drift between the OSS and Enterprise trees.
  Clarify the docs around script check timeout behavior
  Updates the change log.
  Kill check processes after the timeout is reached (hashicorp#3567)
  Updates the change log.
  retry locks on network errors (hashicorp#3553)
  Fix example code formatting in godoc
  config: remove redundant code
  config: fix check for segment.port <= 0 and add test
  Adds check to make sure port is given so we avoid a nil bind address.
  Removes obsolete segment stub.
  agent: add option to discard health output (hashicorp#3562)
  ...
@sandstrom
Copy link
Contributor

@magiconair If I have a check like this:

{
  "args": [
    "/bin/sh",
    "-c",
    "\"curl --silent http://localhost:9200/_cluster/health | grep green\""
  ],
  "interval": "10s"
}

The output will be something like this:

{"cluster_name":"app-cluster","status":"green","timed_out":false}

and the unix exit code will be 0.

Without the discard option, will the terminal output be written to raft? Or is it something else that might be written to raft?

The way I understood the consul docs was that only the exit code of a script check is considered.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants