[consul] Added maintenance status for metrics #2496
@brumschlag, thank you very much for your contribution. It's very much appreciated! Unfortunately, we're not going to be able to add any more new features before the next release. It takes too long to test all of them, especially when we're in the middle of moving to a new model for how we package and distribute checks. This feature makes sense, however, and I think we'd happily add it into the check afterwards.
Currently, if a node in Consul is put into maintenance state, it is reported to Datadog metrics as a node in critical state. This makes it hard to tell from the metric whether the node is failing or undergoing a planned intervention. A user opened a PR against Datadog some time ago, but it was not merged because of Datadog code organization changes and was then forgotten (DataDog/dd-agent#2496). I'm pushing the change forward. I've tested it using dd-agent version 5.8.0. This PR also depends on DataDog/dd-agent#3708
Closing as superseded by DataDog/integrations-core#1267
A node in 'maintenance' mode is a different condition from a service being critical, and it should be measured separately.
This PR adds a 'maintenance' status: it looks for a check named _node_maintenance. If that check is critical, the node and its services are marked with 'maintenance' status instead of critical.
This is helpful when new nodes come up in maintenance mode, ready to be turned on.
It is also useful for alerting only on 'critical' cases and not on 'maintenance'.
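
The rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `effective_status` and the constant are hypothetical, and it assumes each check is a dict shaped like an entry from Consul's health API, with `CheckID` and `Status` fields, where the maintenance check is identified by `CheckID`.

```python
# Hypothetical sketch of the maintenance rule; not the actual check code.
# Assumption: checks are dicts with "CheckID" and "Status" keys, as returned
# by Consul's health endpoints.
NODE_MAINTENANCE_CHECK = "_node_maintenance"


def effective_status(checks):
    """Return the status to report for one node, given its list of checks."""
    # A critical _node_maintenance check means the node was deliberately
    # taken out of service, so report 'maintenance' rather than 'critical'.
    in_maintenance = any(
        c.get("CheckID") == NODE_MAINTENANCE_CHECK
        and c.get("Status") == "critical"
        for c in checks
    )
    if in_maintenance:
        return "maintenance"

    # Otherwise report the worst status seen across the node's checks.
    statuses = {c.get("Status") for c in checks}
    for level in ("critical", "warning"):
        if level in statuses:
            return level
    return "passing"
```

With this shape, a monitor that alerts on 'critical' stays quiet for a node whose only critical check is `_node_maintenance`, while a genuinely failing service check is still reported as 'critical'.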