cmd/scollector: Send a hi metric with a value of 1 #1319

kylebrandt · 2015-09-15T01:32:17Z

This can be used to create an alert for scollector being alive and then used as a dependency.

gbrayut · 2015-09-15T02:48:34Z

Code looks good. Not sure what your plan is, but you might get better results using scollector.collect.queued instead. It should have the same frequency as this metric (every 15s by default), but a high count tells you there is more data waiting to be sent. In the average case scollector.collect.queued should function exactly the same as this metric (only set to 0 instead of 1), and after an outage it gives a better signal by indicating how many data points are waiting to be sent. Also, alerts on unknown values are tricky, since you might just be seeing old datapoints that are being burned down in the queue (I think we are FIFO, not LIFO).

Now that I think about it, something like a scollector.collect.queued_since metric, reporting the seconds since the latest timestamp in the last batch of sent metrics, might work best, since the alert logic could easily use it to decide whether to go crit or just warn that scollector is back online and burning down the queue. Not sure how long the burndown takes on average, but I know that if things get backed up significantly it can take a while to return to 0.

kylebrandt · 2015-09-15T02:54:28Z

The timestamp of the value 1 should be when the collector created it. What I like about the way I did it is that it is very straightforward in what it does. Using scollector.collect.queued as an additional check (or as additional logic) makes sense to me though, as does adding the scollector.collect.queued_since metric (though I'm not entirely sure which timestamp you mean there).

My plan is that this can be a dependency: if the host can be pinged but this metric isn't there, the problem isn't host down but rather scollector (down, not current, etc.).

cmd/scollector: Send a hi metric with a value of 1


34a380b

This can be used to create an alert for scollector being alive and then used as a dependency.

kylebrandt added a commit that referenced this pull request Sep 15, 2015

Merge pull request #1319 from bosun-monitor/sHi

b525d33

cmd/scollector: Send a hi metric with a value of 1

kylebrandt merged commit b525d33 into master Sep 15, 2015

kylebrandt deleted the sHi branch September 15, 2015 19:49
