[lens] Add "counter rate" for monotonically increasing numbers #46627
Comments
Pinging @elastic/kibana-app |
@simianhacker Can you show an example of what this looks like when visualized? |
Copying over details of #58189: Kibana's basic visualizations, or Lens, should have a way to convert your data count to a rate per unit of time (e.g. requests per second). This is the usual way of thinking of metrics ("each of my instances can do 500 RPS") and it should be made easily available. As mentioned to @AlonaNadler, many metrics we look at and compare with other systems are typically expressed as a rate, e.g. "requests per second" or "clients per hour". Usually data in ES is of discrete form: you get 1 document per event (e.g. logs). How do you plot your "logs per second"? There are some answers out there in Discuss but most of them are wrong:
|
@agirbal Side note... For TSVB, If you set the interval to something like There are some other tricks you can employ like using The tricky part is abstracting all of this away from the user, using "rate" on a number that is not monotonically increasing will almost always produce something the user doesn't want, except when they know what they are doing. Until we have a concept of "number types" (ie |
@simianhacker thanks for the info! I didn't think of that trick "cumulative sum + derivative", that's a good one, is it pretty much how it could get implemented under the hood anyway? I think the feature of "sample rate" could cleanly be attached to the From there you can have another general setting It's a good idea to abstract / automate it all from the user if possible in the future, just I don't think this one is a very complex concept, would love to be able to simply do RPS in TSVB :) |
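For readers unfamiliar with the "cumulative sum + derivative" trick mentioned above, here is a minimal sketch of one reading of it. It is not taken from TSVB: the function name `event_rate` and its parameters are hypothetical, and the input is assumed to be a list of document counts per date histogram bucket.

```python
from itertools import accumulate
from typing import List, Optional


def event_rate(
    counts: List[int], bucket_seconds: float, unit_seconds: float = 1.0
) -> List[Optional[float]]:
    """Events per time unit, via cumulative sum followed by a derivative.

    Hypothetical helper for illustration only; not a TSVB or Lens API.
    """
    if not counts:
        return []
    # Step 1: the cumulative sum turns per-bucket counts into a growing counter.
    cumulative = list(accumulate(counts))
    # Step 2: the derivative of that counter, normalized to the chosen unit.
    rates: List[Optional[float]] = [None]  # the first bucket has no predecessor
    for previous, current in zip(cumulative, cumulative[1:]):
        rates.append((current - previous) * unit_seconds / bucket_seconds)
    return rates


# Three 1-minute buckets holding 600, 1200 and 300 log documents.
print(event_rate([600, 1200, 300], bucket_seconds=60))
```

The derivative of a cumulative sum is just the original bucket count again, so the useful part is the normalization to a fixed unit, which makes the result independent of the bucket interval.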
I was talking with a colleague last week about how we should just add "rate" to TSVB that essentially does |
Just created PR for adding rate to TSVB: #59843 |
@agirbal Your request might not actually be solved by this issue, but I think it is still important to track. I wrote up an issue describing what I would call Average event rate, which is different from the kind of rate that @simianhacker is describing in this request. |
Tracking this in Elasticsearch because it might be possible to get a more-correct implementation in Elasticsearch. The main reason is that counter resets in the middle of a bucket should be handled, and the client-side implementation is only able to throw it away. |
@AlonaNadler @cchaos We have been calling this the "positive rate" in TSVB, but with a sentence indicating that this should only be used for monotonically increasing numbers. I would expect us to have exactly the same name and descriptive text. |
There are several decisions that we have not finalized:
More questions will probably be raised once we settle these first few. |
In terms of naming, my preference is "Positive rate" or "Counter rate", because these names clearly indicate that there is something unusual about this function. I am opposed to calling it "Rate" because it's a clearly confusing name (evidence is that we discussed the name and meaning for weeks). Using the word "growth" is also confusing because it means something different in a business context than what it means here. Can we settle on the name "Positive rate"? |
I believe we've settled on "Counter rate" after discussion. Updating the title. |
Based on a suggestion by @exekias in the parallel Elasticsearch issue, I think we should slightly tweak the algorithm that TSVB is using. The main tweak is that when the value decreases, we should use the new value instead of resetting it to zero. This is expressed in pseudocode the following way:

    rate = 0
    loop_over_values(lambda (current, previous):
        if current >= previous:
            rate = rate + (current - previous)
        else:
            rate = rate + current
    )

For the types of counters that we've considered so far this algorithm is going to appear more correct by avoiding sudden drops to zero. |
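The pseudocode above can be made runnable roughly as follows. This is a sketch, not the Lens or TSVB implementation: the name `counter_rate` is hypothetical, it returns the increase per bucket instead of a single accumulated total, and its handling of empty buckets (`None`) is an assumption that the thread does not specify.

```python
from typing import List, Optional


def counter_rate(values: List[Optional[float]]) -> List[Optional[float]]:
    """Per-bucket increase of a monotonically increasing counter.

    A drop in the value is treated as a counter reset, and the new value
    is used as the increase for that bucket.
    """
    rates: List[Optional[float]] = []
    previous: Optional[float] = None
    for current in values:
        if current is None or previous is None:
            # Assumption: no value for the first bucket or for an empty bucket.
            rates.append(None)
        elif current >= previous:
            rates.append(current - previous)
        else:
            # Counter reset: count the new value instead of dropping to zero.
            rates.append(current)
        if current is not None:
            previous = current
    return rates


# The counter resets between 210 and 30.
print(counter_rate([100, 150, 210, 30, 80]))
```

Summing the per-bucket values (50 + 60 + 30 + 50) gives the same total of 190 that the pseudocode would accumulate in `rate` for this input.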
Reopening as the UI part is still missing |
Closed by #84384 |
Goal statement
Lens should support the "positive growth rate" aggregation, used for showing the rate of increase for a monotonic counter such as network traffic, with time scaling in date units such as 1s, 60s, etc.
Example
Visualizing network traffic per second, scaled down from 1 hour intervals. It would show 2Mb/s.
Decisions to be made
The most correct way of calculating this value is by implementing a new aggregation in Elasticsearch. Should we wait for this new aggregation, or implement a workaround in Kibana? The workaround will not produce the same numbers in all cases, because Elasticsearch is able to handle more edge cases than we can.
Decision:
For our first version we will implement the logic client side using the same approach as TSVB uses today.
UI
Rate will be a separate operation which can be chosen like sum/min/max/avg/... It lets the user pick a field and a time unit.
The operation should be called "Rate". The unit always has to be picked.
As this operation requires a date histogram to make sense, we need to handle the case where no date histogram is available (both when we already have a rate operation and when we don't):
Implementation
This behavior will be implemented as a separate Lens-private expression function that calculates the derivative/"positive only" value based on the Elasticsearch max metric of the selected field.
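The time scaling described in the goal statement could look roughly like the sketch below. It assumes the per-bucket increase has already been computed from the max metric. The helper name `scale_to_unit`, the `UNIT_SECONDS` table, and the sample numbers are hypothetical and not part of Lens.

```python
# Hypothetical unit table; a real implementation may support other units.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def scale_to_unit(increase: float, bucket_seconds: float, unit: str = "s") -> float:
    """Scale a per-bucket counter increase to a rate per chosen time unit."""
    return increase * UNIT_SECONDS[unit] / bucket_seconds


# Illustrative numbers: a counter that grew by 7200 Mb within a 1 hour bucket.
print(scale_to_unit(7200, bucket_seconds=3600, unit="s"))  # Mb per second
print(scale_to_unit(7200, bucket_seconds=3600, unit="m"))  # Mb per minute
```

With these made-up inputs the per-second value is 2.0, which matches the shape of the example in the goal statement (1 hour intervals shown as a per-second rate).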
Prior discussion
This is how the Infra UI calculates rates for monotonically increasing numbers like system.network.out.bytes for both the Inventory View and the Metrics Explorer. An added bonus is if we had rate as an option, the Metrics Explorer could link to Lens instead of TSVB.