This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

implement nudging similar to graphite. #647

Merged
merged 4 commits into master from runtime-consolidation-nudging on Jun 27, 2017

Conversation


@Dieterbe Dieterbe commented Jun 1, 2017

seems to work fine, but still have to make unit tests

@Dieterbe Dieterbe force-pushed the runtime-consolidation-nudging branch 6 times, most recently from aa05b72 to 50ce641 Compare June 1, 2017 15:56
@Dieterbe Dieterbe requested review from woodsaj and DanCech and removed request for DanCech and woodsaj June 1, 2017 15:56

Dieterbe commented Jun 1, 2017

wanna do some more work on this first.

@Dieterbe Dieterbe force-pushed the runtime-consolidation-nudging branch from 50ce641 to 7ddec47 Compare June 2, 2017 14:36
* move nudging-based consolidation to consolidation package
  this will make it easier to unit test, as opposed to having it in the
  plan.
* since our aggregation returns points that are clean multiples of the
  new interval, and we don't want to move information into the past
  (which would be lying: e.g. a spike of data should move to a later
  timestamp, not an earlier one), we need to adjust the nudging logic
  to treat the point after a clean interval as the first one of any
  bucket (see the sketch below).
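(For illustration, a minimal Go sketch, not code from this PR: which bucket each raw point lands in, assuming a 10s raw interval consolidated to 60s; the round-up arithmetic is my reading of the rule above.)

package main

import "fmt"

func main() {
    // assumed for illustration: raw points every 10s, consolidated to 60s
    pre, post := uint32(10), uint32(60)
    for ts := uint32(50); ts <= 130; ts += pre {
        // a clean multiple of post is the last point of its bucket,
        // so a point's bucket ends at its ts rounded up to the next clean multiple
        bucketEnd := ((ts + post - 1) / post) * post
        fmt.Println(ts, "->", bucketEnd)
    }
    // output: 50 and 60 map to bucket 60; 70 through 120 map to bucket 120;
    // 130 maps to 180. i.e. the point right after a clean multiple starts a new bucket
}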
@Dieterbe Dieterbe force-pushed the runtime-consolidation-nudging branch from 7ddec47 to ffc134c Compare June 2, 2017 14:41

Dieterbe commented Jun 2, 2017

at this point, i'm pretty happy with it. it is simple and ensures MaxDataPoints is always honored (especially important when mdp=1), unlike the other approach i explored and described in the comments but decided not to pursue because it was too complex.

for a demo see https://vimeo.com/220016610

i start fakemetrics with secondly points, and then visualize the points with a few MDP settings.
you can see that there's never any jumping of data, except at the end, when a runtime-consolidated point is being computed more accurately as more data for it comes in (and also some jumping when the scale is adjusted to accommodate the points, which is normal).
at the front you can see that sometimes there is a gap in the data that is slightly longer than the space between points (as opposed to showing an aggregated point). in the future maybe we can get more advanced: take runtime consolidation into account when fetching, fetch all the data needed to go into the first point, etc. but for now I think this will do. also, the size of the gap is exaggerated here because of the low MDP values / large space between points.

@Dieterbe Dieterbe requested review from woodsaj and DanCech June 2, 2017 15:04
@Dieterbe

ping @DanCech @woodsaj review plz

// move start forward until it maps to the first point of an aggregation bucket.
// since a clean multiple of the new postAggInterval is the last point to go into a bucket,
// we want the point that comes preAggInterval after it.
remainder := (start - preAggInterval) % postAggInterval
Contributor

I'm not entirely clear why we subtract preAggInterval here.

Contributor Author

I have pushed a commit which should explain this stuff (and some of the other stuff) better. let me know what you think / if that clears it up.
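(To spell out the subtraction with a minimal sketch: assumed values preAggInterval=10, postAggInterval=60; the adjustment of start after computing remainder is my assumption about the surrounding code, as only the remainder line appears in the excerpt above.)

package main

import "fmt"

func main() {
    preAggInterval := uint32(10)  // raw point spacing
    postAggInterval := uint32(60) // spacing after runtime consolidation
    start := uint32(85)

    // points 70, 80, ..., 120 form the bucket stamped ts=120, so its first
    // point is 70 = 60 + preAggInterval. subtracting preAggInterval before
    // taking the modulo makes exactly the first points of buckets yield 0.
    remainder := (start - preAggInterval) % postAggInterval // (85-10) % 60 = 15
    if remainder != 0 {
        // assumed adjustment: advance start to the first point of the next bucket
        start += postAggInterval - remainder // 85 + 45 = 130
    }
    fmt.Println(start) // 130, which is 120 + preAggInterval: a first point
}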


DanCech commented Jun 21, 2017

After spending some time going over this and the graphite implementation with @replay, I think it makes sense; just that one question above. I'm going to dig deeper on the graphite side and figure out what the story is with its approach:

nudge = secondsPerPoint + (series.start % series.step) - (series.start % secondsPerPoint)

while it seems like it should be:

nudge = secondsPerPoint - ((series.start - 1) % secondsPerPoint) - 1

(the -1 is so that when the start time falls on the post-agg interval boundary it doesn't get nudged across a full interval)
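(A quick numeric check of the two formulas, my own sketch with assumed values step=10 and secondsPerPoint=60, not code from graphite or this PR:)

package main

import "fmt"

func main() {
    step, secondsPerPoint := 10, 60
    for _, start := range []int{120, 130} {
        graphite := secondsPerPoint + (start % step) - (start % secondsPerPoint)
        proposed := secondsPerPoint - ((start - 1) % secondsPerPoint) - 1
        fmt.Println(start, graphite, proposed)
    }
    // start=120 (on the boundary): graphite nudges 60 (a full interval), proposed nudges 0
    // start=130: both nudge 50
}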

// with our nudging approach, we strip all the points that do not form a full aggregation bucket.
// IOW the desired first point must be the first point of a bucket.
// in the example above, it means we strip the first 2 points and start at ts=25 (which will go into ts=40).

Contributor

Interesting. This actually seems to be different from the way graphite does things (based on my reading of the code), where the timestamp of the aggregated points is the start of the interval rather than the end.

@Dieterbe Dieterbe Jun 22, 2017

it is. it's also how we've implemented runtime consolidation in MT since forever.
it's based on a guiding principle i came up with, but which afaik is not often discussed or cared about in the wider graphite ecosystem, and as a result is implemented inconsistently across tools (e.g. statsd also postmarks, like MT). see https://grafana.com/blog/2016/03/03/25-graphite-grafana-and-statsd-gotchas/#timestamps for details.
basically my point is that data should never move into the past (which breaks the space-time continuum: observations cannot precede the event they observe), but people can expect data to be delayed (and move into the future instead, which is just a delay and doesn't break the laws of physics). e.g. visually, a spike should never appear on a chart at an earlier time than when it actually happened, and people should expect that with consolidation, data will move a bit to the right. that way, whenever you look at a point in time (whether aggregated or not), you always know it happened at some point since the previous timestamp.

perhaps this is just one of my quixotic quirks? is there a flaw in my reasoning?
especially now that we care more about stricter compatibility with graphite, do we want to undo this and do things the graphite way? though i would rather keep the postmarking, i'm open to discussion and/or change.
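(To make the postmarking concrete, a minimal sketch, not MT's actual code: each bucket is stamped with its end timestamp, so consolidated data only ever moves later in time, never earlier.)

package main

import "fmt"

type point struct {
    val float64
    ts  uint32
}

// consolidateSum sums sorted input points into buckets of width postAgg,
// stamping each bucket with its end timestamp (postmarking)
func consolidateSum(in []point, postAgg uint32) []point {
    var out []point
    for _, p := range in {
        // round up to the next clean multiple of postAgg: the bucket's end
        end := ((p.ts + postAgg - 1) / postAgg) * postAgg
        if len(out) > 0 && out[len(out)-1].ts == end {
            out[len(out)-1].val += p.val
        } else {
            out = append(out, point{p.val, end})
        }
    }
    return out
}

func main() {
    // raw points every 10s, with a spike at ts=30
    in := []point{{1, 10}, {1, 20}, {5, 30}, {1, 40}, {1, 50}, {1, 60}}
    fmt.Println(consolidateSum(in, 40))
    // [{8 40} {2 80}]: the spike at ts=30 shows up at ts=40 (later), never earlier
}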

Contributor

Not at all, I think your reasoning makes sense and it's also how things like movingAverage work in graphite, so I'm seriously considering a PR to change graphite to work according to your logic. Will need to chew on that a little.

Contributor

I agree that it would make more sense to change Graphite than to change the current MT implementation (although I wouldn't argue based on the theory of relativity).

@Dieterbe

can i get an approve on this?

@replay replay left a comment

LGTM

@Dieterbe Dieterbe merged commit e4870e0 into master Jun 27, 2017
@Dieterbe Dieterbe deleted the runtime-consolidation-nudging branch September 18, 2018 09:07