Skip to content
This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

Make summarize() output canonical in the !alignToFrom case #1849

Merged
merged 7 commits into from
Jun 17, 2020

Conversation

Dieterbe
Copy link
Contributor

fix #1811
fix #1845

disambiguate between on-demand runtime normalization and after-fetch
runtime normalization
...at least for a couple different functions, and also for summarize in
the !alignToFrom case.
See #1811 for more details
Before, it was common for output to contain an extraneous point that
lies before 'from' (and wouldn't be rendered by Grafana in graph
panels), which made the output invalid.

This would result in trouble when

* combining such a series with another series
  (#1811)
* or when needing normalization
  (#1845)

This effectively slightly changes the output format, but is more robust.
Copy link
Collaborator

@shanson7 shanson7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens in the alignToFrom=true case? Does the panic case remain?

@Dieterbe
Copy link
Contributor Author

Dieterbe commented Jun 17, 2020

well, alignToFrom is essentially an opt-in to turn your series into something that looks unlike anything else in metrictank (or graphite). You'll likely run into panics when trying to combine alignToFrom-summarized series with other series (as in #1811 and #1845) , which by the way, wouldn't make any sense if the timestamps aren't aligned. For this reason, I don't think it's worthwhile to spend on trying to "solve" something that doesn't seem to make any sense. If anything, I would rather discourage or deprecate use of this parameter, or at least when using this param and combining the output with other series, which triggers the problems.

Yes, the UX is not great, a clearer error would be better than a lowlevel panic (and we could do it for example by validating the plan to check for this bad usage, as explained in my comment in 1811) but so far i haven't seen any evidence of people trying to use summarize with alignToFrom and trying to combine that series with something else. So I think our engineering resources are better spent on other things.

@shanson7
Copy link
Collaborator

Fair points. Ultimately it would be better to have a clear error so the user knows how to solve it, but at this point it seems ok to.

Copy link
Collaborator

@shanson7 shanson7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Dieterbe Dieterbe merged commit 4610de9 into master Jun 17, 2020
@Dieterbe Dieterbe deleted the fix-1811-1845 branch June 17, 2020 12:21
@robert-milan
Copy link
Contributor

Overall I like this PR. It is a straight-forward and simple solution which solves the problem. I also agree with Dieter's statements about alignToFrom. Maybe we should look into deprecating it.

@shanson7
Copy link
Collaborator

shanson7 commented Jun 17, 2020

Deprecating alignToFrom? I think the use cases would need to be outlined before we consider that. There are valid cases that are difficult or impossible to do without it that would need valid replacements.

However, I do agree that normalization with other series that are not created using the same parameters does not make sense and can be an error case.

Edit:

Real quick examples:

  1. summarizing aligning to non-UTC timezones
  • e.g. summarize(..., '1d', 'avg',alignToFrom=true) where the from is aligned with a particular timezone gets the daily avg for that timezone.
  1. Performing consolidation before runtime consolidation (useful when feeding into a division).
  • e.g. If you want to guarantee that you consolidate a series to a single datapoint, then you can do summarize(..., '10y', 'sum',alignToFrom=true) (where 10y is an arbitrary long time range). Useful for tables/singlestat where you want a real average, not the average of averages.

@Dieterbe
Copy link
Contributor Author

e.g. If you want to guarantee that you consolidate a series to a single datapoint, then you can do summarize(..., '10y', 'sum',alignToFrom=true) (where 10y is an arbitrary long time range). Useful for tables/singlestat where you want a real average, not the average of averages.

oh my. talk about a hack :) I think this is more commonly achieved by consolidateBy(...,'sum')&maxDataPoints=1, which is also not very elegant.

I think ideally the render api should have a set of functions to reduce timeseries into single points

@shanson7
Copy link
Collaborator

oh my. talk about a hack :) I think this is more commonly achieved by consolidateBy(...,'sum')&maxDataPoints=1, which is also not very elegant.

No...not at all. If you want to sum the dividend and divisors before divideSeries, consolidateBy is useless as it happens too late.

@Dieterbe
Copy link
Contributor Author

Ah, that is a good point. But I still find this a contorted use of the alignToFrom and interval parameters. Not obvious to anyone but Graphite wizards.
Would rather provide first class support for reducing series to single points. This could be an interesting new project.

@shanson7
Copy link
Collaborator

Would rather provide first class support for reducing series to single points

That is exactly my point. I don't think this is a good thing, it's just as I said above:

There are valid cases that are difficult or impossible to do without it that would need valid replacements.

I'm just pointing out that we shouldn't deprecate an existing API without very good rationale and thought put into it.

This could be an interesting new project.

For sure. There are a lot of functions/behaviors of graphite that don't play nicely with each other (bound to happen with the sheer number of functions).

@Dieterbe
Copy link
Contributor Author

Yeah, all good. I think we're on the same page. 👍 the examples you listed are valid, and I appreciate you coming up with them, cause I wouldn't have thought of them myself. I wonder if the first one could also be solved using the tz parameter, but i have no time right now to look into it. It's probably a viz-only flag.

I think another valid use case for alignToFrom is:

  1. any time you run a job/task at a custom schedule and you want to align with that. (e.g. some cleanup worker, or marketing emails that go out a certain time,e tc)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

panic on query like asPercent(summarize(#A, '1min'), #A) Incorrect normalization causes panic
3 participants