-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Rollup] Log deprecation if config and request intervals are mixed #33284
[Rollup] Log deprecation if config and request intervals are mixed #33284
Conversation
Backport of elastic#32052, with some changes. In 6.5, we forbid mixed intervals (one fixed, one calendar). But for a 6.4.x bugfix release this would be a pretty extreme break. So we continue to allow mixed intervals in 6.4, but log a deprecation warning. 6.4 also doesn't enforce that the request is a multiple of the config. We also log a deprecation if that is not the case, but don't prevent the query. Finally, mixed intervals are only used if a better job cap can't be found. To support the case where we have to compare fixed and calendar, there is a table of rough conversions (e.g. 1 day == 86400000ms). This lets us crudely ensure the user doesn't try to query `2d` on an index that is set for `1w`.
Pinging @elastic/es-search-aggs |
intervals, the <<rollup-search>> API can aggregate on any time interval hourly or greater. Intervals that are less than an hour will throw | ||
an exception, since the data simply doesn't exist for finer granularities. | ||
Rollups are stored at a certain granularity, as defined by the `date_histogram` group in the configuration. This means you | ||
can only search/aggregate the rollup data with an interval that is greater-than or equal to the configured rollup interval. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't technically true, but I'm inclined to leave as-is to encourage proper config in 6.4 :)
A non-multiple wouldn't work, since the rolled up data wouldn't cleanly "overlap" with the buckets generated | ||
by the aggregation, leading to incorrect results. | ||
|
||
For that reason, an error is thrown if a whole multiple of the configured interval isn't found. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
Note: I also spotted a bug in the comparator, similar to the parsing bug that's being fixed. But I'd like to fix that in a followup PR because it applies to all branches. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks good @polyfractal , I left some comments
*/ | ||
static boolean validateMixedInterval(long requestInterval, DateHistogramInterval configInterval) { | ||
if (isCalendarInterval(configInterval)) { | ||
long configIntervalMillis = CALENDAR_ORDERING.getOrDefault(configInterval.toString(), Long.MAX_VALUE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to log the deprecation warning if the statement below is true ?
// TODO should be divisor of interval | ||
if (interval <= source.interval()) { | ||
// query interval must be gte the configured interval, and a whole multiple | ||
if (interval <= source.interval() && source.interval() % interval == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we also turn this breaking change in a deprecation warning ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
argh, yes. adding thanks
// The request must be gte the config. The CALENDAR_ORDERING map values are integers representing | ||
// relative orders between the calendar units | ||
long requestOrder = CALENDAR_ORDERING.getOrDefault(requestInterval.toString(), Long.MAX_VALUE); | ||
long configOrder = CALENDAR_ORDERING.getOrDefault(configInterval.toString(), Long.MAX_VALUE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can maybe replace the static map with:
DateTimeUnit configUnit = DateHistogramAggregationBuilder.DATE_FIELD_UNITS.get(configInterval.toString());
long configOrder = configUnit.field(DateTimeZone.UTC).getDurationField().getUnitMillis();
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't realize you could do this. ++
Tidied up, and added a few more tests for the multiples warning |
Backport of #32052, with some changes.
In 6.5, we forbid mixed intervals (one fixed, one calendar). But for a 6.4.x bugfix release this would be a pretty extreme break. So we continue to allow mixed intervals in 6.4, but log a deprecation warning.
6.4 also doesn't enforce that the request is a multiple of the config. So we log a deprecation if that is not the case, but don't prevent the query.
Finally, mixed intervals are only used if a better job cap can't be found.
To support the case where we have to compare fixed and calendar, there is a table of rough conversions (e.g. 1 day == 86400000ms). This lets us crudely ensure the user doesn't try to query
2d
on an index that is set for1w
.@jimczi Mind giving a quick review on this? The major change from the other PR is the addition of
validateMixedInterval()
, changes to theCALENDAR_ORDERING
table, and that we don't enforce multiples.