Skip to content
This repository has been archived by the owner on Apr 2, 2024. It is now read-only.

Fix broken pushdown queries #1098

Merged
merged 1 commit into from
Feb 11, 2022
Merged

Fix broken pushdown queries #1098

merged 1 commit into from
Feb 11, 2022

Conversation

JamesGuthrie
Copy link
Contributor

@JamesGuthrie JamesGuthrie commented Feb 1, 2022

Description

Under certain circumstances, pushdown queries using promscale-extension
aggregates fail.

The failure comes from the fact that the start and end parameters passed
to the aggregate don't correctly align with the actual data which is
aggregated over by the query.

A call to the prom_rate aggregate looks something like the following:

SELECT
    prom_rate(start, end, step_size, range, sample_time, sample_value)
FROM a_metric
WHERE a_metric.t >= data_start
  AND a_metric.t <= data_end

If the time span [data_start, data_end] delivers results which are
outside of the time span [start, end], the aggregate function errors.

In general, this bug appeared when an aggregate pushdown function (such
as PromQL rate) is used together with a vector selector, such that the
lookback of the vector selector is larger than that of the aggregate
pushdown.

The fix is to build the SQL query using the correct values for
data_start and data_end.

Merge requirements

Please take into account the following non-code changes that you may need to make with your PR:

  • CHANGELOG entry for user-facing changes
  • Updated the relevant documentation

@JamesGuthrie JamesGuthrie force-pushed the jg/fix-broken-pushdown branch 4 times, most recently from 6b5cc12 to d9974ff Compare February 8, 2022 09:59
//
// Depending on the query, we may need to look further back in time than
// the `start` to deliver results from `start` onwards. For instance, a
// range query such as `rate(metric_one[5m])` in the range `(T1, T2)` will
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

range queries do not have lookbacks. The range is as stated and the only thing that matters is the bucket width.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@JamesGuthrie JamesGuthrie marked this pull request as ready for review February 9, 2022 08:43
@JamesGuthrie JamesGuthrie requested a review from a team as a code owner February 9, 2022 08:43
@JamesGuthrie JamesGuthrie requested review from niksajakovljevic and removed request for a team February 9, 2022 08:43
@JamesGuthrie JamesGuthrie changed the title WIP: Fix issue with pushdown queries Fix broken pushdown queries Feb 9, 2022
@JamesGuthrie
Copy link
Contributor Author

This fixes timescale/promscale_extension#72

// The timeFilter's time range is determined by `findMinMaxTime` on the
// expression being evaluated, and is the widest span of _all_
// subexpressions (effectively [min(scan_start), end]).
// This means that, for instance, if the following vector selector and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I am getting a bit confused with wording here. From what I understand there is an instant vector selector and range vector selector. So in both cases it is vector selector.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this:

Suggested change
// This means that, for instance, if the following vector selector and
// This means that, for instance, if the following range selector and

@@ -2506,6 +2506,10 @@ func TestPromQLQueryEndpoint(t *testing.T) {
name: "two pushdowns, same metric different matchers",
query: `sum(rate(metric_2{foo = "bar"}[5m]))/sum(rate(metric_2[5m]))`,
},
{
name: "one pushdown, matrix",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you maybe add some tests with offset?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you be more specific? There are already quite a few tests which use offsets in promql_query_endpoint_test.go. Is there a specific case that you would like to exercise?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking something like: rate(metric_2[1m])/metric_2 offset 2m

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

filter := metadata.timeFilter
sh := metadata.selectHints
var start, end string
if sh != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Since this only apply when selectHints are present I think it would be nice to document in what cases are they set. Changing start/end time seems scary to me so I'd like us to have a good understanding when that happens.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Under certain circumstances, pushdown queries using promscale-extension
aggregates fail.

The failure comes from the fact that the start and end parameters passed
to the aggregate don't correctly align with the actual data which is
aggregated over by the query.

A call to the prom_rate aggregate looks something like the following:

SELECT
    prom_rate(start, end, step_size, range, sample_time, sample_value)
FROM a_metric
WHERE a_metric.t >= data_start
  AND a_metric.t <= data_end

If the time span [data_start, data_end] delivers results which are
outside of the time span [start, end], the aggregate function errors.

In general, this bug appeared when an aggregate pushdown function (such
as PromQL rate) is used together with a vector selector, such that the
lookback of the vector selector is larger than that of the aggregate
pushdown.

The fix is to build the SQL query using the correct values for
data_start and data_end.
@JamesGuthrie
Copy link
Contributor Author

FYI I've resolved conflicts and squashed.

Copy link
Contributor

@cevian cevian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work LGTM

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants