fix(query engine): Include lines with ts equal to end timestamp of the query range when executing range aggregations (#13448)

**Background**

When performing range vector aggregations, such as `count_over_time({env="dev"}[1h])`, the query range is divided into multiple steps at which the aggregation operation (e.g. counting the log lines) is evaluated. Each step starts at `current step - step interval` and ends at `current step`, as depicted in the following chart. The select range for the logs is extended by the `step interval` into the past, in order to select logs for calculating the first step.

![screenshot_20240711_092352](https://github.com/grafana/loki/assets/281260/9ca6eaf5-148e-4743-aefa-6ff7071d64ad)

However, the select range for logs is `start` inclusive and `end` exclusive (written as `[start, end)`), but the evaluation of the steps for the range aggregation is `start` exclusive and `end` inclusive (written as `(start, end]`). As a result, the very first timestamp at the beginning of the select range and the very last timestamp at the end of the select range are not included in the range aggregation.

The "missing" last timestamp is not a problem, because a) in an instant query it is not supposed to be included anyway because of the `[start, end)` inclusivity of the query range and b) in a range query the last point of the previous step will be part of the next step evaluation.

**Issue**

The missing first timestamp, however, becomes problematic when executing an instant query and the log timestamps are exactly at the start of the query range. This can happen when the query is split in the query frontend into multiple smaller time ranges, e.g. `1h`, `30m`, ... Since the sub queries are executed independently on the queriers, all logs whose timestamp is exactly a multiple of the split interval, e.g. 00:00, 01:00, 02:00, ... for a 1h interval, are dismissed and therefore missing in the query result over the full time range of the original query.

**Fix**

In order to avoid the missing logs that have a timestamp a multiple of the split interval in instant queries, we need to adjust the query range for logs to also include the `end` timestamp (written as `[start, end]`). This is done by adding a "leap nanosecond" to the `end` timestamp of the log select range. This ensures that the included `end` timestamp of the step evaluation is also included in the log selection.

---

Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Showing 5 changed files with 34 additions and 33 deletions.