EQL: Introduce sequence internal paging #58859
Refactor the sequence matching classes to decouple querying from results consumption (and matching). Rename some classes to better convey their intent. Introduce internal pagination of the sequence algorithm: fetch the data in slices and, if needed, move forward to find more matches until either the dataset is consumed or the desired number of results is found.
Pinging @elastic/es-ql (:Query Languages/EQL)
/**
 * Executable tracking sequences at runtime.
 */
class Matcher {
This used to be inside SequenceRuntime but has now been extracted and takes care only of matching.
 * This allows the window to find any follow-up results even if they are found outside the initial window
 * of a base query.
 */
public class TumblingWindow implements Executable {
This is the querying/pagination part from SequenceRuntime; in this version it introduces two new things:
- range queries bounded to the current window (instead of open-ended ones, which made it hard to reason about whether a sequence should be matched or data discarded)
- pagination/advancement: if there's no more data in the current query, keep looking in the other queries (as long as they have something to match against).
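For readers skimming the thread, the two points above can be sketched roughly as follows. This is a simplified illustration and not the code in this PR: `WindowSketch`, `rangeQuery` and `findPairs` are hypothetical names, events are reduced to distinct timestamps sorted in ascending order, and join keys, tiebreakers and the real matching rules are left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical two-stage sketch of windowed querying with advancement. */
class WindowSketch {

    /** Stand-in for a search request: up to {@code size} timestamps in (afterExclusive, untilInclusive]. */
    static List<Long> rangeQuery(List<Long> sorted, long afterExclusive, long untilInclusive, int size) {
        List<Long> page = new ArrayList<>();
        for (long ts : sorted) {
            if (ts > afterExclusive && ts <= untilInclusive && page.size() < size) {
                page.add(ts);
            }
        }
        return page;
    }

    /** Pairs each base event with the next later follow-up event, fetching base events one page at a time. */
    static List<long[]> findPairs(List<Long> base, List<Long> followUp, int pageSize, int limit) {
        List<long[]> matches = new ArrayList<>();
        Deque<Long> pending = new ArrayDeque<>(); // base events still waiting for a follow-up
        long windowStart = Long.MIN_VALUE;

        while (matches.size() < limit) {
            // Next slice of the base query, starting after the previous window.
            List<Long> page = rangeQuery(base, windowStart, Long.MAX_VALUE, pageSize);
            if (page.isEmpty() && pending.isEmpty()) {
                break; // dataset consumed and nothing left to match against
            }
            // The window ends at the last base hit; once the base query is
            // exhausted, scan the remainder of the follow-up data instead.
            long windowEnd = page.isEmpty() ? Long.MAX_VALUE : page.get(page.size() - 1);
            pending.addAll(page);

            // Range query bounded to the current window rather than open-ended.
            for (long ts : rangeQuery(followUp, windowStart, windowEnd, Integer.MAX_VALUE)) {
                if (matches.size() == limit) {
                    break;
                }
                if (!pending.isEmpty() && pending.peekFirst() < ts) {
                    matches.add(new long[] { pending.pollFirst(), ts });
                }
            }
            if (page.isEmpty()) {
                break; // base exhausted and the tail has been scanned
            }
            windowStart = windowEnd; // advance to the next window
        }
        return matches;
    }
}
```

With base timestamps [1, 5, 9], follow-ups [2, 3, 12] and a page size of 2, this yields the pairs (1, 2) and (5, 12); the second pair is only found because the search keeps going in the follow-up data after the base query runs dry.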
LGTM
String timestampName = Expressions.name(timestamp);
String tiebreakerName = Expressions.isPresent(tiebreaker) ? Expressions.name(tiebreaker) : null;

Criterion<QueryRequest> base = null;
This one doesn't seem to actually be used.
BoxedQueryRequest boxedRequest = new BoxedQueryRequest(original, timestampName, tiebreakerName);
Criterion<BoxedQueryRequest> criterion =
        new Criterion<>(i, boxedRequest, keyExtractors, tsExtractor, tbExtractor, i> 0 && descending);
i > 0
descending can be false, so i > 0 && descending != i > 0
I think it was about the formatting (missing space)
Refactor the sequence matching classes to decouple querying from results consumption (and matching). Rename some classes to better convey their intent. Introduce internal pagination of the sequence algorithm: fetch the data in slices and, if needed, move forward to find more matches until either the dataset is consumed or the desired number of results is found. (cherry picked from commit bcf2c11)