Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make search functions translation aware #118355

Merged
merged 8 commits into from
Dec 13, 2024
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -141,6 +141,7 @@ static TransportVersion def(int id) {
public static final TransportVersion NEW_REFRESH_CLUSTER_BLOCK = def(8_803_00_0);
public static final TransportVersion RETRIES_AND_OPERATIONS_IN_BLOBSTORE_STATS = def(8_804_00_0);
public static final TransportVersion ADD_DATA_STREAM_OPTIONS_TO_TEMPLATES = def(8_805_00_0);
public static final TransportVersion ESQL_QUERY_BUILDER_IN_SEARCH_FUNCTIONS = def(8_806_00_0);

/*
* STOP! READ THIS FIRST! No, really,
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/

package org.elasticsearch.xpack.esql.core.expression;

import org.elasticsearch.xpack.esql.core.planner.TranslatorHandler;
import org.elasticsearch.xpack.esql.core.querydsl.query.Query;

public interface TranslationAware {
ChrisHegarty marked this conversation as resolved.
Show resolved Hide resolved
Query asQuery(TranslatorHandler translatorHandler);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make sense to centralize in this interface all the pushdown logic, including

  • capabilities, eg. can it be pushed down to Lucene at all? Maybe sometimes it depends on its inputs?
  • constraints, eg. does this function need to be pushed down to be executed? Does it also have an inline implementation that works without pushdown?

The first point is covered by PushFiltersToSource physical rule. I'm not sure if we want ad-hoc logic for each function or if we can simplify it to a single and generic behavior (ie. removing the if (exp instanceof Term). In both cases I'd rather have an exp instanceof TranslationAware there, so that one day we can also let other operators/functions implement it and remove that long list of instanceofs.

For the second point I'm thinking about the Verifier logic around FullTextFunctions. It's a complex topic that involves some deep understanding of how commands interact with each other, I put it here for completeness but probably we can address it in a follow-up PR

Copy link
Contributor Author

@ioanatia ioanatia Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for the review!

For the first point the PushFiltersToSource we need to look into it but I think we don't need the condition for Term - we removed a similar one for Match - see https://github.com/elastic/elasticsearch/pull/117555/files#r1858385620

For the second point, agreed we can do it in a follow up - I think it would be a big enough change that we would want to review separately.

}
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/

package org.elasticsearch.xpack.esql.core.querydsl.query;

import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.xpack.esql.core.tree.Source;

public class TranslationAwareExpressionQuery extends Query {
ChrisHegarty marked this conversation as resolved.
Show resolved Hide resolved
private final QueryBuilder queryBuilder;

public TranslationAwareExpressionQuery(Source source, QueryBuilder queryBuilder) {
super(source);
this.queryBuilder = queryBuilder;
}

@Override
public QueryBuilder asBuilder() {
return queryBuilder;
}

@Override
protected String innerToString() {
return queryBuilder.toString();
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -414,7 +414,7 @@ private static FunctionDefinition[][] functions() {
def(MvSum.class, MvSum::new, "mv_sum"),
def(Split.class, Split::new, "split") },
// fulltext functions
new FunctionDefinition[] { def(Match.class, Match::new, "match"), def(QueryString.class, QueryString::new, "qstr") } };
new FunctionDefinition[] { def(Match.class, bi(Match::new), "match"), def(QueryString.class, uni(QueryString::new), "qstr") } };

}

Expand All @@ -424,9 +424,9 @@ private static FunctionDefinition[][] snapshotFunctions() {
// The delay() function is for debug/snapshot environments only and should never be enabled in a non-snapshot build.
// This is an experimental function and can be removed without notice.
def(Delay.class, Delay::new, "delay"),
def(Kql.class, Kql::new, "kql"),
def(Kql.class, uni(Kql::new), "kql"),
def(Rate.class, Rate::withUnresolvedTimestamp, "rate"),
def(Term.class, Term::new, "term") } };
def(Term.class, bi(Term::new), "term") } };
}

public EsqlFunctionRegistry snapshotRegistry() {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -8,14 +8,21 @@
package org.elasticsearch.xpack.esql.expression.function.fulltext;

import org.apache.lucene.util.BytesRef;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.xpack.esql.core.expression.Expression;
import org.elasticsearch.xpack.esql.core.expression.Nullability;
import org.elasticsearch.xpack.esql.core.expression.TranslationAware;
import org.elasticsearch.xpack.esql.core.expression.TypeResolutions;
import org.elasticsearch.xpack.esql.core.expression.function.Function;
import org.elasticsearch.xpack.esql.core.planner.ExpressionTranslator;
import org.elasticsearch.xpack.esql.core.planner.TranslatorHandler;
import org.elasticsearch.xpack.esql.core.querydsl.query.Query;
import org.elasticsearch.xpack.esql.core.querydsl.query.TranslationAwareExpressionQuery;
import org.elasticsearch.xpack.esql.core.tree.Source;
import org.elasticsearch.xpack.esql.core.type.DataType;

import java.util.List;
import java.util.Objects;

import static org.elasticsearch.xpack.esql.core.expression.TypeResolutions.ParamOrdinal.DEFAULT;
import static org.elasticsearch.xpack.esql.core.expression.TypeResolutions.isNotNullAndFoldable;
Expand All @@ -26,13 +33,15 @@
* These functions needs to be pushed down to Lucene queries to be executed - there's no Evaluator for them, but depend on
* {@link org.elasticsearch.xpack.esql.optimizer.LocalPhysicalPlanOptimizer} to rewrite them into Lucene queries.
*/
public abstract class FullTextFunction extends Function {
public abstract class FullTextFunction extends Function implements TranslationAware {

private final Expression query;
private final QueryBuilder queryBuilder;

protected FullTextFunction(Source source, Expression query, List<Expression> children) {
protected FullTextFunction(Source source, Expression query, List<Expression> children, QueryBuilder queryBuilder) {
super(source, children);
this.query = query;
this.queryBuilder = queryBuilder;
ChrisHegarty marked this conversation as resolved.
Show resolved Hide resolved
}

@Override
Expand Down Expand Up @@ -116,4 +125,37 @@ public Nullability nullable() {
public String functionType() {
return "function";
}

@Override
public int hashCode() {
return Objects.hash(super.hashCode(), queryBuilder);
}

@Override
public boolean equals(Object obj) {
if (false == super.equals(obj)) {
return false;
}

return Objects.equals(queryBuilder, ((FullTextFunction) obj).queryBuilder);
}

@SuppressWarnings("rawtypes")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a @Override annotation here.

public Query asQuery(TranslatorHandler translatorHandler) {
if (queryBuilder != null) {
return new TranslationAwareExpressionQuery(source(), queryBuilder);
}

ExpressionTranslator translator = translator();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ExpressionTranslator<? extends FullTextFunction> is used here, the @SuppressWarnings("rawtypes") can go away. The same applies to the translator() of the five FullTextFunctions.

return translator.translate(this, translatorHandler);
}

public QueryBuilder queryBuilder() {
return queryBuilder;
}

@SuppressWarnings("rawtypes")
protected abstract ExpressionTranslator translator();

public abstract Expression replaceQueryBuilder(QueryBuilder queryBuilder);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not used yet - but this is what we will call for the query rewrite phase - see #118106

}
Original file line number Diff line number Diff line change
Expand Up @@ -7,16 +7,20 @@

package org.elasticsearch.xpack.esql.expression.function.fulltext;

import org.elasticsearch.TransportVersions;
import org.elasticsearch.common.io.stream.NamedWriteableRegistry;
import org.elasticsearch.common.io.stream.StreamInput;
import org.elasticsearch.common.io.stream.StreamOutput;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.xpack.esql.core.expression.Expression;
import org.elasticsearch.xpack.esql.core.planner.ExpressionTranslator;
import org.elasticsearch.xpack.esql.core.tree.NodeInfo;
import org.elasticsearch.xpack.esql.core.tree.Source;
import org.elasticsearch.xpack.esql.expression.function.Example;
import org.elasticsearch.xpack.esql.expression.function.FunctionInfo;
import org.elasticsearch.xpack.esql.expression.function.Param;
import org.elasticsearch.xpack.esql.io.stream.PlanStreamInput;
import org.elasticsearch.xpack.esql.planner.EsqlExpressionTranslators;
import org.elasticsearch.xpack.esql.querydsl.query.KqlQuery;

import java.io.IOException;
Expand All @@ -26,7 +30,11 @@
* Full text function that performs a {@link KqlQuery} .
*/
public class Kql extends FullTextFunction {
public static final NamedWriteableRegistry.Entry ENTRY = new NamedWriteableRegistry.Entry(Expression.class, "Kql", Kql::new);
public static final NamedWriteableRegistry.Entry ENTRY = new NamedWriteableRegistry.Entry(Expression.class, "Kql", Kql::readFrom);

public Kql(Source source, Expression queryString, QueryBuilder queryBuilder) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be private? And it looks better(to me) if this is after the constructor with FunctionInfo. The same applies to the other three - Match, QueryString and Term.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made them private and then the EsqlNodeSubclassTests started failing because AFAICT they expect that the argument number from the longest public constructor to correlate with the number of args from NodeInfo.
This to me seems to be a quirk of the EsqlNodeSubclassTests rather than a test catching a real bug with FullTextFunctions

super(source, queryString, List.of(queryString), queryBuilder);
}

@FunctionInfo(
returnType = "boolean",
Expand All @@ -42,17 +50,26 @@ public Kql(
description = "Query string in KQL query string format."
) Expression queryString
) {
super(source, queryString, List.of(queryString));
super(source, queryString, List.of(queryString), null);
}

private Kql(StreamInput in) throws IOException {
this(Source.readFrom((PlanStreamInput) in), in.readNamedWriteable(Expression.class));
private static Kql readFrom(StreamInput in) throws IOException {
Source source = Source.readFrom((PlanStreamInput) in);
Expression query = in.readNamedWriteable(Expression.class);
QueryBuilder queryBuilder = null;
if (in.getTransportVersion().onOrAfter(TransportVersions.ESQL_QUERY_BUILDER_IN_SEARCH_FUNCTIONS)) {
queryBuilder = in.readOptionalNamedWriteable(QueryBuilder.class);
}
return new Kql(source, query, queryBuilder);
}

@Override
public void writeTo(StreamOutput out) throws IOException {
source().writeTo(out);
out.writeNamedWriteable(query());
if (out.getTransportVersion().onOrAfter(TransportVersions.ESQL_QUERY_BUILDER_IN_SEARCH_FUNCTIONS)) {
out.writeOptionalNamedWriteable(queryBuilder());
}
}

@Override
Expand All @@ -62,12 +79,22 @@ public String getWriteableName() {

@Override
public Expression replaceChildren(List<Expression> newChildren) {
return new Kql(source(), newChildren.get(0));
return new Kql(source(), newChildren.get(0), queryBuilder());
}

@Override
protected NodeInfo<? extends Expression> info() {
return NodeInfo.create(this, Kql::new, query());
return NodeInfo.create(this, Kql::new, query(), queryBuilder());
}

@SuppressWarnings("rawtypes")
@Override
protected ExpressionTranslator translator() {
return new EsqlExpressionTranslators.KqlFunctionTranslator();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we could move the translators from EsqlExpressionTranslators but for now I just leave them as they are - I wasn't sure we need to move them. happy to hear otherwise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My preference is to move them, but not in this PR - as a follow up.

}

@Override
public Expression replaceQueryBuilder(QueryBuilder queryBuilder) {
return new Kql(source(), query(), queryBuilder);
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -8,15 +8,18 @@
package org.elasticsearch.xpack.esql.expression.function.fulltext;

import org.apache.lucene.util.BytesRef;
import org.elasticsearch.TransportVersions;
import org.elasticsearch.common.io.stream.NamedWriteableRegistry;
import org.elasticsearch.common.io.stream.StreamInput;
import org.elasticsearch.common.io.stream.StreamOutput;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.xpack.esql.capabilities.Validatable;
import org.elasticsearch.xpack.esql.common.Failure;
import org.elasticsearch.xpack.esql.common.Failures;
import org.elasticsearch.xpack.esql.core.expression.Expression;
import org.elasticsearch.xpack.esql.core.expression.FieldAttribute;
import org.elasticsearch.xpack.esql.core.expression.TypeResolutions;
import org.elasticsearch.xpack.esql.core.planner.ExpressionTranslator;
import org.elasticsearch.xpack.esql.core.querydsl.query.QueryStringQuery;
import org.elasticsearch.xpack.esql.core.tree.NodeInfo;
import org.elasticsearch.xpack.esql.core.tree.Source;
Expand All @@ -27,6 +30,7 @@
import org.elasticsearch.xpack.esql.expression.function.Param;
import org.elasticsearch.xpack.esql.expression.function.scalar.convert.AbstractConvertFunction;
import org.elasticsearch.xpack.esql.io.stream.PlanStreamInput;
import org.elasticsearch.xpack.esql.planner.EsqlExpressionTranslators;
import org.elasticsearch.xpack.esql.type.EsqlDataTypeConverter;

import java.io.IOException;
Expand Down Expand Up @@ -89,6 +93,11 @@ public class Match extends FullTextFunction implements Validatable {
VERSION
);

public Match(Source source, Expression field, Expression matchQuery, QueryBuilder queryBuilder) {
super(source, matchQuery, List.of(field, matchQuery), queryBuilder);
this.field = field;
}

@FunctionInfo(
returnType = "boolean",
preview = true,
Expand All @@ -109,22 +118,28 @@ public Match(
description = "Value to find in the provided field."
) Expression matchQuery
) {
super(source, matchQuery, List.of(field, matchQuery));
this.field = field;
this(source, field, matchQuery, null);
}

private static Match readFrom(StreamInput in) throws IOException {
Source source = Source.readFrom((PlanStreamInput) in);
Expression field = in.readNamedWriteable(Expression.class);
Expression query = in.readNamedWriteable(Expression.class);
return new Match(source, field, query);
QueryBuilder queryBuilder = null;
if (in.getTransportVersion().onOrAfter(TransportVersions.ESQL_QUERY_BUILDER_IN_SEARCH_FUNCTIONS)) {
queryBuilder = in.readOptionalNamedWriteable(QueryBuilder.class);
}
return new Match(source, field, query, queryBuilder);
}

@Override
public void writeTo(StreamOutput out) throws IOException {
source().writeTo(out);
out.writeNamedWriteable(field());
out.writeNamedWriteable(query());
if (out.getTransportVersion().onOrAfter(TransportVersions.ESQL_QUERY_BUILDER_IN_SEARCH_FUNCTIONS)) {
out.writeOptionalNamedWriteable(queryBuilder());
}
}

@Override
Expand Down Expand Up @@ -224,12 +239,12 @@ public Object queryAsObject() {

@Override
public Expression replaceChildren(List<Expression> newChildren) {
return new Match(source(), newChildren.get(0), newChildren.get(1));
return new Match(source(), newChildren.get(0), newChildren.get(1), queryBuilder());
}

@Override
protected NodeInfo<? extends Expression> info() {
return NodeInfo.create(this, Match::new, field, query());
return NodeInfo.create(this, Match::new, field, query(), queryBuilder());
}

protected TypeResolutions.ParamOrdinal queryParamOrdinal() {
Expand All @@ -245,6 +260,17 @@ public String functionType() {
return isOperator() ? "operator" : super.functionType();
}

@SuppressWarnings("rawtypes")
@Override
protected ExpressionTranslator translator() {
return new EsqlExpressionTranslators.MatchFunctionTranslator();
}

@Override
public Expression replaceQueryBuilder(QueryBuilder queryBuilder) {
return new Match(source(), field, query(), queryBuilder);
}

@Override
public String functionName() {
return isOperator() ? ":" : super.functionName();
Expand Down
Loading