-
Notifications
You must be signed in to change notification settings - Fork 138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Nested Function Support In SELECT Clause #1490
Add Nested Function Support In SELECT Clause #1490
Conversation
Codecov Report
📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more @@ Coverage Diff @@
## main #1490 +/- ##
============================================
- Coverage 98.49% 97.18% -1.31%
- Complexity 3928 4100 +172
============================================
Files 347 371 +24
Lines 9771 10448 +677
Branches 645 703 +58
============================================
+ Hits 9624 10154 +530
- Misses 142 287 +145
- Partials 5 7 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 27 files with indirect coverage changes Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
sql/src/main/java/org/opensearch/sql/sql/parser/AstExpressionBuilder.java
Outdated
Show resolved
Hide resolved
for (UnresolvedExpression expr : node.getProjectList()) { | ||
NestedAnalyzer nestedAnalyzer = new NestedAnalyzer( | ||
namedExpressions, expressionAnalyzer, child | ||
); | ||
child = nestedAnalyzer.analyze(expr, context); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we generate a single LogicalNested
here rather than get them merged later in MergeNestedAndNested
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will update to create the single LogicalNested here rather than merging further down query execution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I'm just thinking if we have all the project items and always merge multiple LogicalNested later, we may be able to save that optimizer rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checked the latest revision and wondering if the for loop is needed? Because we generate single nested operator with all namedExpressions
right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes the for loop is still needed. The namedExpressions
are passed in each time for project push down, but we are fulfilling the LogicalNested
nested fields by each nested
Function
in the projectList
. Both namedExpressions
and nested functions are required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using for-loop inside NesteAnalyzer, and change interface?
child = nestedAnalyzer.analyze(node.getProjectList(), context);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be my preference to keep the for-loop as is so we don't have to complicate adding additional arguments to an already formed LogicalPlan
. If we move the for-loop into the NestedAnalyzer
we will have to do some not-so-nice logic in NestedAnalyzer:Analyze to aggregate the arguments from the analyzed LogicalPlan
's.
NestedAnalyzer.java#L36-L74
If you would prefer this implementation I can make the necessary revisions!
core/src/main/java/org/opensearch/sql/expression/nested/NestedFunction.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/opensearch/sql/planner/physical/UnnestOperator.java
Outdated
Show resolved
Hide resolved
Meanwhile could you double check if the logical and physical operator is generic and extensible enough:
|
e8114bd
to
8526ef1
Compare
I think the physical operator should work well with exposing functionality in PPL. The functionality is generic and should be able to handle Struct and Arrays fields from other datasources. More investigation will be needed later but the code is generic that porting shouldn't be too difficult. |
core/src/main/java/org/opensearch/sql/planner/logical/LogicalNested.java
Show resolved
Hide resolved
core/src/main/java/org/opensearch/sql/planner/logical/LogicalPlanNodeVisitor.java
Outdated
Show resolved
Hide resolved
opensearch/src/main/java/org/opensearch/sql/opensearch/response/OpenSearchResponse.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/opensearch/sql/expression/function/OpenSearchFunctions.java
Show resolved
Hide resolved
for (UnresolvedExpression expr : node.getProjectList()) { | ||
NestedAnalyzer nestedAnalyzer = new NestedAnalyzer( | ||
namedExpressions, expressionAnalyzer, child | ||
); | ||
child = nestedAnalyzer.analyze(expr, context); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checked the latest revision and wondering if the for loop is needed? Because we generate single nested operator with all namedExpressions
right?
core/src/main/java/org/opensearch/sql/planner/physical/UnnestOperator.java
Outdated
Show resolved
Hide resolved
docs/user/dql/functions.rst
Outdated
``nested(field | [field, path])`` | ||
|
||
The ``nested`` function maps to the ``nested`` query used in search engine. It returns nested field types in documents that match the provided specified field(s). | ||
If the user does not provide the ``path`` parameter it will be generated dynamically. The ``field`` ``user.office.cubicle`` would dynamically generate the path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is user.office.cubicle? is it an example sub field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this is just an example field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i see, i feel it miss context info, could you add, for example, The field
user.office.cubicle
would dynamically generate the path
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I've updated the comment to be more descriptive.
for (UnresolvedExpression expr : node.getProjectList()) { | ||
NestedAnalyzer nestedAnalyzer = new NestedAnalyzer( | ||
namedExpressions, expressionAnalyzer, child | ||
); | ||
child = nestedAnalyzer.analyze(expr, context); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using for-loop inside NesteAnalyzer, and change interface?
child = nestedAnalyzer.analyze(node.getProjectList(), context);
core/src/main/java/org/opensearch/sql/planner/logical/LogicalNested.java
Show resolved
Hide resolved
core/src/test/java/org/opensearch/sql/planner/physical/NestedOperatorTest.java
Show resolved
Hide resolved
import org.junit.jupiter.api.Disabled; | ||
import org.opensearch.sql.legacy.SQLIntegTestCase; | ||
|
||
public class NestedIT extends SQLIntegTestCase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: Are nested functions limited to being used only in the select clause, or can they also be used in the where or group by clause? what is the expecuted result if nested function been used in where / group by clause?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is only to support nested in the SELECT
clause. Follow up PR's will be made for GROUP BY
and WHERE
clauses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe: rename to NestedInSelectIT
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-1490-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 fbc72a41ef66d33c4650171260ddf8c8702903ca
# Push it to GitHub
git push --set-upstream origin backport/backport-1490-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x Then, create a pull request where the |
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.7 2.7
# Navigate to the new working tree
cd .worktrees/backport-2.7
# Create a new branch
git switch --create backport/backport-1490-to-2.7
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 fbc72a41ef66d33c4650171260ddf8c8702903ca
# Push it to GitHub
git push --set-upstream origin backport/backport-1490-to-2.7
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.7 Then, create a pull request where the |
Users can now query nested fields in an index. --------- Signed-off-by: forestmvey <forestv@bitquilltech.com> (cherry picked from commit fbc72a4)
Users can now query nested fields in an index. --------- Signed-off-by: forestmvey <forestv@bitquilltech.com> (cherry picked from commit fbc72a4) Signed-off-by: forestmvey <forestv@bitquilltech.com>
…roject#1490) Users can now query nested fields in an index. --------- Signed-off-by: forestmvey <forestv@bitquilltech.com>
Users can now query nested fields in an index. --------- Signed-off-by: forestmvey <forestv@bitquilltech.com> (cherry picked from commit fbc72a4)
Description
Syntax:
nested( [field] | [field,path] )
Nested function allows the query of nested types of an index. The
path
parameter is determined dynamically if not provided by the user, and the output query should be identical assuming the user input the correct parameter values. Thecondition
parameter is not supported when the nested function is used in theSELECT
clause. Nested types structure is flattened making the full path of an object the key, and the object it refers to the value. Wildcard support will be added in a follow up PR along with Support for push down with relevance based search functions.Example Queries
Simple nested query, array of
message.info
is flattened into two rows.SELECT nested(message.info) FROM nested_objects;
After flattening notice
comment.data
has repeatingab
SELECT nested(message.info), nested(comment.data) FROM nested_objects;
someField is of type object
SELECT nested(message.info), someField FROM nested_objects;
To Do
Changes from Legacy Functionality
condition
parameter in SELECT clause.partiql.rst
to handle object arrays same as V2 engine.FIELD
as the column identifier. V2 engine uses function name.Issues Resolved
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.