diff --git a/docs/dev/sql-nested-function-select-clause.md b/docs/dev/sql-nested-function-select-clause.md
index 561a93d15d..a3d1668238 100644
--- a/docs/dev/sql-nested-function-select-clause.md
+++ b/docs/dev/sql-nested-function-select-clause.md
@@ -81,7 +81,7 @@ Most basic example from mapping to response from SQL plugin.
 }
 ```
 
-A basic nested function in the SELECT clause and output DSL pushed to OpenSearch.
+A basic nested function in the SELECT clause and output DSL pushed to OpenSearch. This example queries the `nested` object `message` and its inner field `info` to return all matching inner field values.
 - `SELECT nested(message.info, message) FROM nested_objects;`
 ```json
 {
@@ -141,7 +141,7 @@ A basic nested function in the SELECT clause and output DSL pushed to OpenSearch
 }
 ```
 
-Example with multiple SELECT clause function calls sharing same path. Queries sharing same path will be added to the same inner hits query for OpenSearch DSL push down.
+Example with multiple SELECT clause function calls sharing the same path. These two function calls share the same path and are added to the same inner hits query in the DSL pushed to OpenSearch.
 - `SELECT nested(message.info, message), nested(message.author, message) FROM nested_objects;`
 ```json
 {
@@ -204,7 +204,7 @@ Example with multiple SELECT clause function calls sharing same path. Queries sh
 }
 ```
 
-An example with multiple nested function calls in SELECT clause using having differing path values. A separate nested query will be created for each path used within the SQL query.
+An example with multiple nested function calls in the SELECT clause that have differing path values. A separate nested query is created for each path used within the SQL query.
 - `SELECT nested(message.info, message), nested(comment.data, comment) FROM nested_objects;`
 ```json
 {
@@ -298,35 +298,32 @@ An example with multiple nested function calls in SELECT clause using having dif
 
 ## 2 Architecture Diagrams
 ### 2.1 Composite States for Nested Query Execution
-Nested function state diagram illustrating states in SQL plugin for push down execution. The nested operator stays in the `PhysicalPlan` after push down for flattening operation in post-processing. See section [2.3](#24-select-clause-nested-query-class-diagram) for flattening sequence and description.
+Nested function state diagram illustrating the states in the SQL plugin for push down execution. The nested operator stays in the `Physical Plan Tree` after push down to perform the flattening operation in post-processing. See section [2.3](#23-sequence-diagram-for-nested-select-clause-post-processing) for the flattening sequence and description.
 
 ```mermaid
 stateDiagram-v2
 direction LR
 LogicalPlan --> OptimizedLogicalPlan: Optimize
 OptimizedLogicalPlan --> PhysicalPlan:push down
-note right of PhysicalPlan
-  Note: NestedOperator stays in PhysicalPlan\nafter push down for post-processing.
-  end note
 
 state "Logical Plan Tree" as LogicalPlan
 state LogicalPlan {
-  logState1: Project
-  logState2: Nested
+  logState1: LogicalProject
+  logState2: LogicalNested
   logState3: ...
   logState1 --> logState2
   logState2 --> logState3
-  logState3 --> Relation
+  logState3 --> LogicalRelation
 }
 
 state "Optimized Logical Plan Tree" as OptimizedLogicalPlan
 state OptimizedLogicalPlan {
-  optState1: Project
-  optState2: Nested
+  optState1: LogicalProject
+  optState2: LogicalNested
 
   optState1 --> optState2
-  optState2 --> IndexScanBuilder
+  optState2 --> OpenSearchIndexScanBuilder
 }
 
 state "Physical Plan Tree" as PhysicalPlan
@@ -335,7 +332,7 @@ direction LR
   phyState2: NestedOperator
 
   phyState1 --> phyState2
-  phyState2 --> IndexScan
+  phyState2 --> OpenSearchIndexScan
 }
 ```
 
@@ -385,21 +382,32 @@ QueryService-->>-SQLService:PhysicalPlan
 ```
 
 ### 2.3 Sequence Diagram for Nested SELECT Clause Post-processing
-Nested function sequence diagram illustrating the flattening of the OpenSearch response. Flattening the response from OpenSearch changes the nested types structure by making the full path of an object the key, and the object it refers to the value.
+Nested function sequence diagram illustrating the flattening of the OpenSearch response. Flattening the response from OpenSearch changes the structure of nested types by making the full path of an object the key and the object it refers to the value. In addition, when a user selects multiple nested fields with differing path values, the results are cross joined. The sample input and output show the flattened output keys and the cross join.
 **Sample input:**
 ```json
 {
   "comments": {
-    "likes": 2
-  }
+    "data": "abc"
+  },
+  "message": [
+    { "info": "letter1" },
+    { "info": "letter2" }
+  ]
 }
 ```
 **Sample Output:**
 ```json
-{
-  "comment.likes": 2
-}
+[
+  [
+    { "comment.data": "abc" },
+    { "message.info": "letter1" }
+  ],
+  [
+    { "comment.data": "abc" },
+    { "message.info": "letter2" }
+  ]
+]
 ```
 
 ```mermaid
@@ -417,7 +425,7 @@ OpenSearchExecutionEngine->>+ProjectOperator:next
 ProjectOperator-->>-OpenSearchExecutionEngine:ExprValue
 ```
 
-#### 2.4 Select Clause Nested Query Class Diagram
+### 2.4 Select Clause Nested Query Class Diagram
 Nested function class diagram for additional classes required for query execution. The `NestedAnalyzer` is a visitor for nested functions used in the SELECT clause to fulfill the `LogicalNested` LogicalPlan. After push down is successful the `NestedOperator` PhysicalPlan is used for object flattening of the OpenSearch response.
 
 ```mermaid
diff --git a/docs/dev/sql-nested-function.md b/docs/dev/sql-nested-function.md
index f8369dd84a..1ab9c6e436 100644
--- a/docs/dev/sql-nested-function.md
+++ b/docs/dev/sql-nested-function.md
@@ -1,10 +1,6 @@
 ## Description
-The nested function in SQL and PPL maps to the nested query DSL in the OpenSearch query engine. A nested query is used to search nested object field types in an index. If an object matches the search, the nested query returns the root parent document. Nested inner objects are returned as inner hits in the query result.
-
-When data is mapped as object type and stored in arrays, the inner objects are stored in a flattened form making it impossible to perform queries with isolation on the inner objects. Users may want to store their data using the `nested` object field type in order to avoid array data being stored in a flattened form and to query individual indexes of an array. Using the nested function with data stored as `nested` object field type allows users to query inner objects with isolation.
-
-Please refer to the documentation page for `nested` object field types for a more in-depth view of how this type works in OpenSearch.
+The nested function in SQL and PPL maps to the nested query DSL in the OpenSearch query engine. A nested query is used to search nested object field types in an index. If an object matches the search, the nested query returns the root parent document. Nested inner objects are returned as inner hits in the query result. Using the nested function with data stored as the `nested` object field type allows users to query inner objects with isolation. Please refer to the documentation page for `nested` object field types for a more in-depth view of how this type works in OpenSearch.
 [2.7 OpenSearch Nested Field Types](https://opensearch.org/docs/2.7/field-types/nested/)
 
 ## Table Of Contents
 
@@ -23,8 +19,8 @@ Please refer to the documentation page for `nested` object field types for a mor
 
 ## 1 Overview
 ### 1.1 Problem Statement
-**1. The V2 engine lacks legacy functionality for the nested function** -
-The `nested` function is not present in the V2 engine and must be brought forward from the legacy engine to support user queries for nested object field types in SQL.
+**1. The V2 engine lacks functionality to query nested object types in OpenSearch** -
+The `nested` function is not present in the V2 engine and is one option for users to query nested object field types in the OpenSearch SQL plugin.
 
 **2. SQL provides a better user experience to query data than DSL** -
 The SQL plugin gives users the ability to interact with their data using SQL and PPL query languages rather than the OpenSearch DSL. To query `nested` object type data in SQL and PPL we need an interface that maps to the DSL `nested` queries in OpenSearch.
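
The flattening and cross join described in section 2.3 of `sql-nested-function-select-clause.md` can be sketched in Python as follows. This is a minimal illustration, not the plugin's `NestedOperator` implementation. The function names `flatten_field` and `flatten_and_cross_join` are hypothetical, the sketch assumes single-level paths such as `message.info`, and the input uses the key `comment` so that it lines up with the `comment.data` output key.

```python
from itertools import product


def flatten_field(doc, field):
    """Flatten one dotted field (e.g. "message.info") into single-key dicts.

    Hypothetical helper: the full path becomes the key and the inner value
    becomes the value, one dict per inner object.
    """
    path, _, leaf = field.rpartition(".")
    parent = doc[path]
    # A nested field may hold a single object or an array of objects.
    objects = parent if isinstance(parent, list) else [parent]
    return [{field: obj[leaf]} for obj in objects if leaf in obj]


def flatten_and_cross_join(doc, fields):
    """Cross join the flattened values of fields that use differing paths."""
    per_field = [flatten_field(doc, field) for field in fields]
    return [list(row) for row in product(*per_field)]


doc = {
    "comment": {"data": "abc"},
    "message": [{"info": "letter1"}, {"info": "letter2"}],
}
rows = flatten_and_cross_join(doc, ["comment.data", "message.info"])
print(rows)
```

With one `comment` object and two `message` objects the cross join yields two rows, each pairing the single `comment.data` value with one `message.info` value, which is the shape shown in the sample output.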