Fix for arrays being in separated rows or returned as null #213
Signed-off-by: Guian Gumpac <guian.gumpac@improving.com>
…ues" This reverts commit cea553e. Signed-off-by: Guian Gumpac <guian.gumpac@improving.com>
…s with undesired expected results Signed-off-by: Guian Gumpac <guian.gumpac@improving.com>
@@ -85,22 +85,6 @@ public void testSelectNestedFieldItself() {
);
}

@Test
We can fix these rather than deleting them.
verifyDataRows(response, rows(new JSONArray(List.of(new JSONObject("{\"id\":1}"), new JSONObject("{\"id\":2}")))));
verifyDataRows(response, rows(new JSONArray(List.of(1,2))));
@@ -1,9 +1,9 @@
{"index":{"_id":"1"}}
{"message":[{"info":"a","author":{"name": "e", "address": {"street": "bc", "number": 1}},"dayOfWeek":1}]}
If we are updating the test data, maybe we could change the values to something more descriptive than one or two single letters.
core/src/main/java/org/opensearch/sql/expression/ReferenceExpression.java
Signed-off-by: Guian Gumpac <guian.gumpac@improving.com>
Codecov Report
@@ Coverage Diff @@
## dev-spike-nested #213 +/- ##
======================================================
- Coverage 94.94% 94.91% -0.04%
Complexity 3472 3472
======================================================
Files 361 361
Lines 9414 9417 +3
Branches 682 682
======================================================
Hits 8938 8938
- Misses 412 414 +2
- Partials 64 65 +1
ExprValue result = ExprValueUtils.collectionValue(new ArrayList<>());

for (ExprValue val: value.collectionValue()){
result.collectionValue().add(resolve(val, paths));
}
return result;
You can do this in a single statement, like
return new ExprCollectionValue(value.collectionValue().stream()
.map(val -> resolve(val, paths)).collect(Collectors.toList()));
Thanks! Fixed in a48587d
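To illustrate the idea behind the suggestion above, here is a minimal, self-contained sketch of resolving a path through an array so that all matching values stay in one row. It is not the plugin's code: `ArrayPathResolver` is a hypothetical class, and plain `List` and `Map` stand in for `ExprCollectionValue` and `ExprTupleValue`, which are assumptions made only so the example runs without the project on the classpath.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical stand-in for the resolve logic discussed in this thread. */
public class ArrayPathResolver {

  /**
   * Resolves a path (already split into segments) against a value.
   * A List plays the role of an array value, a Map the role of a tuple value.
   */
  public static Object resolve(Object value, List<String> paths) {
    if (paths.isEmpty()) {
      return value;
    }
    // Array case: resolve the same path on every element and keep
    // all results together in one list, i.e. one row.
    if (value instanceof List<?>) {
      return ((List<?>) value).stream()
          .map(element -> resolve(element, paths))
          .collect(Collectors.toList());
    }
    // Object case: step into the field named by the first segment.
    if (value instanceof Map<?, ?>) {
      Object child = ((Map<?, ?>) value).get(paths.get(0));
      return resolve(child, paths.subList(1, paths.size()));
    }
    // Scalar (or missing) value with path segments left: nothing to resolve.
    return null;
  }
}
```

With a document shaped like `{"message": [{"info": "c"}, {"info": "a"}]}`, resolving the path `message.info` in this sketch yields a single list holding both values rather than one row per element.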

@Test
public void nested_function_with_array_of_nested_field_test() {
String query = "SELECT nested(message.info), nested(comment.data) FROM " + TEST_INDEX_NESTED_TYPE;
Plugin response is
{
"schema": [
{
"name": "nested(message.info)",
"type": "keyword"
},
{
"name": "nested(comment.data)",
"type": "keyword"
}
],
"total": 5,
"datarows": [
[
"a",
"ab"
],
[
"b",
"aa"
],
[
"c",
"aa"
],
[
[
"c",
"a"
],
"ab"
],
[
[
"zz"
],
[
"aa",
"bb"
]
]
],
"size": 5,
"status": 200
}
This could be confusing: the declared data type is keyword, but the values are a mix of arrays and keywords. Imagine a user has a parser for the response; what should the parser do with such values?
You can try our JDBC driver as an example.
I think that when we work with a value which may be an array, we should always return arrays, and the value type should be array.
I believe our solution is to split the nested function for JDBC or other queries that have strict typing (really, any SQL language). For now, we should make sure that the JDBC and ODBC drivers do not output any undesired typing and break.
To ensure that typings aren't broken, we could introduce PartiQL-like syntax for nested that would make sure that only strict typing is followed on output. For example:
SELECT n.name, m1.info AS info_keyword, m2.info AS info_array FROM nested as n, n.message AS m1 IS keyword, n.message as m2 IS array
SELECT n.name, CASE(m.info IS keyword) THEN m.info END AS info_keyword, CASE(m.info IS ARRAY) THEN m.info END AS info_array FROM nested as n, n.message AS m
For now, this can be marked as 'out of scope' as the user should be responsible for proper data setup and typing.
I agree. It is a new feature reported in #1300.
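Until strict typing is available, a client that parses the response shown earlier in this thread has to tolerate cells that are sometimes a keyword and sometimes an array. The following is only a sketch of one client-side workaround, not something the plugin or the JDBC driver provides: `DataRowNormalizer` and `asList` are hypothetical names, and the cell is assumed to have already been decoded into a `String`, a `List`, or `null`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical client-side helper; not part of the plugin or its drivers. */
public class DataRowNormalizer {

  /**
   * Normalizes one datarow cell so callers always see a list:
   * a scalar becomes a single-element list, an array is copied as-is,
   * and a null cell becomes an empty list.
   */
  public static List<Object> asList(Object cell) {
    if (cell == null) {
      return new ArrayList<>();
    }
    if (cell instanceof List<?>) {
      return new ArrayList<>((List<?>) cell);
    }
    List<Object> single = new ArrayList<>();
    single.add(cell);
    return single;
  }
}
```

The trade-off is that the declared schema type (keyword) no longer describes what the client hands to its callers, which is the same mismatch the review comment raises on the server side.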
Signed-off-by: Guian Gumpac <guian.gumpac@improving.com>
A lot of CI failures, perhaps you need to rebase
I also believe you will probably have to rebase to get rid of some of the failing checks. I think you may be failing on code coverage as well.
@@ -10,11 +10,15 @@

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
Does this pass checkstyle?
import static org.opensearch.sql.util.MatcherUtils.rows;
import static org.opensearch.sql.util.MatcherUtils.verifyDataRows;

import org.json.JSONArray;
import org.json.JSONObject;
import org.junit.Test;
import org.opensearch.sql.legacy.SQLIntegTestCase;
Does this space between the imports pass checkstyle?
import org.json.JSONArray;
import org.json.JSONObject;
import org.junit.Test;
import org.opensearch.sql.legacy.SQLIntegTestCase;

import java.io.IOException;
import java.util.List;
Do these imports pass checkstyle with the extra blank line and out of alphabetical order?
@@ -0,0 +1,50 @@
package org.opensearch.sql.sql;

import org.json.JSONArray;
checkstyle passes?
@@ -100,6 +104,12 @@ public ExprValue resolve(ExprTupleValue value) {
}

private ExprValue resolve(ExprValue value, List<String> paths) {
// This case is to allow returning all values in an array to be in one row
if (value.type().equals(ExprCoreType.ARRAY)){
return new ExprCollectionValue(value.collectionValue().stream()
Looks like code coverage is complaining that there is no test for the case where the if statement is true?
I think it looks good. Just want to see more checks passing (or an explanation for why they can't pass at the moment if it's failing due to something from |
The checks are failing due to some failing tests inherited from the base branch when the POC was implemented. Fixing actions and cleaning up the POC will be part of a different task.
Closing this PR as there will be another PR towards the updated nested function POC.
Description
The main idea of the change is to support returning array values when the arrays contain objects. As a side effect, it fixes bugs with object types and PartiQL syntax in the new engine.
Old behaviour of object fields with arrays of objects:
New behaviour of object fields with arrays of objects:
Old behaviour of Nested Function with arrays of objects:
Disregard the row of nulls here, it was to force the plugin to run on the legacy engine. Note the added row and duplicated "ab" value.
New behaviour of Nested Function with arrays of objects:
Old behaviour of PartiQL syntax with arrays of objects:
New behaviour of PartiQL syntax with arrays of objects:
Test data mapping for nested_simple:
Test data for nested_simple and nested_as_objects (to load object type, do not specify the mapping for the data):
Issues Resolved
opensearch-project#1305
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.