-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] Filtering over nested arrays of scalars #182
[RFC] Filtering over nested arrays of scalars #182
Conversation
abb7334
to
e63929c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! I appreciate the new comparison target idea since that makes disambiguating easier for the connector. I agree that this should only apply to arrays of scalars.
There is a problem with this approach. It doesn't allow use of logical operators beyond the new ArrayComparison::ExistsInNestedArray boundary. So, for example, if the following data existed:
and I wanted to ask the following question:
query {
Customer(where: { nested_numbers: { inner: { _and: [ { _eq: 1 }, { _eq: 2 } ] } } }) {
id
}
} I couldn't because there's no way to nest logical operators inside a ArrayComparison::ExistsInNestedArray. collection: customers
query:
fields:
id:
type: column
column: id
predicate:
type: array_comparison
column:
type: column
name: nested_numbers
path: []
comparison:
type: exists_in_nested_array
nested_comparison:
type: and # This does not exist
comparisons:
- type: exists_binary
operator: eq
value:
type: literal
value: 1
- type: exists_binary
operator: eq
value:
type: literal
value: 2
arguments: {}
collection_relationships: {} Perhaps a simple solution to this, that doesn't require another level of "And", "Or", "Not" inside enum ArrayComparison {
...
ExistsInNestedArray {
nested_comparisons: Vec<ArrayComparison>
}
} This would allow us to do logical conjunction (ie "and") within the same exists scope. For disjunction, my understanding is that we can always lift the logical "or" to the top of the expression. Here's me asking ChatGPT about it: https://chatgpt.com/share/66fa38a6-c7a4-8000-a290-6af0aaa258a5 (its answer about "Existential Quantification and Conjunction (∃ and ∧)" is revealing) |
Only if you want to perform DNF translation or something else whenever an OR moves past an AND. |
I had a thought - we had said that it would be helpful if pub enum Expression {
// ...
ExistsInArray {
column: ComparisonTarget,
predicate: Option<Box<Expression>>,
},
}
pub enum ComparisonTarget {
// ...
ContextuallyRelevantValue, // unit variant
} In that formulation given the GraphQL query, query {
Customer(where: { nested_numbers: { inner: { _and: [ { _eq: 1 }, { _eq: 2 } ] } } }) {
id
}
} I'm imagining an NDC query like this, collection: customers
query:
fields:
id:
type: column
column: id
predicate:
type: exists_in_array
column:
type: column
name: nested_numbers
path: []
predicate:
type: exists_in_array
column: { type: "contextually_relevant_value" }
predicate:
type: and
expressions:
- type: binary_comparison_operator
column: { type: "contextually_relevant_value" }
operator: eq
value:
type: literal
value: 1
- type: binary_comparison_operator
column: { type: "contextually_relevant_value" }
operator: eq
value:
type: literal
value: 2
arguments: {}
collection_relationships: {} But @daniel-chambers I think I disagree with how the expressions apply in that query. I think that GraphQL query asks if there exists one element that is equal to 1 and also equal to 2 so I expect that query wouldn't return any rows. To filter where the inner array contains possibly-different elements where one is equal to 1 and one is equal to 2 I think the query {
Customer(where: { nested_numbers: { _and: [{ inner: { _eq: 1 } }, { inner: { _eq: 2 } }] } }) {
id
}
} But either way I see your point that we need a change to allow boolean conjunctions. So my idea might lead to some possible nonsensical expressions. But we can avoid problems by not producing such expressions. I think it would be better to change the |
Along the same lines as pub enum ExistsInCollection {
Related {..},
Unrelated {..},
NestedCollection {..},
NestedScalarCollection {
column_name: FieldName,
arguments: BTreeMap<ArgumentName, Argument>,
field_path: Vec<FieldName>,
}
|
@paf31 Does Edit: I guess why wouldn't those cases work? Sounds good. |
EDIT: Actually, I think you're right. Sorry!! I think to do what I wanted above you can't use two nested exists, you need an exists and contains (eg I think @paf31's latest idea seems reasonable... it certainly follows an existing pattern we use for functions (a virtual column). I think I prefer it to extending I'll update the RFC with the latest idea. |
What we have lost, however, is the I've updated the RFC with these ideas. |
6d90ace
to
36c3e5a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
Do we have an idea what GraphQL expressions that map to ArrayComparisons will look like? Like will contains
and is_empty
be reserved operator names that supersede scalar comparisons?
@hallettj Check out the v3-engine RFC for more information about graphql shape. As for reserved names, how they are named in graphql is up to how you configure the naming in OpenDD. If the connector defines both and their names clash, you will need to rename one or the other. |
9a5a95f
to
b4b06fa
Compare
b4b06fa
to
3eff113
Compare
78ee304
to
4787e85
Compare
|
||
```rust,no_run,noplayground | ||
{{#include ../../../../../ndc-reference/bin/reference/main.rs:eval_expression_exists}} | ||
``` | ||
|
||
Note in particular, we push the current row onto the stack of `scopes` before executing the inner query, so that [references to columns in those scopes](../../../specification/queries/filtering.md#referencing-a-column-from-a-collection-in-scope) can be resolved correctly. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use subheadings here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done :)
@@ -0,0 +1,280 @@ | |||
/* Globals */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deliberate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I widened the width of content in the ndc-spec documentation website by 150px to help give a bit more space to accommodate the capabilities tables
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good thanks.
Just so I understand the overlap (contains and is_empty can be expressed using exists-in-scalar-array), I assume this is meant to future-proof things for connectors which can implement one but not the other? Do we have such a connector in mind or is it speculative?
@paf31 ClickHouse was the connector I had in mind, because I remember @BenoitRanque saying that arbitrary existential quantification across nested arrays wasn't going to be possible. |
50711f4
to
01a8f11
Compare
ad09a0b
to
685841d
Compare
Note: This PR is currently stacked on #187 and will be rebased once that PR has merged.Rebased on main.RFC Rendered
This PR also makes some minor quality of life improvements:
reference/types.md
ndc-test
if appropriatespecification/capabilities.md
in the specificationndc-test