Change path-name convention shredder uses to be same as needed by filtering (no need to de-shred) #91

Closed
tatu-at-datastax opened this issue Feb 9, 2023 · 6 comments

@tatu-at-datastax
Contributor

Originally, super-shredder path names were thought to be needed to allow "de-shredding": rebuilding the source document from the shredded paths. This is why the path to value 5 in:

{ "array" : [ 5 ] }

would become array.[0], escaped so that the path is never ambiguous, even if a property name contains a . (allowed in later Mongo versions) or an opening bracket (likewise legal).

But Mongo filtering clauses use simpler, potentially ambiguous notation: array.0.

Although we could of course convert array.0 into array.[0], it makes more sense to simplify the shredded paths because:

  1. There is no need to "de-shred" (we store the full document)
  2. Avoiding conversion is not only more efficient but also removes possible error modes.
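
A minimal sketch of the simplified convention, in Python for illustration only. The `shred` helper is hypothetical and is not the project's Shredder; it only shows how plain dotted paths come out, and why the notation is potentially ambiguous: an array index and an object key named "0" produce the same path.

```python
def shred(node, prefix=""):
    """Flatten a JSON-like value into {dotted_path: atomic_value}.

    Hypothetical helper: no escaping is applied, so the paths use the
    same simple notation as Mongo-style filter clauses (e.g. "array.0").
    """
    if isinstance(node, dict):
        items = node.items()
    elif isinstance(node, list):
        items = enumerate(node)
    else:
        # Atomic value: record it under the path built so far.
        return {prefix: node}

    paths = {}
    for key, value in items:
        child = f"{prefix}.{key}" if prefix else str(key)
        paths.update(shred(value, child))
    return paths


print(shred({"array": [5]}))       # {'array.0': 5}
print(shred({"array": {"0": 5}}))  # {'array.0': 5}  (same path: the ambiguity)
```

Because the full document is stored alongside the shredded paths, this ambiguity does not matter for rebuilding the document; it only affects which paths a filter can match.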
@tatu-at-datastax tatu-at-datastax self-assigned this Feb 9, 2023
@maheshrajamani
Contributor

@tatu-at-datastax Are we going to store duplicates for arrays, in both array_equals and query_*_values?

@tatu-at-datastax
Contributor Author

@maheshrajamani This would not change any of that, but yes: the contents of arrays are duplicated by Shredder (in array_equals, array_contains, and then in one of query_xxx_values).

@maheshrajamani
Contributor

A question to explore: can an SAI index search use a wildcard for a map key? If so, we may not need array_contains.
Like query_xxx_values['array.*'] = 5
@amorton Can we check if this is a possibility?

@tatu-at-datastax
Contributor Author

@maheshrajamani That would be interesting, although it could be problematic with deeper nesting, as it would match anything with the same prefix.
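
To illustrate the prefix concern, here is a rough sketch that assumes a hypothetical glob-style wildcard on map keys (emulated with Python's `fnmatch`; this is not an existing SAI feature, and the sample paths are made up). A pattern meant to select direct array elements also picks up paths nested below them:

```python
from fnmatch import fnmatchcase

# Made-up shredded paths for a document with nested content under "array".
shredded_paths = [
    "array.0",
    "array.1",
    "array.0.name",
    "array.1.tags.0",
    "arrayOther.0",
]


def match_wildcard(paths, pattern):
    """Return the paths matching a glob-style pattern (assumed semantics)."""
    return [p for p in paths if fnmatchcase(p, pattern)]


matched = match_wildcard(shredded_paths, "array.*")
print(matched)
# ['array.0', 'array.1', 'array.0.name', 'array.1.tags.0']
```

Only the first two matches are direct elements of the array; the other two come from deeper nesting under the same prefix.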

@tatu-at-datastax
Contributor Author

@maheshrajamani I think #92 is now ready: I added a test that shows that basic dotted notation now works as expected for filtering:

              {
                "find": {
                  "filter" : {"array.0" : {"$eq" : "value1"}}
                }
              }

and ditto for sub-docs, and nesting to arbitrary levels.
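
As a rough sketch of why filtering gets simple once shredded paths use the same notation as the filter (Python for illustration; `matches_eq` and the sample data are hypothetical and not the code from #92): the filter key can be used directly as a lookup key, with no path conversion step.

```python
def matches_eq(shredded, filter_clause):
    """Check {"path": {"$eq": value}} clauses against shredded paths.

    `shredded` maps dotted paths to atomic values. Only "$eq" is handled
    in this sketch.
    """
    for path, condition in filter_clause.items():
        # The filter path is the map key as-is: no de-shredding or escaping.
        if path not in shredded or shredded[path] != condition["$eq"]:
            return False
    return True


# Made-up shredded form of a document with an array and a nested sub-document.
doc = {"array.0": "value1", "array.1": "value2", "sub.items.0.name": "x"}

print(matches_eq(doc, {"array.0": {"$eq": "value1"}}))         # True
print(matches_eq(doc, {"array.1": {"$eq": "value1"}}))         # False
print(matches_eq(doc, {"sub.items.0.name": {"$eq": "x"}}))     # True
```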

Handling of dotted notation is trickier for update operations and projections, but filtering at least is easy.

@amorton
Contributor

amorton commented Feb 20, 2023

A question to explore: can an SAI index search use a wildcard for a map key? If so, we may not need array_contains.
Like query_xxx_values['array.*'] = 5
@amorton Can we check if this is a possibility?

@maheshrajamani It would not be possible; maybe something to add later if/when we create a full JSON type in C*.
