Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes for findOne command #27

Merged
merged 20 commits into from
Jan 17, 2023
Merged

Changes for findOne command #27

merged 20 commits into from
Jan 17, 2023

Conversation

maheshrajamani
Copy link
Contributor

The changes also include for find

The changes also include for find
@maheshrajamani maheshrajamani marked this pull request as ready for review January 10, 2023 19:07
@maheshrajamani maheshrajamani requested a review from a team as a code owner January 10, 2023 19:07
Copy link
Contributor

@ivansenic ivansenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am having problem following this.. Can I get a guide @maheshrajamani

P.S. PR should not be this big.. It's almost impossible to review this: +2,628 −22..

Comment on lines +41 to +48
/**
* @return Defines the maximum limit of document that can be returned for a request, defaults to
* <code>1000</code>.
*/
@Max(Integer.MAX_VALUE)
@Positive
@WithDefault("1000")
int maxLimit();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how's maxLimit() different from maxPageSize()? If we are storing document per row, why do we need this limit?

This was different in previous versions of docs api, as we stored documents in multiple rows, thus we need to defined limis for documents getting back and limits for page size in the executed queries.. I don't see this situation here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maxPageSize is for maximum number of record that can be fetched in a query iteration, which can be paginated. maxLimit is maximum number of documents that can be delivered for a command. If a find command (yet to be implemented) reads all the document in a collection, even if it's paginated it will return only max document set as limit.

Copy link
Contributor

@ivansenic ivansenic Jan 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah so this is because of the possible in memory filtering right? 👍

But default of 1000 seems wrong, 20 is the default "page" size in many implementations

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not in that context. Let say if a query is "select * from table", maxLimit defines how many records we will return irrespective of pagination. If maxLimit is 1000 the query will be run as "select * from table limit 1000" if limit value cannot be resolved from the command.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aha, ok, but shouldn't limit always be same as page size? 😕

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It need not be. If for example the user want to show any 200 document on the screen (may be sorted). The limit will be 200. To keep the VM in check in stargate we will have max page size as 100. The driver will paginate and return 200 records.

Comment on lines +50 to +69
int remaining = rSet.getRowsCount();
int colCount = rSet.getColumnsCount();
List<ReadDocument> documents = new ArrayList<>(remaining);
Iterator<QueryOuterClass.Row> rowIterator = rSet.getRowsList().stream().iterator();
while (--remaining >= 0 && rowIterator.hasNext()) {
QueryOuterClass.Row row = rowIterator.next();
ReadDocument document = null;
try {
document =
new ReadDocument(
Values.string(row.getValues(0)), // key
Optional.of(Values.uuid(row.getValues(1))), // tx_id
readDocument
? objectMapper.readTree(Values.string(row.getValues(2)))
: null);
} catch (JsonProcessingException e) {
throw new DocsException(ErrorCode.DOCUMENT_UNPARSEABLE);
}
documents.add(document);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want something similar as we have in the docs v3 -> io.stargate.sgv2.docsapi.service.common.model.RowWrapper

Imo this is a very nice utility that helps you row columns by name.. I see you used ids here, is this what we want to do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are only max of 3 static columns read from the table as per design. I don't think we need them.

? objectMapper.readTree(Values.string(row.getValues(2)))
: null);
} catch (JsonProcessingException e) {
throw new DocsException(ErrorCode.DOCUMENT_UNPARSEABLE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this means if parsing any of the rows fails, this ends up in the exception. Is this the wanted behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

objectMapper.readTree is throwing JsonProcessingException which needs to be handled but in theory this will never happen because the json document stored is validated during insert.

@maheshrajamani
Copy link
Contributor Author

P.S. PR should not be this big.. It's almost impossible to review this:

There are 27 files out of which 10 are test class. Even with the remaining most only 8 classes has code logic in it. Not sure how to split it further.

@maheshrajamani
Copy link
Contributor Author

I am having problem following this.. Can I get a guide @maheshrajamani

Let me know if you need a code walk through on whats being done.

@maheshrajamani maheshrajamani marked this pull request as draft January 12, 2023 19:49
@maheshrajamani maheshrajamani marked this pull request as ready for review January 12, 2023 20:34
@ivansenic
Copy link
Contributor

P.S. PR should not be this big.. It's almost impossible to review this:

There are 27 files out of which 10 are test class. Even with the remaining most only 8 classes has code logic in it. Not sure how to split it further.

Not sure, but 2.6K new code lines is not something we should do in single PR in future..

Copy link
Contributor

@ivansenic ivansenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Posted some improvements.. Would be great if @amorton looks on this as well, as this is originally his code..

*/
public record FindOperation(
CommandContext commandContext,
List<DBFilterBase> filters,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I just think these should be taken outside of this class and to a separate file. They are also used outside of the class, in the resolver, so it feels better for me to have a dedicated class/package.

ErrorCode.FILTER_UNRESOLVABLE, "Options need to be returned for filterable of non findOne");
}

private Operation findDynamic(CommandContext commandContext, CaptureGroups<T> captures) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amorton @maheshrajamani This also seems implemented wrong.. Seems like you know what captures are gonna be passed to this method and thus you can go through them one by one using markers.. It's kinda defeating the whole purpose of captures..

I would propose, together with removal of the FilterMatchRules to:

  • define all the available capture groups, we have four now I would add one additional specifically for the _id, so that's handled as well
  • CaptureGroups does not have to map with a marker, but direct instances of all available groups (nullable)
  • CaptureGroups expose getters for each group, which are typed correctly Optional<CaptureGroup<String>>> getDynamicTextGroup(), etc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also not sure what's stopping us from having:

 pair ->
              filters.add(
                  new FindOperation.TextFilter(
                      pair.path(), FindOperation.MapFilterBase.Operator.EQ, pair.value())));

directly in the text capture group. If this would be the case each capture group can return list of filters. And thus CaptureGroups could even just accept list of capture groups, interate through them collect and join filters and returns this to the user of the class.. we don't even have to expose which groups we have to the outside..

Copy link
Contributor

@ivansenic ivansenic Jan 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also note that as soon as you need OR, NOT and other boolean logic, capture groups would be thrown in the garbage, as they can not handle these. Maybe some part of the impl would be usable, but what we will end up having would be something like:

  • boolean logic around ComparisonExpression
  • set of rules for simplifications (for example transform NOT(NE(5) => EQ(5)
  • set of rules that convert each of the ComparisonExpression to the DBFilter or MemoryFilter

@maheshrajamani
Copy link
Contributor Author

@ivansenic As part of today's standup we discussed about changing the resolver to not use the Match Rules etc and use of library to resole the bool query. This is something we need discussion with @amorton. So suggestion was to create an issue to revisit resolver implementation once the deadline commands are finished for the Demo. Let me know if you have other concerns on the code else we will merge it for the time.

@@ -56,17 +57,105 @@ public final void createCollection() {
@Nested
class FindOne {

@BeforeEach
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd make sense to have separate test classes for Insert, FindXxx as these classes may grow quite bit; nested groups can help a bit but maybe not enough.

Copy link
Contributor Author

@maheshrajamani maheshrajamani Jan 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this class will grow bigger with time I will split it to multiple classes. Can it wait for next PR or need it as part of this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can wait no problem.

Copy link
Contributor

@tatu-at-datastax tatu-at-datastax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok it's a big PR and I can see @ivansenic has valid concerns about FilterMatchRules (and related). But as per earlier discussions I also think that we may need to merge things as-is now and clean up at a later step, to avoid being blocked by this PR -- it is difficult to split it into smaller cleaner pieces.

So approving with expectation of issues being tackled as follow-uo steps.

@maheshrajamani
Copy link
Contributor Author

Created issue #33 and #34 to discuss and close the unresolved items. Merging this PR to proceed with other commands.

@maheshrajamani maheshrajamani merged commit 45aa7aa into main Jan 17, 2023
@maheshrajamani maheshrajamani deleted the find/resolver-operation branch January 17, 2023 20:18
@ivansenic
Copy link
Contributor

@maheshrajamani I see this is merged, but are any of my comments solved? Like that Swagger issue? Raw non parametrized types?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants