feat: implement fetch new submissions for form id #33
There are some changes I suggested. I will let you review them and tell me what you think.
Also, I noticed a few other things that are missing in this PR:
- A maximum number of new form submissions that can be returned for one request. This should be implemented in the DynamoDB query operation using the `Limit` option. The `LastEvaluatedKey` value in the query operation response should also be checked, as DynamoDB will use it to tell us whether we should send follow-up requests to get all the items we requested. See example here https://github.com/cds-snc/platform-forms-client/blob/develop/lib/vault.ts#L119
  With that we would also need to communicate, through the API response, the fact that there could be more new form submissions to be retrieved.
- Related to my previous point: the logic we discussed earlier this week, where we would keep the `LastEvaluatedKey` in cache for when new requests are received so that we can respond with the next batch of new form submissions that are available for retrieval.
- Unit tests on the `/new` router code.
Co-authored-by: Clément JANIN <ninjaclem8@hotmail.fr>
@craigzour Thanks for the review! I think I've covered most of it. For the optimization on how many names we return in one call, I suggest we let the prototyping partner mess around with those endpoints before we tweak the flow. What do you think?
About the limit on the number of new form submissions being returned by this What do you think @bryan-robitaille?
Got it. I'm also not a fan of introducing caching or a "stored checkpoint" before we're sure it's necessary: 'There are only two hard things in Computer Science: cache invalidation...' However, I have no issue doing it if you both believe it's important for our testing partners.
I definitely understand your point! I think it all comes down to how we want to design API v1. If we want to keep it as is, then we should explain in the documentation what will happen if you hit the endpoint without planning to retrieve and confirm the responses right after. But even if we specify this to our users, there is still a chance you could end up getting some of the same new form submissions because of DynamoDB read consistency. We are using the default
The more I think about it, we should probably limit it to 100 at a time and always return the 100 oldest responses. This way we don't have to keep track of the I'm not overly concerned about the delay in We could also add in the documentation that there is a small chance that confirmed responses may continue to appear in the NEW endpoint for a very short period of time and that validation should be completed on their end.
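For illustration, the client-side validation mentioned here could be as small as filtering out submissions the client has already confirmed. This is only a sketch: the `NewSubmission` shape, the `dropAlreadyConfirmed` helper, and the sample names are hypothetical and not part of this PR.

```typescript
// Hypothetical shape: the thread only implies that each submission has a name.
interface NewSubmission {
  name: string;
}

// Drops submissions the client has already confirmed. These may briefly
// reappear in the NEW endpoint while the index catches up with the table.
function dropAlreadyConfirmed(
  fetched: NewSubmission[],
  confirmedNames: ReadonlySet<string>,
): NewSubmission[] {
  return fetched.filter((submission) => !confirmedNames.has(submission.name));
}

// Usage with made-up names: the first one was confirmed earlier.
const confirmed = new Set<string>(["21-53-46d2"]);
const remaining = dropAlreadyConfirmed(
  [{ name: "21-53-46d2" }, { name: "21-53-9f1a" }],
  confirmed,
);
console.log(remaining.map((submission) => submission.name));
```

Keeping the set of confirmed names on the client side avoids any server-side cache or checkpoint, which matches the preference expressed above.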
Makes sense! Though, how would you return the 100 oldest responses? Would we have to query all new responses and then sort them by creation date at the API level?
I'd have to test it out but I'm assuming the index would keep the same order as the main table since all the sort keys are identical.
I'll give it a try. I'll create 1000 items in the local DB and see how it goes.
From my tests and reading the SDK doc: from my understanding, we will not have the same Sort Key (e.g. NAME#21-53-46d2), so items will be sorted in order of UTF-8 bytes.
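To make the byte-order point concrete, here is a small sketch that sorts a few sort keys the way DynamoDB compares strings (by their UTF-8 bytes). The sample keys and the `compareUtf8` helper are made up for illustration; only the `NAME#...` format comes from this thread.

```typescript
const encoder = new TextEncoder();

// Compares two strings byte by byte using their UTF-8 encoding.
function compareUtf8(a: string, b: string): number {
  const x = encoder.encode(a);
  const y = encoder.encode(b);
  const shared = Math.min(x.length, y.length);
  for (let i = 0; i < shared; i++) {
    if (x[i] !== y[i]) return x[i] - y[i];
  }
  // If one key is a prefix of the other, the shorter one sorts first.
  return x.length - y.length;
}

// Hypothetical sort keys, listed here in an assumed creation order.
const sortKeys = ["NAME#21-53-46d2", "NAME#03-11-a0f3", "NAME#21-53-0b7c"];
const ordered = [...sortKeys].sort(compareUtf8);
console.log(ordered);
// → ["NAME#03-11-a0f3", "NAME#21-53-0b7c", "NAME#21-53-46d2"]
```

The resulting order depends only on the characters in the name, so unless the name itself encodes the creation date in a sortable way, the query order says nothing about which submissions are oldest.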
Correct! I guess the question was more about what happens to that logic when you are using a Global Secondary Index (like us) that has a different set of keys (FormID + Status), where we know that the items will be ordered by Status. Is there a third criterion that DynamoDB uses to sort items beyond that? It could have been the creation date, for example. If it does not work, then I think we could go with just a request that retrieves 100 new form submissions (we would let DynamoDB decide which submissions those are) for the experimentation phase.
TableName: "Vault",
IndexName: "StatusCreatedAt",
ExclusiveStartKey: lastEvaluatedKey ?? undefined,
Limit: limit - newFormSubmissions.length,
Can you explain to me again why we would need the evaluated key to ensure the limit works properly?
Even though this may not be useful right now, given the maximum number of new form submissions we are requesting and the way DynamoDB handles pagination:
"A single Query operation will read up to the maximum number of items set (if using the Limit parameter) or a maximum of 1 MB of data and then apply any filtering to the results using FilterExpression. If LastEvaluatedKey is present in the response, you will need to paginate the result set. For more information, see Paginating the Results in the Amazon DynamoDB Developer Guide."
I wanted to have this logic built in, in case something changes on their side or ours (we increase our maximum of returned new form submissions, or we have to request more fields for each entry).
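As a rough sketch of that loop (not the code in this PR): the query is abstracted behind a hypothetical `runQuery` callback so the example stays self-contained, and the `mayHaveMore` flag is an assumed way of telling the caller that more submissions could be available. `Limit`, `ExclusiveStartKey`, `Items` and `LastEvaluatedKey` are the real DynamoDB Query field names.

```typescript
type Key = Record<string, unknown>;

// Minimal subset of a DynamoDB Query response.
interface QueryPage<T> {
  Items?: T[];
  LastEvaluatedKey?: Key;
}

// Hypothetical stand-in for sending a Query command with these parameters.
type RunQuery<T> = (params: {
  Limit: number;
  ExclusiveStartKey?: Key;
}) => Promise<QueryPage<T>>;

async function fetchUpTo<T>(
  runQuery: RunQuery<T>,
  limit: number,
): Promise<{ newFormSubmissions: T[]; mayHaveMore: boolean }> {
  const newFormSubmissions: T[] = [];
  let lastEvaluatedKey: Key | undefined = undefined;

  do {
    // Only ask for what is still missing, resuming where the last page ended.
    const page: QueryPage<T> = await runQuery({
      Limit: limit - newFormSubmissions.length,
      ExclusiveStartKey: lastEvaluatedKey,
    });
    newFormSubmissions.push(...(page.Items ?? []));
    lastEvaluatedKey = page.LastEvaluatedKey;
  } while (lastEvaluatedKey !== undefined && newFormSubmissions.length < limit);

  // A remaining key means DynamoDB stopped before the end of the result set,
  // so there may be more items (the next page can still turn out to be empty).
  return { newFormSubmissions, mayHaveMore: lastEvaluatedKey !== undefined };
}
```

If a page is cut short by the 1 MB cap, the loop sends a follow-up query for the remainder; if the first page already satisfies the limit, it runs exactly once.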
Good work!
I tested and everything seems to work as planned.
Allows a service account to download all the new submissions for one form
e.g.
GET http://FORMS_API_DOMAIN/forms/FORM_ID/submission/new
Sample response returned: