Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Odata opaque cursor #934

Merged
merged 10 commits into from
Sep 12, 2023
Merged

Conversation

sadiqkhoja
Copy link
Contributor

@sadiqkhoja sadiqkhoja commented Jul 19, 2023

Closes #665

What has been done to verify that this works as intended?

Integration tests

Why is this the best possible solution? Were any other approaches considered?

  • Base64 of a json object is used for skipToken. Fields of the json object depends on the odata table. In case of Submission, it is instanceId, for Submission subtable (repeat) it is repeatId (hash of the data - existing approach) and for Entities it is UUID
  • I didn't compress the token, base64 encoding fulfills the requirement
  • We are still fetching all the Submissions when data for the subtable is requested because we need to count the total repeating rows and return nextlink conditionally i.e. only when there are more rows to be returned. So pagination for subtable is not much of a help, it only saves bandwidth between server and client. I wish we could get rid of those two requirements then odata feed for subtable would be efficient as well.

Caveat
Since I am using hash ID of the repeat row and if the data of that repeat item is changed then subsequent nextlink will return nothing

How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?

None

Does this change require updates to the API documentation? If so, please update docs/api.md as part of this PR.

TODO

Before submitting this PR, please make sure you have:
[ ] run make test-full and confirmed all checks still pass OR confirm CircleCI build passes
[ ] verified that any code from external sources are properly credited in comments or that everything is internally sourced

@sadiqkhoja sadiqkhoja requested a review from matthew-white July 19, 2023 20:58
const tokenData = {
instanceId: body.value[body.value.length - 1].instanceId,
}
const token = btoa(JSON.stringify(tokenData));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btoa() wouldn't work if the instance ID contained some Unicode characters. But we don't see such instance IDs in practice, so I don't think we should worry about it.

const tokenData = {
instanceId: body.value[body.value.length - 1].instanceId,
}
const token = btoa(JSON.stringify(tokenData));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about prefixing this string with a version number in case we ever want to change the implementation of the cursor? That would give us an easy way to identify new- vs. old-format cursors. I'm thinking we could just prefix it with 01 for "version 01". I can't think of why we would change the implementation, but it seems appealing to give ourselves the flexibility to do so more easily. No strong feelings!

.expect(200)
.then(({ body }) => {
const tokenData = {
instanceId: body.value[body.value.length - 1].instanceId,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The strategy of using instanceId + __id makes sense to me. 👍 How long will this token be? Should we try to compress the data like we do with app user QR codes? (See CollectQr in Frontend.)

@sadiqkhoja sadiqkhoja force-pushed the features/odata_opaque_cursor branch from 22bfe1b to d3a678c Compare July 27, 2023 20:08
@sadiqkhoja sadiqkhoja marked this pull request as ready for review July 31, 2023 17:56
@sadiqkhoja sadiqkhoja requested review from matthew-white and removed request for matthew-white July 31, 2023 17:57
@sadiqkhoja sadiqkhoja marked this pull request as draft July 31, 2023 20:05
Comment on lines 450 to 451
const skipTokenData = { instanceId: lastInstanceId };
if (isSubTable) skipTokenData.repeatId = lastRepeatId;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking to use just repeatId for subtable and remove instanceId

@sadiqkhoja sadiqkhoja marked this pull request as ready for review August 1, 2023 16:14
@matthew-white
Copy link
Member

Since I am using hash ID of the repeat row and if the data of that repeat item is changed then subsequent nextlink will return nothing

Could you say more about this? What data contributes to the hash ID? Is it all fields of the repeat group, such that any edit to the repeat group could change its ID? Or is it just the path (including index) of the repeat group?

@sadiqkhoja
Copy link
Contributor Author

Could you say more about this? What data contributes to the hash ID? Is it all fields of the repeat group, such that any edit to the repeat group could change its ID? Or is it just the path (including index) of the repeat group?

It's an existing function lib/data/odata.js:hashId. The hash is calculated based on instanceId of the Submission and the fullpath of the repeat item like Submissions#double%%children#%%child#1.

I was wrong when I said "data of the repeat item" can change the hash, it is the index that can affect the hash.

Here is an example where there will be a problem:

  • Submission has 10 repeat items
  • User requests $top=5
  • User deletes 6 repeat items
  • Next odata request will return nothing because hashId will not match

Similar problem can arise if user manually deletes Submission because skipToken is based on instanceId and if instanceId is not found then cursor is pointing nowhere. However, probability of this happening for Submissions is quite low because we don't provide any way to delete individual Submissions.

In both cases, I don't see this being a big problem, just wanted to make note of it.

@sadiqkhoja sadiqkhoja force-pushed the features/odata_opaque_cursor branch from 7b59340 to 9e03d7a Compare September 1, 2023 21:43
@matthew-white
Copy link
Member

I ended up prioritizing reviewing getodk/central-frontend#851, and I ran out of time to review this one. Sorry about that! I think the only thought I want to offer is that I think it's OK if $skiptoken isn't 100% resilient to concurrent modification. As long as it returns an error when there's a problem and doesn't return the wrong data or incorrectly return no data, I think it's OK.

@sadiqkhoja sadiqkhoja requested a review from ktuite September 6, 2023 17:00
test/integration/api/odata-entities.js Show resolved Hide resolved
test/integration/api/odata-entities.js Show resolved Hide resolved
test/integration/api/odata-entities.js Outdated Show resolved Hide resolved
test/integration/api/odata.js Outdated Show resolved Hide resolved
test/integration/api/odata.js Outdated Show resolved Hide resolved
test/integration/api/odata.js Outdated Show resolved Hide resolved
test/integration/api/odata.js Outdated Show resolved Hide resolved
test/integration/api/odata.js Outdated Show resolved Hide resolved
test/integration/api/odata.js Show resolved Hide resolved
@sadiqkhoja sadiqkhoja merged commit 211dd59 into getodk:master Sep 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add opaque cursor-based pagination
3 participants