feat: Proper working cursor #2514

another-rex · 2024-08-23T02:39:38Z

The current paging/cursor implementation is slightly hacky, fails at various edge cases, and does not work at all for many query types (e.g. queries without an ecosystem, semver queries).

When it was initially implemented, the assumption was that almost all API calls resulted in a single datastore query. This is no longer the case.

This refactor/rework adds:

- A new cursor type, which:
  - can represent 3 states (Start, End, InProgress) in a type-safe way (through enums, thanks for the ideas @michaelkedar @DonggeLiu!)
  - also stores the ndb.Cursor inside
  - also stores which query number the ndb.Cursor is for when there are multiple queries
  - can serialise to a page token, and deserialise in a backward-compatible fashion
- Reworked logic to centralise cursor handling, making the code more readable and predictable:
  - Cursor state is stored in the context, rather than passed around in a tuple through return types
  - Query skips are determined next to each query.iter() call, where the datastore query actually happens, rather than in outer functions
- The context now tracks the number of datastore queries that have been performed
  - This is used to know which query a cursor belongs to
- Minor refactors to query_generic_helper() to remove unnecessary arguments
- A skipIf on an env variable, to allow actually running the pagination tests locally
- Faster tests (by lowering the sleep duration haha)
- Tests now take the unittest arguments selecting which tests to run from the second argument onwards (rather than having the gcloud service account always be the last argument)

oliverchang

nice!

gcp/api/cursor.py

andrewpollock

This is awesome

gcp/api/cursor.py

gcp/api/server.py

oliverchang · 2024-08-27T00:39:39Z

gcp/api/cursor.py

+        # a token in the response
+        return None
+
+    if self.query_number == 0:


When is this ever true, given https://github.com/google/osv.dev/pull/2514/files#r1730697012 ?

Ah that should be 1

Actually removed this entirely, there is no reason to not return this as part of the page token.
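
For readers following the query-number discussion, the sketch below shows one way a per-request context could count datastore queries and use the cursor's query number to decide which queries to skip. It assumes queries are numbered from 1, as the reply above suggests. `SketchContext`, its fields and `run()` are hypothetical names, and a list offset stands in for the ndb.Cursor position, so this is an illustration of the idea rather than the PR's QueryContext.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class SketchContext:
  """Hypothetical per-request state; not the PR's QueryContext."""
  resume_query: int = 1  # query number carried by the incoming page token
  resume_offset: int = 0  # stand-in for the ndb.Cursor position
  query_counter: int = 0  # datastore queries issued so far

  def run(self, results: List[str]) -> Iterator[str]:
    """Stand-in for wrapping a single query.iter() call."""
    self.query_counter += 1
    if self.query_counter < self.resume_query:
      # This query was fully consumed on an earlier page: skip it.
      return iter(())
    if self.query_counter == self.resume_query:
      # The cursor belongs to this query: resume from its position.
      return iter(results[self.resume_offset:])
    # Later queries always start from the beginning.
    return iter(results)
```

With `SketchContext(resume_query=2, resume_offset=1)`, the first call to `run()` yields nothing, the second resumes one item in, and any later call starts from the beginning.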

oliverchang

LGTM with some minor remaining comments. Awesome stuff!

michaelkedar · 2024-08-27T01:07:06Z

gcp/api/cursor.py

+_METADATA_SEPARATOR = ':'
+
+
+class _QueryCursorState(Enum):


nit: Since it's the only thing that uses it, this could possibly go inside of QueryCursor:

    class QueryCursor:

      class _State(Enum):
        ENDED = 0
        # ...

      _cursor_state: _State = _State.ENDED

      @property
      def ended(self) -> bool:
        return self._cursor_state == QueryCursor._State.ENDED

That said, I don't think I have a preference either way.

I think it's just a bit clearer to be private but at the top level to avoid having to write the class name repeatedly.

another-rex added 3 commits August 23, 2024 12:16

feat: Proper working cursor

cbfa609

Add cursor file

2930805

Fix documentation and remove initializer

da61019

another-rex requested review from oliverchang, andrewpollock, michaelkedar, hogo6002 and cuixq August 23, 2024 02:39

another-rex added 4 commits August 23, 2024 12:43

Perform formatting and linting

84dc430

Run formatter on the file I forgot about

911ae9c

Final format on server.py hopefully

9e875cf

Merge branch 'master' into the-cursor-rework

c7dc5ce

oliverchang reviewed Aug 26, 2024

andrewpollock reviewed Aug 26, 2024

another-rex added 6 commits August 26, 2024 15:45

Minor fix to import types

921dc1f

PR comments first pass

43fab25

Add better docstring for QueryContext

17a4305

Add a lot of documentation

85400f5

Add missing docstring

2719d3b

Add documentation to query_by_comparing_versions

4afea58

another-rex requested review from andrewpollock and oliverchang August 27, 2024 00:29

hogo6002 reviewed Aug 27, 2024

gcp/api/server.py (outdated)

oliverchang reviewed Aug 27, 2024

oliverchang approved these changes Aug 27, 2024

michaelkedar reviewed Aug 27, 2024

another-rex added 4 commits August 27, 2024 15:13

Add cursor at current function

94d0515

Add comment as to why choose that specific version

876a42f

Make linter happy

66a2aa7

Linter still not happy

a744fde

another-rex merged commit 12f33e6 into google:master Aug 28, 2024
11 checks passed
