-
Notifications
You must be signed in to change notification settings - Fork 736
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: respect forefront
option in MemoryStorage
's RequestQueue
#2681
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of the pull request and linked issue has estimate
forefront
option in RQ implementationsforefront
option in MemoryStorage
's RequestQueue
7efda8b removes the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some random comments, I mostly believe that this works thanks to the tests
packages/types/src/storages.ts
Outdated
@@ -204,6 +204,7 @@ export interface RequestQueueInfo { | |||
actRunId?: string; | |||
hadMultipleClients?: boolean; | |||
stats?: RequestQueueStats; | |||
forefrontRequestIds?: string[]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of this one? It will only be there for local storage, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this PR is only fixing the MemoryStorage
implementation. forefrontRequestIds
tracks the ordering of the forefront requests (without the need to reimplement the way everything else is handled).
See requestKeyIterator
(used in listHead
to get the first N requests) for usage:
crawlee/packages/memory-storage/src/resource-clients/request-queue.ts
Lines 156 to 164 in 169d48b
private *requestKeyIterator(rqClient: RequestQueueClient): IterableIterator<string> { | |
for (let i = this.forefrontRequestIds.length - 1; i >= 0; i--) { | |
yield this.forefrontRequestIds[i]; | |
} | |
for (const key of rqClient.requests.keys()) { | |
yield key; | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I mean... it's internal state, why do we need to expose it via getInfo()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original idea was to persist it with the other metadata, but you're right; we don't likely want to expose it.
Addressed in adce71d .
The
MemoryStorage
queue implementation doesn't respect theforefront
enqueue option. This PR fixes and tests this.This PR is necessary for solving #2669, but not sufficient - the request batch-reading in
RequestQueue.fetchNextRequest
is the other puzzle piece.