Re-enqueue 429 requests if there are multiple query-schedulers #5496
Conversation
pkg/frontend/v2/frontend.go
Outdated
@@ -192,6 +194,8 @@ func (f *Frontend) RoundTripGRPC(ctx context.Context, req *httpgrpc.HTTPRequest)
 		// even if this goroutine goes away due to client context cancellation.
 		enqueue:  make(chan enqueueResult, 1),
 		response: make(chan *frontendv2pb.QueryResultRequest, 1),
+
+		retryOnTooManyOutstandingRequests: f.schedulerWorkers.getWorkersCount() > 0,
If multiple query-schedulers all have full queues, couldn't the retry make queueing even worse? A request would keep retrying until it is no longer discarded by the queue.
I think the retry can still be useful. Let's add a feature flag?
`retries := f.cfg.WorkerConcurrency + 1` limits the maximum number of retries.
A feature flag sounds good.
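The bounded-retry idea discussed above can be sketched as follows. This is a minimal illustration, not the actual Cortex implementation: the function name, signature, and the `enqueue` callback are hypothetical; only the `workerConcurrency + 1` bound comes from the thread.

```go
package main

import "fmt"

// trySchedulers sketches the bounded-retry idea from this thread: a request
// rejected with 429 is re-enqueued, but at most workerConcurrency+1 attempts
// are made in total, so a full cluster cannot cause an infinite retry loop.
// Names here are illustrative, not the real Cortex code.
func trySchedulers(workerConcurrency int, enqueue func(attempt int) bool) bool {
	retries := workerConcurrency + 1 // the upper bound quoted in the review
	for attempt := 0; attempt < retries; attempt++ {
		if enqueue(attempt) {
			return true // a scheduler accepted the request
		}
	}
	return false // retry budget exhausted; surface the 429 to the caller
}

func main() {
	// Succeeds on the third attempt, within a budget of 4 (concurrency 3).
	fmt.Println(trySchedulers(3, func(attempt int) bool { return attempt == 2 }))
}
```

With the budget tied to worker concurrency, the worst case is one extra attempt per scheduler connection rather than unbounded requeueing.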
@damnever Can you fix DCO?
Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>
Thanks!
…xproject#5496)

* Re-enqueue 429 requests if there are multiple query-schedulers
* Add feature flag

Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>
What this PR does:
The current implementation cannot ensure that one tenant's requests are distributed evenly across all schedulers. As a result, some scheduler queues may be full while others still have room left.
This PR retries requests rejected with 429 against a randomly chosen scheduler when there are multiple query-schedulers. A more sophisticated approach may be needed to address the imbalance fully.
Checklist
- CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]