
Re-enqueue 429 requests if there are multiple query-schedulers #5496

Merged: 2 commits merged into cortexproject:master from damnever:fix/retry-429 on Aug 9, 2023

Conversation

@damnever (Contributor) commented Aug 3, 2023

What this PR does:

The current implementation cannot ensure that one tenant's requests are evenly distributed among all schedulers. As a result, some scheduler queues may be full, while others still have some room left.

This PR retries requests rejected with 429 (too many outstanding requests) against a randomly chosen scheduler when multiple query-schedulers are running. A more sophisticated approach may be needed to address this issue effectively.
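For context, here is a minimal Go sketch of the idea described above, under a deliberately simplified model; the scheduler type, tryEnqueue, and enqueue names below are illustrative, not the actual Cortex frontend or scheduler code:

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

var errTooManyOutstanding = errors.New("too many outstanding requests")

type scheduler struct {
	queue chan string
}

// tryEnqueue rejects the request when this scheduler's queue is full,
// standing in for a scheduler answering the frontend with a 429.
func (s *scheduler) tryEnqueue(req string) error {
	select {
	case s.queue <- req:
		return nil
	default:
		return errTooManyOutstanding
	}
}

// enqueue picks a random scheduler; on a 429-style rejection it re-enqueues
// once onto another randomly chosen scheduler, but only when more than one
// scheduler is available.
func enqueue(schedulers []*scheduler, req string) error {
	err := schedulers[rand.Intn(len(schedulers))].tryEnqueue(req)
	if errors.Is(err, errTooManyOutstanding) && len(schedulers) > 1 {
		return schedulers[rand.Intn(len(schedulers))].tryEnqueue(req)
	}
	return err
}

func main() {
	schedulers := []*scheduler{
		{queue: make(chan string)},    // no capacity: always "full"
		{queue: make(chan string, 1)}, // has room
	}
	fmt.Println(enqueue(schedulers, "query-1"))
}
```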

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@@ -192,6 +194,8 @@ func (f *Frontend) RoundTripGRPC(ctx context.Context, req *httpgrpc.HTTPRequest)
// even if this goroutine goes away due to client context cancellation.
enqueue: make(chan enqueueResult, 1),
response: make(chan *frontendv2pb.QueryResultRequest, 1),

retryOnTooManyOutstandingRequests: f.schedulerWorkers.getWorkersCount() > 0,
Contributor:
If multiple query-schedulers have full queues, won't retrying make queueing even worse? A request will keep retrying until it is no longer discarded by the queue.

Contributor:

I think the retry can still be useful. Let's add a feature flag?

@damnever (author):

retries := f.cfg.WorkerConcurrency + 1 limits the maximum number of retries.

A feature flag sounds good.
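For illustration, a rough Go sketch of the bounded, flag-gated retry discussed in this thread; the config fields, frontend struct, and helper names are assumptions made for the sketch, not the actual Cortex frontend code:

```go
package main

import "errors"

var errTooManyOutstanding = errors.New("too many outstanding requests")

type config struct {
	WorkerConcurrency                 int
	RetryOnTooManyOutstandingRequests bool
}

type frontend struct {
	cfg     config
	enqueue func() error // stands in for enqueueing to a randomly chosen scheduler
}

func (f *frontend) enqueueWithBoundedRetry() error {
	retries := f.cfg.WorkerConcurrency + 1 // retry budget mentioned in the reply above
	for {
		err := f.enqueue()
		if err == nil {
			return nil
		}
		// Retry only 429-style rejections, only when the flag is enabled,
		// and only while the retry budget lasts.
		if !f.cfg.RetryOnTooManyOutstandingRequests || !errors.Is(err, errTooManyOutstanding) || retries <= 0 {
			return err
		}
		retries--
	}
}

func main() {
	f := &frontend{
		cfg:     config{WorkerConcurrency: 2, RetryOnTooManyOutstandingRequests: true},
		enqueue: func() error { return errTooManyOutstanding },
	}
	_ = f.enqueueWithBoundedRetry() // gives up after the retry budget is exhausted
}
```

In this shape, a rejected request is retried at most WorkerConcurrency + 1 times and only when the flag is enabled, which bounds the retries as described in the reply above.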

@pull-request-size pull-request-size bot added size/M and removed size/S labels Aug 3, 2023
@yeya24 (Contributor) commented Aug 3, 2023

@damnever Can you fix DCO?

Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>
Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>
@yeya24 (Contributor) left a comment:

Thanks!

@yeya24 yeya24 merged commit f791f5b into cortexproject:master Aug 9, 2023
wenxu1024 pushed a commit to wenxu1024/cortex that referenced this pull request Aug 9, 2023
Re-enqueue 429 requests if there are multiple query-schedulers (cortexproject#5496)

* Re-enqueue 429 requests if there are multiple query-schedulers

Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>

* Add feature flag

Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>

---------

Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>
@damnever damnever deleted the fix/retry-429 branch August 10, 2023 03:55