-
Notifications
You must be signed in to change notification settings - Fork 808
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ingester bounds #3992
Ingester bounds #3992
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, small nit about atomic alignment.
It seems like more settings are being added to runtime config due to a desire to change them without restarts. Obviously not something to be solved here but, might be time to start thinking about how we could enable more configuration to be changed at runtime without needing to add it to the runtimeConfigValues
struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Solid work! 👏 The overall logic makes sense to me. I've some concerns about the "global limits" naming (because we already have "global" limits) and some high level comments. I will take a deeper look at tiny details during the 2nd pass review.
1b8dc66
to
d37db98
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me overall. Thanks for working on this.
I remembered after I sent this, I was also going to ask if you had any data on the performance difference with the new limits enabled. |
We haven't run this code in any of our environments yet. I expect that we will start testing it sometime after easter. I have done some benchmarking in this comment, but it was only comparing master vs this branch with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, and thanks for the info on the benchmarking status.
71d9f41
to
865b487
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given this a once over and LGTM! Thanks Peter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good job and super sorry for my late review! I left few nits. No need for me to re-review it once they're addressed. Thanks!
f317bb1
to
b7b1f8d
Compare
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Rename max_users to max_tenants. Removed extra parameter to `getOrCreateTSDB` Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
…that these limits only work when using blocks engine. Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
…that these limits only work when using blocks engine. Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
b7b1f8d
to
2581afa
Compare
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
Signed-off-by: Peter Štibraný <pstibrany@gmail.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Thank you for all reviews! |
Also fixes #858 |
inflightRequests: promauto.With(r).NewGaugeFunc(prometheus.GaugeOpts{ | ||
Name: "cortex_ingester_inflight_push_requests", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This duplicates cortex_inflight_requests{route="/cortex.Ingester/Push"}
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it does. But it exposes value use for limit check.
I tried this out, with the limit set to 500 and many synthetic requests being pushed in from Avalanche. It did, as hoped, cause a lot of errors "cannot push: too many inflight push requests in ingester". I also expected the change to cut the number of goroutines, but there were still a lot. Mostly in gRPC before the code with the limit is hit:
This in turn was caused by me using the cortex-jsonnet config which sets Reducing |
What this PR does: This PR implements various global (per-ingester, not per-tenant) limits for use by ingester:
All of these are disabled by default, and can be changed by using config/CLI parameters, and by using runtime configuration (to avoid redeploy of ingesters). Current limits are exported as
cortex_ingester_global_limit
metric with variouslimit
label. If ingester finds that push request would go over one of these limits, it returns 500 error.Which issue(s) this PR fixes:
Fixes #665.
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]