add ClusterResourceQuota prometheus alert for overprovisioning #1311
@umangachapagain @synarete Please review.
Looks sane at first glance, but I'm not an expert in any of this. Will let others LGTM.
metrics/mixin/config.libsonnet (outdated)
@@ -5,9 +5,11 @@
    // Duration to raise various Alerts
    clusterObjectStoreStateAlertTime: '15s',
    clusterResourceQuotaAlertTime: '0s',
0s feels too aggressive. Any reason why it can't be 5s or 10s?
Because PVC requests are absolute amounts, such as 50GB or 100GB, and do not grow gradually the way usage does.
For example: if the quota is set to 1TB and 750GB is already provisioned, a new PVC request of 100GB brings the total to 850GB. Users should be informed immediately, because any PVC request above the remaining 150GB will be rejected outright by the ClusterResourceQuota.
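The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the reasoning, not the alert rule added by the PR: the helper `check_request`, the `QuotaCheck` type, and the use of decimal units (1TB = 1000GB) are assumptions made for the sketch, while the 80% threshold comes from the commit message.

```python
from dataclasses import dataclass

GB = 1
TB = 1000 * GB  # assumption: decimal units, so 1TB = 1000GB

ALERT_THRESHOLD_PERCENT = 80  # from the commit message: alert beyond 80% of the limit


@dataclass
class QuotaCheck:
    accepted: bool      # would the quota admit the new request?
    total_gb: int       # provisioned total after the decision
    remaining_gb: int   # headroom left under the quota
    alert: bool         # is the total above the alert threshold?


def check_request(limit_gb: int, provisioned_gb: int, request_gb: int) -> QuotaCheck:
    """Hypothetical helper: model one PVC request against a storage quota."""
    accepted = provisioned_gb + request_gb <= limit_gb
    # A rejected request leaves the provisioned total unchanged.
    total = provisioned_gb + request_gb if accepted else provisioned_gb
    # Integer comparison avoids floating-point rounding at the boundary.
    alert = total * 100 > limit_gb * ALERT_THRESHOLD_PERCENT
    return QuotaCheck(accepted, total, limit_gb - total, alert)


if __name__ == "__main__":
    # The scenario from the comment: 1TB quota, 750GB provisioned, 100GB requested.
    print(check_request(1 * TB, 750 * GB, 100 * GB))
    # A follow-up request larger than the remaining headroom.
    print(check_request(1 * TB, 850 * GB, 200 * GB))
```

Under these assumptions, a single 100GB request moves the total from 75% to 85% of the quota in one step, which is the argument for not waiting before the alert fires.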
Good to merge, but holding off until the dependent PR is merged.
/hold
Dependent PR merged. /hold cancel
This commit adds a Prometheus alert to notify users when the PVC requests for a particular storage class go beyond 80% of the limit set by the user through the ClusterResourceQuota resource.
Signed-off-by: Anmol Sachan <anmol13694@gmail.com>
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: agarwal-mudit, umangachapagain
Depends on #1282. This PR is required to alert users when the thresholds set via the functionality in #1282 are being reached.
Signed-off-by: Anmol Sachan <anmol13694@gmail.com>