[BUG]: Update resources limits for controller-manager to fix OOMKilled error #982
Comments
Hi Carlo, thanks for the question. Are you getting OOM-killed before you do anything with the operator, or are you getting killed while trying to do a bunch of stuff with it? Do you have any relevant logs?
Looks like I am able to add additional memory by editing line 921 of the
If you could provide us with details of anything else that you might have installed on the system, as well as what the operator has done leading up to the OOM kill, that would be super helpful! Thanks.
Starting logs of controller manager:
Previous last logs from restarted container
The only error I'm seeing is "ERROR zap@v1.21.0/sugar.go:173 Ignored key without a value.", but I don't know if that is related. I'm installing the Operator via an OLM Subscription; I'm not using the operator.yaml.
Metrics of the controller manager:
I manually changed the limits in the operator yaml from the OpenShift console
and now the controller manager seems to work fine without restarting.
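For an OLM-managed install like the one described above, one possible alternative to editing the YAML by hand is the Subscription's `spec.config.resources` field, which OLM applies to the containers of the operator Deployment it creates. The following is only a minimal sketch of building such a patch body. The helper name `subscription_resources_patch` and the `512Mi` value are illustrative assumptions, not values confirmed anywhere in this thread.

```python
import json


def subscription_resources_patch(memory_limit, memory_request=None, cpu_limit=None):
    """Build a merge-patch body that sets spec.config.resources on an OLM Subscription.

    Hypothetical helper for illustration; the quantities are supplied by the caller.
    """
    limits = {"memory": memory_limit}
    if cpu_limit is not None:
        limits["cpu"] = cpu_limit
    resources = {"limits": limits}
    if memory_request is not None:
        resources["requests"] = {"memory": memory_request}
    return {"spec": {"config": {"resources": resources}}}


if __name__ == "__main__":
    # 512Mi is an assumed example value, not the limit chosen by the maintainers.
    patch = subscription_resources_patch("512Mi", memory_request="256Mi")
    print(json.dumps(patch, indent=2))
```

The printed JSON could then be passed to something like `oc patch subscription <name> -n <namespace> --type merge -p '<json>'`, where the Subscription name and namespace are placeholders for your own install. Note that this setting applies to every container in the operator Deployment, so it is coarser than editing a single container's limits.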
OK, that's good, I'm glad it's at least not getting killed right now. I agree that that's not a good long-term solution; we will work on a better fix and keep this issue updated.
@jooseppi-luna can you confirm if this is the same as #184?
@bharathsreekanth it's related but not the same: #184 is for adding resource limits to Helm charts. These resource limits already exist in the operator and are what we are adjusting here to make the deployment work. See here for where we set them in the operator.
@cassanellicarlo I spoke with @rensyct and it would help us to have these three things from you to figure this out:
Operator: dell-csm-operator-certified.v1.2.0 ContainerStorageModule
Thanks for the logs! We will investigate to see if we can replicate the issue and decide if we should bump up the limits in an upcoming release. One thing I noticed is that the health monitor sidecar is disabled, but the health monitor env var is enabled for controller and node. Is that intentional, and if so, what is the use case?
@chimanjain @jooseppi-luna Do we have any internal ticket to track this? If so, then we need to move this query from a question to an appropriate bucket in GH.
@jooseppi-luna any news on this?
@cassanellicarlo sorry for the late follow-up! We have increased the limits in the upcoming CSM 1.9 release (csm-operator v1.4.0). If you have any further questions or issues, please file them here and we will get to them ASAP.
How can the Team help you today?
Details: ?
I'm using dell-csm-operator-certified.v1.2.0 operator on OpenShift 4.12. I installed it successfully, but the controller-manager is getting OOM-killed because it's consuming more memory than the limit set.
The default limit for the container is set to 256Mi. How can one increase it in the ContainerStorageModule resource?
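To make the comparison against the 256Mi limit concrete, here is a rough sketch that converts Kubernetes-style memory quantities to bytes and checks whether an observed usage figure exceeds a limit. The helper names `parse_memory` and `exceeds_limit` are hypothetical, only the common suffixes are handled, and the `300Mi` usage figure is an assumed example, not a measurement from this cluster.

```python
# Binary (Ki, Mi, ...) and decimal (k, M, ...) suffixes used in memory quantities.
_SUFFIXES = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
}


def parse_memory(quantity: str) -> int:
    """Convert a quantity such as '256Mi' to bytes (common suffixes only)."""
    # Check two-character suffixes first so 'Mi' is not mistaken for 'M'.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * _SUFFIXES[suffix])
    return int(quantity)  # no suffix: treat as a plain byte count


def exceeds_limit(usage: str, limit: str) -> bool:
    """Return True when the usage quantity is strictly above the limit."""
    return parse_memory(usage) > parse_memory(limit)


if __name__ == "__main__":
    # 300Mi is an assumed usage value for illustration only.
    print(exceeds_limit("300Mi", "256Mi"))
```

A container whose working set stays above its memory limit is a candidate for an OOM kill, which is the behaviour described for the controller-manager here.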