Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(resources): add resource limits/requests for operator deployment #423

Merged
merged 2 commits into from
Jul 5, 2022

Conversation

tthvo
Copy link
Member

@tthvo tthvo commented Jul 4, 2022

Fixes #405

Added resource limits/requests for operator deployment.

@tthvo tthvo added the fix label Jul 4, 2022
@tthvo tthvo requested review from andrewazores and ebaron July 4, 2022 21:31
@tthvo
Copy link
Member Author

tthvo commented Jul 4, 2022

Tests passed and it seems operator deployment appears stable.

NAME                                                              READY   STATUS      RESTARTS   AGE
cryostat-operator-controller-manager-5f7f467b56-fdrtl             1/1     Running     0          36m
f1a07841fb2e41a3e1ea2964f5368d1624fc76fd8fb6b7522f1d82f8c0nbbzd   0/1     Completed   0          37m
quay-io-thvo-cryostat-operator-bundle-2-0-0-dev                   1/1     Running     0          37m

Not sure if we have any other criteria for choosing values for resource limits/requests? I used values as suggested from here.

@tthvo
Copy link
Member Author

tthvo commented Jul 5, 2022

I wonder why describe on node gives request/limit of the operator as 0 (0%)? I thought it would display values as #423 (comment).

$ oc describe nodes
...(output hidden)

Namespace                                   Name                                                      CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                                   ----                                                      ------------  ----------  ---------------  -------------  ---
  default                                     cryostat-operator-controller-manager-7bdf6f95b8-hn9m7     0 (0%)        0 (0%)      0 (0%)           0 (0%)         12m
  default                                     cryostat-sample-6f4dc75cc5-2vdwn                          0 (0%)        0 (0%)      0 (0%)           0 (0%)         10m

...(output hidden)

Copy link
Member

@ebaron ebaron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@andrewazores
Copy link
Member

I'd guess that just means there are no requests/limits. We were never setting one explicitly before and the crc cluster probably doesn't define any limits by default, just so that it's quickest and easiest for developers to run whatever they want on crc for development and not need to worry about tinkering with resources first.

@ebaron ebaron merged commit e919013 into cryostatio:main Jul 5, 2022
@tthvo tthvo deleted the op-deploy-limit branch July 5, 2022 19:43
@ebaron
Copy link
Member

ebaron commented Jul 5, 2022

I wonder why describe on node gives request/limit of the operator as 0 (0%)? I thought it would display values as #423 (comment).

$ oc describe nodes
...(output hidden)

Namespace                                   Name                                                      CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                                   ----                                                      ------------  ----------  ---------------  -------------  ---
  default                                     cryostat-operator-controller-manager-7bdf6f95b8-hn9m7     0 (0%)        0 (0%)      0 (0%)           0 (0%)         12m
  default                                     cryostat-sample-6f4dc75cc5-2vdwn                          0 (0%)        0 (0%)      0 (0%)           0 (0%)         10m

...(output hidden)

It's showing for me in CRC with your changes applied:

  cryostat-operator-system                    cryostat-operator-controller-manager-6f497c598b-w8rc9     100m (2%)     1 (26%)     64Mi (0%)        256Mi (1%)     5s

If you're running make deploy, this will deploy into the cryostat-operator-system namespace by default, fyi.

@tthvo
Copy link
Member Author

tthvo commented Jul 5, 2022

Oh yes thanks a lot @ebaron. Just tried with manual deployment to default namespace. It appears correctly.

default     cryostat-operator-controller-manager-57b745f5c8-wkc9v     100m (2%)     1 (26%)     64Mi (0%)        256Mi (1%)     60s

Any guidance on why there is such a difference in behavior when using bundle deployment instead (I was using Bundle deployment above)?

@ebaron
Copy link
Member

ebaron commented Jul 5, 2022


Any guidance on why there is such a difference in behavior when using bundle deployment instead (I was using Bundle deployment above)?

make deploy_bundle will install the quay.io/cryostat/cryostat-operator-bundle:2.2.0-dev operator bundle by default. You can change this with the BUNDLE_IMG variable, but this requires you to build your own bundle and push it to Quay under your namespace. This bundle image includes the Cluster Service Version for the operator, which itself includes the operator deployment manifest. You can see your changes reflected in the CSV when you ran make bundle. Running make deploy_bundle will not include any local changes you've made.

When you run make deploy, any local changes under config/ are picked up when deploying to the cluster. If you make changes to the Go sources, you'd need to build and push a custom operator image and pass that to make deploy with the OPERATOR_IMG variable.

Basically YAML definitions in config/ go in the bundle image, Go code goes in the operator image.

@tthvo
Copy link
Member Author

tthvo commented Jul 5, 2022

make deploy_bundle will install the quay.io/cryostat/cryostat-operator-bundle:2.2.0-dev operator bundle by default. You can change this with the BUNDLE_IMG variable, but this requires you to build your own bundle and push it to Quay under your namespace. This bundle image includes the Cluster Service Version for the operator, which itself includes the operator deployment manifest. You can see your changes reflected in the CSV when you ran make bundle. Running make deploy_bundle will not include any local changes you've made.

Ahhh thank you! I think I understand it more now! But when I previously tried:

$ export IMAGE_NAMESPACE=quay.io/thvo IMAGE_VERSION=2.2.0-dev
$ make generate manifests manager oci-build bundle bundle-build 
$ podman image prune -f && podman push $IMAGE_NAMESPACE/cryostat-operator:$IMAGE_VERSION && podman push $IMAGE_NAMESPACE/cryostat-operator-bundle:$IMAGE_VERSION
$ make deploy_bundle

The images got pushed to my Quay registry. And the bundle image appears to be pulled from there.

https://quay.io/repository/thvo/cryostat-operator-bundle?tab=tags

NAME                                                              READY   STATUS      RESTARTS   AGE
b3d9acc53434464a12c70ad8928b86013dfe1b5cc42962af2958e539ad4vvv8   0/1     Completed   0          6m33s
cryostat-operator-controller-manager-754976d5fd-lrcwk             1/1     Running     0          6m19s
quay-io-thvo-cryostat-operator-bundle-2-2-0-dev                   1/1     Running     0          6m49s

But the oc describe nodes still shows me 0%.

  default                                     cryostat-operator-controller-manager-754976d5fd-lrcwk     0 (0%)        0 (0%)      0 (0%)           0 (0%)         62s
  default                                     quay-io-thvo-cryostat-operator-bundle-2-2-0-dev           0 (0%)        0 (0%)      0 (0%)           0 (0%)         92s

I wonder if my steps are correct? Any ideas @ebaron?

@tthvo
Copy link
Member Author

tthvo commented Jul 5, 2022

If you make changes to the Go sources, you'd need to build and push a custom operator image and pass that to make deploy with the OPERATOR_IMG variable.

This also means I have to build the image and push it to my own Quay registry right?

@ebaron
Copy link
Member

ebaron commented Jul 6, 2022

If you make changes to the Go sources, you'd need to build and push a custom operator image and pass that to make deploy with the OPERATOR_IMG variable.

This also means I have to build the image and push it to my own Quay registry right?

That's right

@ebaron
Copy link
Member

ebaron commented Jul 6, 2022

make deploy_bundle will install the quay.io/cryostat/cryostat-operator-bundle:2.2.0-dev operator bundle by default. You can change this with the BUNDLE_IMG variable, but this requires you to build your own bundle and push it to Quay under your namespace. This bundle image includes the Cluster Service Version for the operator, which itself includes the operator deployment manifest. You can see your changes reflected in the CSV when you ran make bundle. Running make deploy_bundle will not include any local changes you've made.

Ahhh thank you! I think I understand it more now! But when I previously tried:

$ export IMAGE_NAMESPACE=quay.io/thvo IMAGE_VERSION=2.2.0-dev
$ make generate manifests manager oci-build bundle bundle-build 
$ podman image prune -f && podman push $IMAGE_NAMESPACE/cryostat-operator:$IMAGE_VERSION && podman push $IMAGE_NAMESPACE/cryostat-operator-bundle:$IMAGE_VERSION
$ make deploy_bundle

The images got pushed to my Quay registry. And the bundle image appears to be pulled from there.

https://quay.io/repository/thvo/cryostat-operator-bundle?tab=tags

NAME                                                              READY   STATUS      RESTARTS   AGE
b3d9acc53434464a12c70ad8928b86013dfe1b5cc42962af2958e539ad4vvv8   0/1     Completed   0          6m33s
cryostat-operator-controller-manager-754976d5fd-lrcwk             1/1     Running     0          6m19s
quay-io-thvo-cryostat-operator-bundle-2-2-0-dev                   1/1     Running     0          6m49s

But the oc describe nodes still shows me 0%.

  default                                     cryostat-operator-controller-manager-754976d5fd-lrcwk     0 (0%)        0 (0%)      0 (0%)           0 (0%)         62s
  default                                     quay-io-thvo-cryostat-operator-bundle-2-2-0-dev           0 (0%)        0 (0%)      0 (0%)           0 (0%)         92s

I wonder if my steps are correct? Any ideas @ebaron?

Your build looks correct to me. I pulled your bundle image and verified that the CSV within it has the resource limits/requests present. Could you check if oc describe deploy cryostat-operator-controller-manager shows the correct resource limits and requests?

@ebaron
Copy link
Member

ebaron commented Jul 6, 2022

The CSV shows the expected resources property, but the deployments do not. I'm pretty sure we're encountering this bug: operator-framework/operator-lifecycle-manager#1940, which was fixed with operator-framework/api#97. This fix is pulled into Operator SDK 1.6.0, while we're still using 1.5.0.

This bug only affects the deploy_bundle install method, so it's annoying for development, but not an issue in production. We really should upgrade the Operator SDK though: #287.

@ebaron
Copy link
Member

ebaron commented Jul 6, 2022

As a workaround, you can run oc edit sub cryostat-operator-v2-2-0-dev-sub, and delete the lines:

config:
    resources: {}

The deployment should be recreated with the resource requests and limits present.

@tthvo
Copy link
Member Author

tthvo commented Jul 7, 2022

The CSV shows the expected resources property, but the deployments do not. I'm pretty sure we're encountering this bug: operator-framework/operator-lifecycle-manager#1940, which was fixed with operator-framework/api#97. This fix is pulled into Operator SDK 1.6.0, while we're still using 1.5.0.

This bug only affects the deploy_bundle install method, so it's annoying for development, but not an issue in production. We really should upgrade the Operator SDK though: #287.

Ahh so this is the issue! I think I understood it now! Also, deleting the config field worked too! Thanks a lot :D

@ebaron
Copy link
Member

ebaron commented Jul 7, 2022

@tthvo No problem. If you're interested in tackling #287 at some point, I can help you with it.

@tthvo
Copy link
Member Author

tthvo commented Jul 7, 2022

Yes definitely @ebaron! I will start on it right away after #407 :D

@ebaron
Copy link
Member

ebaron commented Jul 7, 2022

Yes definitely @ebaron! I will start on it right away after #407 :D

Awesome!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
No open projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Define resource requests/limits for the operator deployment
3 participants