Bruop error when trying to upgrade nodes #179

Closed
pmacieje opened this issue Apr 4, 2022 · 12 comments

Comments

@pmacieje

pmacieje commented Apr 4, 2022

**Image I'm using:** bottlerocket/bottlerocket-update-operator:v0.2.0

Issue or Feature Request: I've deployed Brupop on an EKS cluster, but only one of the six nodes was updated. The remaining nodes report an error like the one below.

Agent logs:
{"v":0,"name":"agent","msg":"[UPDATE_BOTTLEROCKET_SHADOW - START]","level":30,"hostname":"brupop-agent-4z2dr","pid":1,"time":"2022-04-04T07:50:55.176421185+00:00","target":"apiserver::client::webclient","line":155,"file":"apiserver/src/client/webclient.rs","self":"K8SAPIServerClient { k8s_projected_token_path: "/var/run/secrets/tokens/bottlerocket-agent-service-account-token" }","req":"UpdateBottlerocketShadowRequest { node_selector: BottlerocketShadowSelector { node_name: "ip-172-22-49-202.aws-ec2", node_uid: "1c491162-0d6f-4517-a594-e2ab72f4bc14" }, node_status: BottlerocketShadowStatus { current_version: "1.6.2", target_version: "1.6.2", current_state: Idle } }"}
{"v":0,"name":"agent","msg":"[UPDATE_BOTTLEROCKET_SHADOW - END]","level":30,"hostname":"brupop-agent-4z2dr","pid":1,"time":"2022-04-04T07:51:12.631267091+00:00","target":"apiserver::client::webclient","line":155,"file":"apiserver/src/client/webclient.rs","elapsed_milliseconds":17454,"self":"K8SAPIServerClient { k8s_projected_token_path: "/var/run/secrets/tokens/bottlerocket-agent-service-account-token" }","req":"UpdateBottlerocketShadowRequest { node_selector: BottlerocketShadowSelector { node_name: "ip-172-22-49-202.aws-ec2", node_uid: "1c491162-0d6f-4517-a594-e2ab72f4bc14" }, node_status: BottlerocketShadowStatus { current_version: "1.6.2", target_version: "1.6.2", current_state: Idle } }"}
{"v":0,"name":"agent","msg":"[UPDATE_METADATA_CUSTOM_RESOURCE - EVENT] agent::agentclient","level":50,"hostname":"brupop-agent-4z2dr","pid":1,"time":"2022-04-04T07:51:12.631362459+00:00","target":"agent::agentclient","line":206,"file":"agent/src/agentclient.rs","error":"Unable to update the custom resource associated with this node: 'Unable to update BottlerocketShadow status (ip-172-22-49-202.aws-ec2, 1c491162-0d6f-4517-a594-e2ab72f4bc14): 'API server responded with an error status code 500 Internal Server Error: 'Error patching BottlerocketShadow: 'Unable to update BottlerocketShadow status (ip-172-22-49-202.aws-ec2, 1c491162-0d6f-4517-a594-e2ab72f4bc14): 'ApiError: BottlerocketShadow.brupop.bottlerocket.aws "brs-ip-172-22-49-202.aws-ec2" is invalid: status.crash_count: Required value: Invalid (ErrorResponse { status: "Failure", message: "BottlerocketShadow.brupop.bottlerocket.aws \"brs-ip-172-22-49-202.aws-ec2\" is invalid: status.crash_count: Required value", reason: "Invalid", code: 422 })'''''"}
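When an agent emits this much output, it can help to pull out only the error entries. Below is a minimal sketch, assuming the logs follow Bunyan-style numeric levels (30 for info, 50 for error), which matches the fields shown above but is not confirmed in this thread. The function name `error_entries` and the shortened sample lines are illustrative only.

```python
import json

ERROR_LEVEL = 50  # assumption: Bunyan-style numeric levels, where 50 means "error"


def error_entries(lines):
    """Return (time, msg, error) tuples for log lines at error level or above."""
    found = []
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank, truncated, or non-JSON lines
        if not isinstance(entry, dict):
            continue
        level = entry.get("level")
        if isinstance(level, int) and level >= ERROR_LEVEL:
            found.append((entry.get("time"), entry.get("msg"), entry.get("error")))
    return found


# Shortened stand-ins for the agent log lines quoted above.
sample = [
    '{"msg": "[UPDATE_BOTTLEROCKET_SHADOW - END]", "level": 30, '
    '"time": "2022-04-04T07:51:12.631267091+00:00"}',
    '{"msg": "[UPDATE_METADATA_CUSTOM_RESOURCE - EVENT] agent::agentclient", "level": 50, '
    '"time": "2022-04-04T07:51:12.631362459+00:00", '
    '"error": "status.crash_count: Required value"}',
]

for time, msg, error in error_entries(sample):
    print(time, msg, error, sep=" | ")
```

In practice the `lines` argument would be the output of `kubectl logs` for one agent pod, split into lines.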

Monitoring Custom Resources:
$ kubectl get brs --namespace brupop-bottlerocket-aws
NAME                           STATE   VERSION   TARGET STATE   TARGET VERSION   CRASH COUNT
brs-ip-172-22-40-11.aws-ec2    Idle
brs-ip-172-22-43-11.aws-ec2    Idle
brs-ip-172-22-45-41.aws-ec2    Idle
brs-ip-172-22-45-87.aws-ec2    Idle
brs-ip-172-22-49-202.aws-ec2   Idle
brs-ip-172-22-50-13.aws-ec2    Idle

Bruop stack deployed:
kubectl get pods -n brupop-bottlerocket-aws
NAME READY STATUS RESTARTS AGE
brupop-agent-4z2dr 1/1 Running 0 3d21h
brupop-agent-dfztv 1/1 Running 0 3d21h
brupop-agent-hzd7d 1/1 Running 0 3d21h
brupop-agent-lz9xq 1/1 Running 0 3d21h
brupop-agent-vb2w8 1/1 Running 0 3d21h
brupop-agent-zc6rt 1/1 Running 1 3d21h
brupop-apiserver-745c5cffd9-bdts7 1/1 Running 1 3d21h
brupop-apiserver-745c5cffd9-ssvbx 1/1 Running 0 3d21h
brupop-apiserver-745c5cffd9-wht6g 1/1 Running 0 3d21h
brupop-controller-deployment-8545559bc7-pqkrs 1/1 Running 1 3d21h

All nodes are labeled with bottlerocket.aws/updater-interface-version=2.0.0

Could anyone help with this issue?

@cbgbt
Contributor

cbgbt commented Apr 4, 2022

Thanks for opening this issue. I believe this is an issue with the installation instructions: the instructions use the .yaml file from our develop branch, which has had a few changes to it since; however, the images we've deployed are still 0.2.0. We're waiting for a few more features to land before cutting a new release.

To fix this, we need to lock the installation instructions to the latest version.

You can fix this in your cluster by clearing your resources in the brupop namespace, and re-installing those resources from the 0.2.0 tag for the time being:
https://github.com/bottlerocket-os/bottlerocket-update-operator/tree/v0.2.0

I'll have a stab at correcting the instructions.

@cbgbt
Contributor

cbgbt commented Apr 4, 2022

#180 has fixed the install instructions to use the versioned artifacts. Would you mind checking to see if this resolves your issue?

To fix this in a better way long-term, we have #126, as well as #161 which would make newer CRDs for the shadow objects backwards compatible. We're intending to use the solution in #161 to fix the current compatibility issue in develop with 0.2.0 before rolling out the next version.

@pmacieje
Author

pmacieje commented Apr 5, 2022

Looks much better:

$ kubectl get brs --namespace brupop-bottlerocket-aws
NAME                           STATE   VERSION   TARGET STATE   TARGET VERSION
brs-ip-172-22-43-11.aws-ec2    Idle    1.6.2     Idle
brs-ip-172-22-45-41            Idle    1.6.2     Idle
brs-ip-172-22-45-87            Idle    1.6.2     Idle
brs-ip-172-22-49-202           Idle    1.6.2     StagedUpdate   1.7.0
brs-ip-172-22-50-13            Idle    1.7.0     Idle

But there is still a problem with the upgrade:

Logs from the agent on node 172-22-49-202:

{"v":0,"name":"agent","msg":"[RUN - EVENT] Detected drift between spec state and current state. Requesting node to take action","level":30,"hostname":"brupop-agent-fxq6d","pid":1,"time":"2022-04-05T07:57:55.244687734+00:00","target":"agent::agentclient","line":322,"file":"agent/src/agentclient.rs","action":"StagedUpdate","brs_name":"Some("brs-ip-172-22-49-202.aws-ec2")"}
{"v":0,"name":"agent","msg":"[RUN - EVENT] Preparing update","level":30,"hostname":"brupop-agent-fxq6d","pid":1,"time":"2022-04-05T07:57:55.244705606+00:00","target":"agent::agentclient","line":337,"file":"agent/src/agentclient.rs"}
{"v":0,"name":"agent","msg":"[RUN - EVENT] API server busy, retrying later ...","level":30,"hostname":"brupop-agent-fxq6d","pid":1,"time":"2022-04-05T07:57:55.255168626+00:00","target":"agent::apiclient","line":140,"file":"agent/src/apiclient.rs"}
{"v":0,"name":"agent","msg":"[RUN - EVENT] agent::agentclient","level":50,"hostname":"brupop-agent-fxq6d","pid":1,"time":"2022-04-05T07:58:05.264579472+00:00","target":"agent::agentclient","line":261,"file":"agent/src/agentclient.rs","error":"Unable to take action 'Prepare': 'Unexpected update state: Ready, expecting state to be Available or Staged. Update action performed out of band?'"}
{"v":0,"name":"agent","msg":"[RUN - END]","level":30,"hostname":"brupop-agent-fxq6d","pid":1,"time":"2022-04-05T07:58:05.264648063+00:00","target":"agent::agentclient","line":260,"file":"agent/src/agentclient.rs","elapsed_milliseconds":10039}

@cbgbt cbgbt closed this as completed Apr 5, 2022
@cbgbt cbgbt reopened this Apr 5, 2022
@cbgbt
Contributor

cbgbt commented Apr 5, 2022

I misread your message, apologies for closing.

So the error message is:

{"v":0,"name":"agent","msg":"[RUN - EVENT] agent::agentclient","level":50,"hostname":"brupop-agent-fxq6d","pid":1,"time":"2022-04-05T07:58:05.264579472+00:00","target":"agent::agentclient","line":261,"file":"agent/src/agentclient.rs","error":"Unable to take action 'Prepare': 'Unexpected update state: Ready, expecting state to be Available or Staged. Update action performed out of band?'"}

Which is being thrown here.

Brupop attempts to "prepare" an update image, and then writes back to the k8s API that the preparation is ready. If, during preparation, it notices an image is already prepped, then it raises an error -- the rationale being that a user may be attempting to manually install a different update ("out of band"), and Brupop should not intervene.

I'm noticing the IP of the affected host is the same -- so I think what has happened is:

  • The previously-installed Brupop readied a new image
  • The previously-installed Brupop attempted to write back to the k8s API that a new image was ready, but this write failed due to the initial issue that you hit with the CRD schema.
  • After re-installing Brupop, the preparation was attempted again, but the image that was readied last time is now triggering the logic I mentioned above, causing it to refuse to proceed.
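
The guard and the sequence of events above can be modelled in a few lines. This is a simplified sketch, not Brupop's actual code (which is written in Rust): the state names are taken from the error message, while `UnexpectedUpdateState`, `prepare`, and `agent_attempt` are hypothetical names introduced for illustration.

```python
from enum import Enum


class UpdateState(Enum):
    # Only the states named in the error message; the real API may define more.
    AVAILABLE = "Available"
    STAGED = "Staged"
    READY = "Ready"


class UnexpectedUpdateState(Exception):
    """Hypothetical error type standing in for the agent's failure."""


def prepare(state):
    """Model of the 'Prepare' action: refuse unless an update is Available or Staged."""
    if state not in (UpdateState.AVAILABLE, UpdateState.STAGED):
        raise UnexpectedUpdateState(
            f"Unexpected update state: {state.value}, expecting state to be "
            "Available or Staged. Update action performed out of band?"
        )
    return UpdateState.READY  # stands in for readying the image on the host


def agent_attempt(host_state, status_write_ok):
    """One attempt: prepare on the host, then report back to the k8s API.

    Returns (new_host_state, reported). The host keeps its new state even
    when the status write fails, which is what leaves the node stuck.
    """
    new_state = prepare(host_state)
    return new_state, status_write_ok


# 1. Old install: the image is readied, but the status write fails (CRD schema mismatch).
host_state, reported = agent_attempt(UpdateState.AVAILABLE, status_write_ok=False)
print(host_state.value, reported)

# 2. After re-install: the host is already Ready, so the guard refuses to proceed.
try:
    agent_attempt(host_state, status_write_ok=True)
except UnexpectedUpdateState as err:
    print("refused:", err)
```

The second attempt fails with the same message that appears in the agent log, because the host-side state outlived the failed write to the cluster.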

If you have admin access to this node, then manually triggering the update this once should "un-stick" Brupop. From a usability perspective, I think it might make sense for Brupop to attempt to perform its configured update regardless. Admins trying to perform an out-of-band update should probably disable Brupop on the node first. We can look at changing this behavior to prevent this error case.

@cbgbt
Contributor

cbgbt commented Apr 5, 2022

To apply the update I'm mentioning on this node, you'd want to follow these steps.

I would suggest using kubectl drain to remove the node from service first, then using kubectl uncordon to bring it back after the update is finished.
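
The drain, update, uncordon order could be scripted roughly as follows. This is a sketch under assumptions: `--ignore-daemonsets` is added because `kubectl drain` otherwise refuses to proceed when DaemonSet-managed pods are on the node, and the Brupop agents appear to run one per node. The on-node update itself is left as a caller-supplied function, since the linked steps are not reproduced in this thread. All function names are hypothetical.

```python
import shlex
import subprocess


def drain_cmd(node):
    # --ignore-daemonsets: skip DaemonSet-managed pods instead of aborting the drain.
    return ["kubectl", "drain", node, "--ignore-daemonsets"]


def uncordon_cmd(node):
    return ["kubectl", "uncordon", node]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print("+", shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


def manual_update(node, apply_update, dry_run=True):
    """Drain the node, apply the update, then return the node to service.

    apply_update is a caller-supplied callable for the on-node update steps.
    If it raises, the node is deliberately left cordoned for inspection.
    """
    run(drain_cmd(node), dry_run)
    apply_update(node)
    run(uncordon_cmd(node), dry_run)


# Dry run: only prints the kubectl commands that would be executed.
manual_update(
    "ip-172-22-49-202.aws-ec2",
    apply_update=lambda node: print(f"(apply the update on {node} here)"),
)
```

Leaving the node cordoned on failure is a design choice; wrapping the update in `try`/`finally` would instead always uncordon it.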

@cbgbt cbgbt self-assigned this Apr 5, 2022
@cbgbt cbgbt added this to the brupop 0.2.1 milestone Apr 5, 2022
@pmacieje
Author

pmacieje commented Apr 8, 2022

Thanks for the suggestion. I've manually updated the problematic host. Now there's another problem: Brupop tries to update the next host, the state is StagedUpdate, and the host is cordoned but not drained. It has been stuck in this state for the last 24h.

$ kubectl get brs --namespace brupop-bottlerocket-aws
NAME                   STATE   VERSION   TARGET STATE   TARGET VERSION
brs-ip-172-22-40-11    Idle    1.6.2     Idle
brs-ip-172-22-42-116   Idle    1.5.2     Idle
brs-ip-172-22-45-41    Idle    1.6.2     StagedUpdate   1.7.0
brs-ip-172-22-45-87    Idle    1.6.2     Idle
brs-ip-172-22-47-39    Idle    1.5.2     Idle
brs-ip-172-22-49-202   Idle    1.7.0     Idle
brs-ip-172-22-50-13    Idle    1.7.0     Idle

Agent logs from host 172-22-45-41:

{"v":0,"name":"agent","msg":"[FETCH_CUSTOM_RESOURCE - START]","level":30,"hostname":"brupop-agent-bqxz6","pid":1,"time":"2022-04-07T20:34:33.367421300+00:00","target":"agent::agentclient","line":140,"file":"agent/src/agentclient.rs"}
{"v":0,"name":"agent","msg":"[FETCH_CUSTOM_RESOURCE - END]","level":30,"hostname":"brupop-agent-bqxz6","pid":1,"time":"2022-04-07T20:34:33.367455131+00:00","target":"agent::agentclient","line":140,"file":"agent/src/agentclient.rs","elapsed_milliseconds":0}
{"v":0,"name":"agent","msg":"[CHECK_CUSTOM_RESOURCE_STATUS_EXISTS - END]","level":30,"hostname":"brupop-agent-bqxz6","pid":1,"time":"2022-04-07T20:34:33.367467999+00:00","target":"agent::agentclient","line":108,"file":"agent/src/agentclient.rs","elapsed_milliseconds":0}
{"v":0,"name":"agent","msg":"[FETCH_CUSTOM_RESOURCE - START]","level":30,"hostname":"brupop-agent-bqxz6","pid":1,"time":"2022-04-07T20:34:33.367479359+00:00","target":"agent::agentclient","line":140,"file":"agent/src/agentclient.rs"}
{"v":0,"name":"agent","msg":"[FETCH_CUSTOM_RESOURCE - END]","level":30,"hostname":"brupop-agent-bqxz6","pid":1,"time":"2022-04-07T20:34:33.367514056+00:00","target":"agent::agentclient","line":140,"file":"agent/src/agentclient.rs","elapsed_milliseconds":0}
{"v":0,"name":"agent","msg":"[RUN - EVENT] Detected drift between spec state and current state. Requesting node to take action","level":30,"hostname":"brupop-agent-bqxz6","pid":1,"time":"2022-04-07T20:34:33.367538347+00:00","target":"agent::agentclient","line":322,"file":"agent/src/agentclient.rs","action":"StagedUpdate","brs_name":"Some("brs-ip-172-22-45-41.")"}
{"v":0,"name":"agent","msg":"[RUN - EVENT] Preparing update","level":30,"hostname":"brupop-agent-bqxz6","pid":1,"time":"2022-04-07T20:34:33.367550257+00:00","target":"agent::agentclient","line":337,"file":"agent/src/agentclient.rs"}
{"v":0,"name":"agent","msg":"[RUN - EVENT] API server busy, retrying later ...","level":30,"hostname":"brupop-agent-bqxz6","pid":1,"time":"2022-04-07T20:34:33.385496908+00:00","target":"agent::apiclient","line":140,"file":"agent/src/apiclient.rs"}
{"v":0,"name":"agent","msg":"[RUN - EVENT] API server busy, retrying later ...","level":30,"hostname":"brupop-agent-bqxz6","pid":1,"time":"2022-04-07T20:34:43.411666641+00:00","target":"agent::apiclient","line":140,"file":"agent/src/apiclient.rs"}
{"v":0,"name":"agent","msg":"[RUN - EVENT] API server busy, retrying later ...","level":30,"hostname":"brupop-agent-bqxz6","pid":1,"time":"2022-04-07T20:34:53.430894722+00:00","target":"agent::apiclient","line":140,"file":"agent/src/apiclient.rs"}
{"v":0,"name":"agent","msg":"[CORDON_AND_DRAIN - START]","level":30,"hostname":"brupop-agent-bqxz6","pid":1,"time":"2022-04-07T20:35:03.437700594+00:00","target":"agent::agentclient","line":232,"file":"agent/src/agentclient.rs"}
{"v":0,"name":"agent","msg":"[GET_NODE_SELECTOR - START]","level":30,"hostname":"brupop-agent-bqxz6","pid":1,"time":"2022-04-07T20:35:03.437928758+00:00","target":"agent::agentclient","line":113,"file":"agent/src/agentclient.rs"}
{"v":0,"name":"agent","msg":"[GET_NODE_SELECTOR - END]","level":30,"hostname":"brupop-agent-bqxz6","pid":1,"time":"2022-04-07T20:35:03.438329067+00:00","target":"agent::agentclient","line":113,"file":"agent/src/agentclient.rs","elapsed_milliseconds":0}
{"v":0,"name":"agent","msg":"[CORDON_AND_DRAIN_NODE - START]","level":30,"hostname":"brupop-agent-bqxz6","pid":1,"time":"2022-04-07T20:35:03.438483325+00:00","target":"apiserver::client::webclient","line":209,"file":"apiserver/src/client/webclient.rs","self":"K8SAPIServerClient { k8s_projected_token_path: "/var/run/secrets/tokens/bottlerocket-agent-service-account-token" }","req":"CordonAndDrainBottlerocketShadowRequest { node_selector: BottlerocketShadowSelector { node_name: "ip-172-22-45-41", node_uid: "85bee1f8-badd-48b6-98e5-d147a05050a7" }

API server logs:

{"v":0,"name":"apiserver","msg":"[CHECK_REQUEST_AUTHORIZED - START]","level":30,"hostname":"brupop-apiserver-745c5cffd9-7q79q","pid":1,"time":"2022-04-07T19:43:21.322756382+00:00","target":"apiserver::auth::authorizor","line":49,"file":"apiserver/src/auth/authorizor.rs","http.target":"/bottlerocket-node-resource","node_selector":"BottlerocketShadowSelector { node_name: "ip-172-22-49-202", node_uid: "1c491162-0d6f-4517-a594-e2ab72f4bc14" }","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","http.scheme":"http","http.flavor":"1.1","otel.kind":"server","node_name":"ip-172-22-49-202","http.method":"PUT","http.client_ip":"172.22.48.174:41666","request_id":"2a66662c-f3a5-4aea-b5f1-28026675fc6c","http.user_agent":"","http.route":"/bottlerocket-node-resource"}
{"v":0,"name":"apiserver","msg":"[CHECK_REQUEST_AUTHORIZED - END]","level":30,"hostname":"brupop-apiserver-745c5cffd9-7q79q","pid":1,"time":"2022-04-07T19:43:21.327224621+00:00","target":"apiserver::auth::authorizor","line":49,"file":"apiserver/src/auth/authorizor.rs","http.target":"/bottlerocket-node-resource","node_selector":"BottlerocketShadowSelector { node_name: "ip-172-22-49-202", node_uid: "1c491162-0d6f-4517-a594-e2ab72f4bc14" }","elapsed_milliseconds":4,"http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","http.scheme":"http","http.flavor":"1.1","otel.kind":"server","node_name":"ip-172-22-49-202","http.method":"PUT","http.client_ip":"172.22.48.174:41666","request_id":"2a66662c-f3a5-4aea-b5f1-28026675fc6c","http.user_agent":"","http.route":"/bottlerocket-node-resource"}
{"v":0,"name":"apiserver","msg":"[UPDATE_NODE_STATUS - START]","level":30,"hostname":"brupop-apiserver-745c5cffd9-7q79q","pid":1,"time":"2022-04-07T19:43:21.327567693+00:00","target":"models::node::client","line":207,"file":"models/src/node/client.rs","http.target":"/bottlerocket-node-resource","status":"BottlerocketShadowStatus { current_version: "1.7.0", target_version: "1.7.0", current_state: Idle }","selector":"BottlerocketShadowSelector { node_name: "ip-172-22-49-202", node_uid: "1c491162-0d6f-4517-a594-e2ab72f4bc14" }","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","http.scheme":"http","http.flavor":"1.1","otel.kind":"server","node_name":"ip-172-22-49-202.","http.method":"PUT","http.client_ip":"172.22.48.174:41666","request_id":"2a66662c-f3a5-4aea-b5f1-28026675fc6c","http.user_agent":"","http.route":"/bottlerocket-node-resource"}
{"v":0,"name":"apiserver","msg":"[UPDATE_NODE_STATUS - END]","level":30,"hostname":"brupop-apiserver-745c5cffd9-7q79q","pid":1,"time":"2022-04-07T19:43:21.336126204+00:00","target":"models::node::client","line":207,"file":"models/src/node/client.rs","http.target":"/bottlerocket-node-resource","elapsed_milliseconds":8,"status":"BottlerocketShadowStatus { current_version: "1.7.0", target_version: "1.7.0", current_state: Idle }","selector":"BottlerocketShadowSelector { node_name: "ip-172-22-49-202.", node_uid: "1c491162-0d6f-4517-a594-e2ab72f4bc14" }","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","http.scheme":"http","http.flavor":"1.1","otel.kind":"server","node_name":"ip-172-22-49-202.","http.method":"PUT","http.client_ip":"172.22.48.174:41666","request_id":"2a66662c-f3a5-4aea-b5f1-28026675fc6c","http.user_agent":"","http.route":"/bottlerocket-node-resource"}
{"v":0,"name":"apiserver","msg":"[HTTP REQUEST - END]","level":30,"hostname":"brupop-apiserver-745c5cffd9-7q79q","pid":1,"time":"2022-04-07T19:43:21.336262365+00:00","target":"apiserver::telemetry","line":40,"file":"apiserver/src/telemetry.rs","http.target":"/bottlerocket-node-resource","elapsed_milliseconds":13,"http.status_code":200,"otel.status_code":"OK","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","http.scheme":"http","http.flavor":"1.1","otel.kind":"server","node_name":"ip-172-22-49-202","http.method":"PUT","http.client_ip":"172.22.48.174:41666","request_id":"2a66662c-f3a5-4aea-b5f1-28026675fc6c","http.user_agent":"","http.route":"/bottlerocket-node-resource"}
{"v":0,"name":"apiserver","msg":"[HTTP REQUEST - START]","level":30,"hostname":"brupop-apiserver-745c5cffd9-7q79q","pid":1,"time":"2022-04-07T19:51:14.270670529+00:00","target":"apiserver::telemetry","line":40,"file":"apiserver/src/telemetry.rs","http.route":"/bottlerocket-node-resource","http.client_ip":"172.22.45.3:35296","otel.kind":"server","http.method":"POST","http.user_agent":"","http.scheme":"http","http.target":"/bottlerocket-node-resource","node_name":"ip-172-22-45-41","http.flavor":"1.1","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","request_id":"6ccffab1-2fc3-4862-b98e-af0e81cc58ae"}
{"v":0,"name":"apiserver","msg":"[CHECK_REQUEST_AUTHORIZED - START]","level":30,"hostname":"brupop-apiserver-745c5cffd9-7q79q","pid":1,"time":"2022-04-07T19:51:14.270785915+00:00","target":"apiserver::auth::authorizor","line":49,"file":"apiserver/src/auth/authorizor.rs","http.route":"/bottlerocket-node-resource","http.client_ip":"172.22.45.3:35296","otel.kind":"server","http.method":"POST","http.user_agent":"","http.scheme":"http","http.target":"/bottlerocket-node-resource","node_name":"ip-172-22-45-41","http.flavor":"1.1","node_selector":"BottlerocketShadowSelector { node_name: "ip-172-22-45-41.", node_uid: "85bee1f8-badd-48b6-98e5-d147a05050a7" }","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","request_id":"6ccffab1-2fc3-4862-b98e-af0e81cc58ae"}
{"v":0,"name":"apiserver","msg":"[CHECK_REQUEST_AUTHORIZED - END]","level":30,"hostname":"brupop-apiserver-745c5cffd9-7q79q","pid":1,"time":"2022-04-07T19:51:14.284414437+00:00","target":"apiserver::auth::authorizor","line":49,"file":"apiserver/src/auth/authorizor.rs","http.route":"/bottlerocket-node-resource","http.client_ip":"172.22.45.3:35296","otel.kind":"server","http.method":"POST","http.user_agent":"","http.scheme":"http","http.target":"/bottlerocket-node-resource","node_name":"ip-172-22-45-41.","elapsed_milliseconds":13,"http.flavor":"1.1","node_selector":"BottlerocketShadowSelector { node_name: "ip-172-22-45-41", node_uid: "85bee1f8-badd-48b6-98e5-d147a05050a7" }","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","request_id":"6ccffab1-2fc3-4862-b98e-af0e81cc58ae"}
{"v":0,"name":"apiserver","msg":"[CREATE_NODE - START]","level":30,"hostname":"brupop-apiserver-745c5cffd9-7q79q","pid":1,"time":"2022-04-07T19:51:14.284490053+00:00","target":"models::node::client","line":175,"file":"models/src/node/client.rs","http.route":"/bottlerocket-node-resource","http.client_ip":"172.22.45.3:35296","otel.kind":"server","http.method":"POST","selector":"BottlerocketShadowSelector { node_name: "ip-172-22-45-41", node_uid: "85bee1f8-badd-48b6-98e5-d147a05050a7" }","http.user_agent":"","http.scheme":"http","http.target":"/bottlerocket-node-resource","node_name":"ip-172-22-45-41.","http.flavor":"1.1","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","request_id":"6ccffab1-2fc3-4862-b98e-af0e81cc58ae"}
{"v":0,"name":"apiserver","msg":"[CREATE_NODE - END]","level":30,"hostname":"brupop-apiserver-745c5cffd9-7q79q","pid":1,"time":"2022-04-07T19:51:14.296216413+00:00","target":"models::node::client","line":175,"file":"models/src/node/client.rs","http.route":"/bottlerocket-node-resource","http.client_ip":"172.22.45.3:35296","otel.kind":"server","http.method":"POST","selector":"BottlerocketShadowSelector { node_name: "ip-172-22-45-41", node_uid: "85bee1f8-badd-48b6-98e5-d147a05050a7" }","http.user_agent":"","http.scheme":"http","http.target":"/bottlerocket-node-resource","node_name":"ip-172-22-45-41","elapsed_milliseconds":11,"http.flavor":"1.1","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","request_id":"6ccffab1-2fc3-4862-b98e-af0e81cc58ae"}
{"v":0,"name":"apiserver","msg":"[HTTP REQUEST - END]","level":30,"hostname":"brupop-apiserver-745c5cffd9-7q79q","pid":1,"time":"2022-04-07T19:51:14.296309780+00:00","target":"apiserver::telemetry","line":40,"file":"apiserver/src/telemetry.rs","http.route":"/bottlerocket-node-resource","http.client_ip":"172.22.45.3:35296","otel.kind":"server","http.method":"POST","http.user_agent":"","http.scheme":"http","http.target":"/bottlerocket-node-resource","node_name":"ip-172-22-45-41","http.status_code":200,"http.flavor":"1.1","otel.status_code":"OK","elapsed_milliseconds":25,"http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","request_id":"6ccffab1-2fc3-4862-b98e-af0e81cc58ae"}
{"v":0,"name":"apiserver","msg":"[HTTP REQUEST - START]","level":30,"hostname":"brupop-apiserver-745c5cffd9-7q79q","pid":1,"time":"2022-04-07T19:51:34.419056367+00:00","target":"apiserver::telemetry","line":40,"file":"apiserver/src/telemetry.rs","otel.kind":"server","node_name":"ip-172-22-45-41.","http.target":"/bottlerocket-node-resource","http.route":"/bottlerocket-node-resource","http.flavor":"1.1","http.scheme":"http","http.method":"PUT","http.client_ip":"172.22.45.3:36288","http.user_agent":"","request_id":"50655bee-95ca-41cd-9f47-024ef944465e","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local"}
{"v":0,"name":"apiserver","msg":"[CHECK_REQUEST_AUTHORIZED - START]","level":30,"hostname":"brupop-apiserver-745c5cffd9-7q79q","pid":1,"time":"2022-04-07T19:51:34.419431744+00:00","target":"apiserver::auth::authorizor","line":49,"file":"apiserver/src/auth/authorizor.rs","otel.kind":"server","node_name":"ip-172-22-45-41","node_selector":"BottlerocketShadowSelector { node_name: "ip-172-22-45-41", node_uid: "85bee1f8-badd-48b6-98e5-d147a05050a7" }","http.target":"/bottlerocket-node-resource","http.route":"/bottlerocket-node-resource","http.flavor":"1.1","http.scheme":"http","http.method":"PUT","http.client_ip":"172.22.45.3:36288","http.user_agent":"","request_id":"50655bee-95ca-41cd-9f47-024ef944465e","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local"}
{"v":0,"name":"apiserver","msg":"[CHECK_REQUEST_AUTHORIZED - END]","level":30,"hostname":"brupop-apiserver-745c5cffd9-7q79q","pid":1,"time":"2022-04-07T19:51:34.427048024+00:00","target":"apiserver::auth::authorizor","line":49,"file":"apiserver/src/auth/authorizor.rs","otel.kind":"server","node_name":"ip-172-22-45-41","node_selector":"BottlerocketShadowSelector { node_name: "ip-172-22-45-41", node_uid: "85bee1f8-badd-48b6-98e5-d147a05050a7" }","http.target":"/bottlerocket-node-resource","http.route":"/bottlerocket-node-resource","elapsed_milliseconds":7,"http.flavor":"1.1","http.scheme":"http","http.method":"PUT","http.client_ip":"172.22.45.3:36288","http.user_agent":"","request_id":"50655bee-95ca-41cd-9f47-024ef944465e","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local"}
{"v":0,"name":"apiserver","msg":"[UPDATE_NODE_STATUS - START]","level":30,"hostname":"brupop-apiserver-745c5cffd9-7q79q","pid":1,"time":"2022-04-07T19:51:34.427206667+00:00","target":"models::node::client","line":207,"file":"models/src/node/client.rs","otel.kind":"server","node_name":"ip-172-22-45-41","selector":"BottlerocketShadowSelector { node_name: "ip-172-22-45-41", node_uid: "85bee1f8-badd-48b6-98e5-d147a05050a7" }","status":"BottlerocketShadowStatus { current_version: "1.6.2", target_version: "1.7.0", current_state: Idle }","http.target":"/bottlerocket-node-resource","http.route":"/bottlerocket-node-resource","http.flavor":"1.1","http.scheme":"http","http.method":"PUT","http.client_ip":"172.22.45.3:36288","http.user_agent":"","request_id":"50655bee-95ca-41cd-9f47-024ef944465e","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local"}
{"v":0,"name":"apiserver","msg":"[UPDATE_NODE_STATUS - END]","level":30,"hostname":"brupop-apiserver-745c5cffd9-7q79q","pid":1,"time":"2022-04-07T19:51:34.439024165+00:00","target":"models::node::client","line":207,"file":"models/src/node/client.rs","otel.kind":"server","node_name":"ip-172-22-45-41","selector":"BottlerocketShadowSelector { node_name: "ip-172-22-45-41", node_uid: "85bee1f8-badd-48b6-98e5-d147a05050a7" }","status":"BottlerocketShadowStatus { current_version: "1.6.2", target_version: "1.7.0", current_state: Idle }","http.target":"/bottlerocket-node-resource","http.route":"/bottlerocket-node-resource","elapsed_milliseconds":11,"http.flavor":"1.1","http.scheme":"http","http.method":"PUT","http.client_ip":"172.22.45.3:36288","http.user_agent":"","request_id":"50655bee-95ca-41cd-9f47-024ef944465e","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local"}
{"v":0,"name":"apiserver","msg":"[HTTP REQUEST - END]","level":30,"hostname":"brupop-apiserver-745c5cffd9-7q79q","pid":1,"time":"2022-04-07T19:51:34.439123918+00:00","target":"apiserver::telemetry","line":40,"file":"apiserver/src/telemetry.rs","otel.kind":"server","node_name":"ip-172-22-45-41","http.status_code":200,"otel.status_code":"OK","http.target":"/bottlerocket-node-resource","http.route":"/bottlerocket-node-resource","elapsed_milliseconds":19,"http.flavor":"1.1","http.scheme":"http","http.method":"PUT","http.client_ip":"172.22.45.3:36288","http.user_agent":"","request_id":"50655bee-95ca-41cd-9f47-024ef944465e","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local"}
{"v":0,"name":"apiserver","msg":"[HTTP REQUEST - START]","level":30,"hostname":"brupop-apiserver-745c5cffd9-7q79q","pid":1,"time":"2022-04-07T19:53:10.457604319+00:00","target":"apiserver::telemetry","line":40,"file":"apiserver/src/telemetry.rs","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","http.route":"/bottlerocket-node-resource","http.user_agent":"","request_id":"a3812c7c-1eeb-44bd-9dfb-603833a576c2","node_name":"ip-172-22-50-13","http.flavor":"1.1","http.scheme":"http","http.method":"PUT","http.client_ip":"172.22.48.120:42110","http.target":"/bottlerocket-node-resource","otel.kind":"server"}
{"v":0,"name":"apiserver","msg":"[CHECK_REQUEST_AUTHORIZED - START]","level":30,"hostname":"brupop-apiserver-745c5cffd9-7q79q","pid":1,"time":"2022-04-07T19:53:10.457716790+00:00","target":"apiserver::auth::authorizor","line":49,"file":"apiserver/src/auth/authorizor.rs","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","http.route":"/bottlerocket-node-resource","http.user_agent":"","request_id":"a3812c7c-1eeb-44bd-9dfb-603833a576c2","node_selector":"BottlerocketShadowSelector { node_name: "ip-172-22-50-13", node_uid: "49d7bbdd-7576-48e9-8bc6-657d1636543c" }","node_name":"ip-172-22-50-13","http.flavor":"1.1","http.scheme":"http","http.method":"PUT","http.client_ip":"172.22.48.120:42110","http.target":"/bottlerocket-node-resource","otel.kind":"server"}

@cbgbt
Contributor

cbgbt commented Apr 8, 2022

Thanks for the report and logs. Are you using StatefulSet deployments? We recently merged a PR to fix #168, which sounds like the same problem. We're planning to release an updated Brupop with the fix.

@pmacieje
Author

Yep, I use almost all kinds of objects on the cluster, StatefulSets as well.

@somnusfish
Contributor

Hi, release v0.2.1 includes the fix for StatefulSets. Could you try it and let us know if you still hit the same issue?

@somnusfish
Contributor

Hello pmacieje@, haven't heard from you for a long time. I would assume our new release solved the issue you mentioned here. I am going to close this issue. Please feel free to open a new issue if you have any questions or concerns.

@pmacieje
Author

pmacieje commented May 8, 2022

Hi somnusfish, it works better now and nothing hangs anymore. However, I saw no drain during the update process of my Bottlerocket nodes: only the new OS version being applied and an API reboot. Is that correct?

@somnusfish
Contributor

The drain won't show up in STATE or TARGET STATE if you check BottlerocketShadow using

kubectl get brs

The drain actually happens during the state change Idle -> StagedUpdate. You should see logs similar to the following from whichever apiserver pod actually handled the drain request:

kubectl logs brupop-apiserver-7d7fb58566-htclc --follow

...
{"v":0,"name":"apiserver","msg":"[CORDON_NODE - START]","level":30,"hostname":"brupop-apiserver-7d7fb58566-qsrj4","pid":1,"time":"2022-05-08T21:02:20.906750984+00:00","target":"models::node::client","line":265,"file":"models/src/node/client.rs","http.target":"/bottlerocket-node-resource/cordon-and-drain","node_name":"ip-192-168-47-149.us-west-2.compute.internal","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","http.user_agent":"","http.route":"/bottlerocket-node-resource/cordon-and-drain","http.flavor":"1.1","otel.kind":"server","request_id":"61cc250d-066c-4d63-969f-24a1d63237b4","selector":"BottlerocketShadowSelector { node_name: \"ip-192-168-47-149.us-west-2.compute.internal\", node_uid: \"8a949c0e-741d-4e6e-83d8-a234dc0e6ad9\" }","http.client_ip":"192.168.62.14:38524","http.method":"POST","http.scheme":"http"}
{"v":0,"name":"apiserver","msg":"[CORDON_NODE - END]","level":30,"hostname":"brupop-apiserver-7d7fb58566-qsrj4","pid":1,"time":"2022-05-08T21:02:20.918336671+00:00","target":"models::node::client","line":265,"file":"models/src/node/client.rs","http.target":"/bottlerocket-node-resource/cordon-and-drain","node_name":"ip-192-168-47-149.us-west-2.compute.internal","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","elapsed_milliseconds":11,"http.user_agent":"","http.route":"/bottlerocket-node-resource/cordon-and-drain","http.flavor":"1.1","otel.kind":"server","request_id":"61cc250d-066c-4d63-969f-24a1d63237b4","selector":"BottlerocketShadowSelector { node_name: \"ip-192-168-47-149.us-west-2.compute.internal\", node_uid: \"8a949c0e-741d-4e6e-83d8-a234dc0e6ad9\" }","http.client_ip":"192.168.62.14:38524","http.method":"POST","http.scheme":"http"}
{"v":0,"name":"apiserver","msg":"[DRAIN_NODE - START]","level":30,"hostname":"brupop-apiserver-7d7fb58566-qsrj4","pid":1,"time":"2022-05-08T21:02:20.919590517+00:00","target":"models::node::client","line":280,"file":"models/src/node/client.rs","http.target":"/bottlerocket-node-resource/cordon-and-drain","node_name":"ip-192-168-47-149.us-west-2.compute.internal","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","http.user_agent":"","http.route":"/bottlerocket-node-resource/cordon-and-drain","http.flavor":"1.1","otel.kind":"server","request_id":"61cc250d-066c-4d63-969f-24a1d63237b4","selector":"BottlerocketShadowSelector { node_name: \"ip-192-168-47-149.us-west-2.compute.internal\", node_uid: \"8a949c0e-741d-4e6e-83d8-a234dc0e6ad9\" }","http.client_ip":"192.168.62.14:38524","http.method":"POST","http.scheme":"http"}
{"v":0,"name":"apiserver","msg":"[DRAIN_NODE - START]","level":30,"hostname":"brupop-apiserver-7d7fb58566-qsrj4","pid":1,"time":"2022-05-08T21:02:20.919815245+00:00","target":"models::node::drain","line":80,"file":"models/src/node/drain.rs","http.target":"/bottlerocket-node-resource/cordon-and-drain","node_name":"ip-192-168-47-149.us-west-2.compute.internal","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","http.user_agent":"","http.route":"/bottlerocket-node-resource/cordon-and-drain","http.flavor":"1.1","otel.kind":"server","request_id":"61cc250d-066c-4d63-969f-24a1d63237b4","selector":"BottlerocketShadowSelector { node_name: \"ip-192-168-47-149.us-west-2.compute.internal\", node_uid: \"8a949c0e-741d-4e6e-83d8-a234dc0e6ad9\" }","http.client_ip":"192.168.62.14:38524","http.method":"POST","http.scheme":"http"}
{"v":0,"name":"apiserver","msg":"[FIND_TARGET_PODS - START]","level":30,"hostname":"brupop-apiserver-7d7fb58566-qsrj4","pid":1,"time":"2022-05-08T21:02:20.919883222+00:00","target":"models::node::drain","line":112,"file":"models/src/node/drain.rs","http.target":"/bottlerocket-node-resource/cordon-and-drain","node_name":"ip-192-168-47-149.us-west-2.compute.internal","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","http.user_agent":"","http.route":"/bottlerocket-node-resource/cordon-and-drain","http.flavor":"1.1","otel.kind":"server","request_id":"61cc250d-066c-4d63-969f-24a1d63237b4","selector":"BottlerocketShadowSelector { node_name: \"ip-192-168-47-149.us-west-2.compute.internal\", node_uid: \"8a949c0e-741d-4e6e-83d8-a234dc0e6ad9\" }","http.client_ip":"192.168.62.14:38524","http.method":"POST","http.scheme":"http"}
{"v":0,"name":"apiserver","msg":"[FIND_TARGET_PODS - END]","level":30,"hostname":"brupop-apiserver-7d7fb58566-qsrj4","pid":1,"time":"2022-05-08T21:02:20.929653441+00:00","target":"models::node::drain","line":112,"file":"models/src/node/drain.rs","http.target":"/bottlerocket-node-resource/cordon-and-drain","node_name":"ip-192-168-47-149.us-west-2.compute.internal","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","elapsed_milliseconds":9,"http.user_agent":"","http.route":"/bottlerocket-node-resource/cordon-and-drain","http.flavor":"1.1","otel.kind":"server","request_id":"61cc250d-066c-4d63-969f-24a1d63237b4","selector":"BottlerocketShadowSelector { node_name: \"ip-192-168-47-149.us-west-2.compute.internal\", node_uid: \"8a949c0e-741d-4e6e-83d8-a234dc0e6ad9\" }","http.client_ip":"192.168.62.14:38524","http.method":"POST","http.scheme":"http"}
{"v":0,"name":"apiserver","msg":"[DRAIN_NODE - EVENT] Not draining Pod 'brupop-agent-gq25c': Pod is member of a DaemonSet","level":30,"hostname":"brupop-apiserver-7d7fb58566-qsrj4","pid":1,"time":"2022-05-08T21:02:20.929879959+00:00","target":"models::node::drain","line":151,"file":"models/src/node/drain.rs","http.target":"/bottlerocket-node-resource/cordon-and-drain","node_name":"ip-192-168-47-149.us-west-2.compute.internal","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","http.user_agent":"","http.route":"/bottlerocket-node-resource/cordon-and-drain","http.flavor":"1.1","otel.kind":"server","request_id":"61cc250d-066c-4d63-969f-24a1d63237b4","selector":"BottlerocketShadowSelector { node_name: \"ip-192-168-47-149.us-west-2.compute.internal\", node_uid: \"8a949c0e-741d-4e6e-83d8-a234dc0e6ad9\" }","http.client_ip":"192.168.62.14:38524","http.method":"POST","http.scheme":"http"}
{"v":0,"name":"apiserver","msg":"[EVICT_POD - START]","level":30,"hostname":"brupop-apiserver-7d7fb58566-qsrj4","pid":1,"time":"2022-05-08T21:02:20.930435705+00:00","target":"models::node::drain","line":176,"file":"models/src/node/drain.rs","http.target":"/bottlerocket-node-resource/cordon-and-drain","node_name":"ip-192-168-47-149.us-west-2.compute.internal","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","http.user_agent":"","http.route":"/bottlerocket-node-resource/cordon-and-drain","http.flavor":"1.1","otel.kind":"server","request_id":"61cc250d-066c-4d63-969f-24a1d63237b4","selector":"BottlerocketShadowSelector { node_name: \"ip-192-168-47-149.us-west-2.compute.internal\", node_uid: \"8a949c0e-741d-4e6e-83d8-a234dc0e6ad9\" }","http.client_ip":"192.168.62.14:38524","http.method":"POST","http.scheme":"http"}
{"v":0,"name":"apiserver","msg":"[EVICT_POD - EVENT] Attempting to evict pod brupop-apiserver-7d7fb58566-qsrj4","level":30,"hostname":"brupop-apiserver-7d7fb58566-qsrj4","pid":1,"time":"2022-05-08T21:02:20.930522405+00:00","target":"models::node::drain","line":193,"file":"models/src/node/drain.rs","http.target":"/bottlerocket-node-resource/cordon-and-drain","node_name":"ip-192-168-47-149.us-west-2.compute.internal","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","http.user_agent":"","http.route":"/bottlerocket-node-resource/cordon-and-drain","http.flavor":"1.1","otel.kind":"server","request_id":"61cc250d-066c-4d63-969f-24a1d63237b4","selector":"BottlerocketShadowSelector { node_name: \"ip-192-168-47-149.us-west-2.compute.internal\", node_uid: \"8a949c0e-741d-4e6e-83d8-a234dc0e6ad9\" }","http.client_ip":"192.168.62.14:38524","http.method":"POST","http.scheme":"http"}
{"v":0,"name":"apiserver","msg":"[EVICT_POD - START]","level":30,"hostname":"brupop-apiserver-7d7fb58566-qsrj4","pid":1,"time":"2022-05-08T21:02:20.931230026+00:00","target":"models::node::drain","line":176,"file":"models/src/node/drain.rs","http.target":"/bottlerocket-node-resource/cordon-and-drain","node_name":"ip-192-168-47-149.us-west-2.compute.internal","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","http.user_agent":"","http.route":"/bottlerocket-node-resource/cordon-and-drain","http.flavor":"1.1","otel.kind":"server","request_id":"61cc250d-066c-4d63-969f-24a1d63237b4","selector":"BottlerocketShadowSelector { node_name: \"ip-192-168-47-149.us-west-2.compute.internal\", node_uid: \"8a949c0e-741d-4e6e-83d8-a234dc0e6ad9\" }","http.client_ip":"192.168.62.14:38524","http.method":"POST","http.scheme":"http"}
{"v":0,"name":"apiserver","msg":"[EVICT_POD - EVENT] Attempting to evict pod cert-manager-b4d6fd99b-gb6h2","level":30,"hostname":"brupop-apiserver-7d7fb58566-qsrj4","pid":1,"time":"2022-05-08T21:02:20.931288296+00:00","target":"models::node::drain","line":193,"file":"models/src/node/drain.rs","http.target":"/bottlerocket-node-resource/cordon-and-drain","node_name":"ip-192-168-47-149.us-west-2.compute.internal","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","http.user_agent":"","http.route":"/bottlerocket-node-resource/cordon-and-drain","http.flavor":"1.1","otel.kind":"server","request_id":"61cc250d-066c-4d63-969f-24a1d63237b4","selector":"BottlerocketShadowSelector { node_name: \"ip-192-168-47-149.us-west-2.compute.internal\", node_uid: \"8a949c0e-741d-4e6e-83d8-a234dc0e6ad9\" }","http.client_ip":"192.168.62.14:38524","http.method":"POST","http.scheme":"http"}
{"v":0,"name":"apiserver","msg":"[EVICT_POD - START]","level":30,"hostname":"brupop-apiserver-7d7fb58566-qsrj4","pid":1,"time":"2022-05-08T21:02:20.932070728+00:00","target":"models::node::drain","line":176,"file":"models/src/node/drain.rs","http.target":"/bottlerocket-node-resource/cordon-and-drain","node_name":"ip-192-168-47-149.us-west-2.compute.internal","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","http.user_agent":"","http.route":"/bottlerocket-node-resource/cordon-and-drain","http.flavor":"1.1","otel.kind":"server","request_id":"61cc250d-066c-4d63-969f-24a1d63237b4","selector":"BottlerocketShadowSelector { node_name: \"ip-192-168-47-149.us-west-2.compute.internal\", node_uid: \"8a949c0e-741d-4e6e-83d8-a234dc0e6ad9\" }","http.client_ip":"192.168.62.14:38524","http.method":"POST","http.scheme":"http"}
{"v":0,"name":"apiserver","msg":"[EVICT_POD - EVENT] Attempting to evict pod cert-manager-cainjector-74bfccdfdf-xzntw","level":30,"hostname":"brupop-apiserver-7d7fb58566-qsrj4","pid":1,"time":"2022-05-08T21:02:20.932111017+00:00","target":"models::node::drain","line":193,"file":"models/src/node/drain.rs","http.target":"/bottlerocket-node-resource/cordon-and-drain","node_name":"ip-192-168-47-149.us-west-2.compute.internal","http.host":"brupop-apiserver.brupop-bottlerocket-aws.svc.cluster.local","http.user_agent":"","http.route":"/bottlerocket-node-resource/cordon-and-drain","http.flavor":"1.1","otel.kind":"server","request_id":"61cc250d-066c-4d63-969f-24a1d63237b4","selector":"BottlerocketShadowSelector { node_name: \"ip-192-168-47-149.us-west-2.compute.internal\", node_uid: \"8a949c0e-741d-4e6e-83d8-a234dc0e6ad9\" }","http.client_ip":"192.168.62.14:38524","http.method":"POST","http.scheme":"http"}
...
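
Since these entries are one JSON object per line, a small script can make them easier to scan. The following is only a rough sketch, not part of Brupop: the function name `summarize_drain_events` is made up for illustration, and it relies only on the `msg`, `time` and `request_id` fields visible in the output above.

```python
import json

# Message prefixes seen in the apiserver output above.
DRAIN_PREFIXES = ("[CORDON_NODE", "[DRAIN_NODE", "[FIND_TARGET_PODS", "[EVICT_POD")


def summarize_drain_events(lines, request_id=None):
    """Return (time, msg) pairs for cordon/drain/evict log entries.

    `lines` is any iterable of raw log lines, for example sys.stdin when
    the output of `kubectl logs` is piped in. If `request_id` is given,
    only entries belonging to that request are kept.
    """
    events = []
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip "..." and any other non-JSON output
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed lines
        msg = entry.get("msg", "")
        if not msg.startswith(DRAIN_PREFIXES):
            continue
        if request_id is not None and entry.get("request_id") != request_id:
            continue
        events.append((entry.get("time"), msg))
    return events
```

Filtering on `request_id` (for example `61cc250d-066c-4d63-969f-24a1d63237b4` above) keeps one cordon-and-drain call together when several apiserver requests are interleaved.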

Since the apiserver can itself be restarted during the update, one possible reason for the missing logs is that you checked the logs after the apiserver had restarted.

The other possibility: since the Kubernetes API doesn't provide an implementation of drain, Brupop uses Pod deletion and eviction to remove all Pods from a given node. kubectl by default will not evict Pods that meet certain criteria, so by default we also ignore:

  • DaemonSet Pods - The DaemonSet controller will not respect node cordons, so we don't battle it.
  • Mirror Pods - These are static and cannot be controlled.

If the pods you are talking about belong to one of the above sets, it is expected that they are not drained.
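
To check which pods on a node fall into those two sets, something like the sketch below may help. It is not Brupop's actual (Rust) implementation. It assumes pods are plain dicts shaped like the items returned by `kubectl get pods -o json`, that DaemonSet membership shows up as an `ownerReferences` entry with kind `DaemonSet`, and that mirror pods carry the `kubernetes.io/config.mirror` annotation. The function names are hypothetical.

```python
# Annotation the kubelet puts on mirror pods (the API-side copy of a static pod).
MIRROR_ANNOTATION = "kubernetes.io/config.mirror"


def is_daemonset_pod(pod):
    """True if any owner reference of the pod is a DaemonSet."""
    owners = pod.get("metadata", {}).get("ownerReferences") or []
    return any(owner.get("kind") == "DaemonSet" for owner in owners)


def is_mirror_pod(pod):
    """True if the pod carries the mirror-pod annotation."""
    annotations = pod.get("metadata", {}).get("annotations") or {}
    return MIRROR_ANNOTATION in annotations


def pods_to_evict(pods):
    """Keep only the pods a drain would try to evict under the two rules above."""
    return [
        pod for pod in pods
        if not is_daemonset_pod(pod) and not is_mirror_pod(pod)
    ]
```

With the node from the logs above, `brupop-agent-gq25c` would be filtered out as a DaemonSet pod, while the cert-manager pods would remain in the eviction list.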

@somnusfish somnusfish reopened this May 8, 2022