cleanup brs when brupop is removed from the node. #235

Merged (1 commit) on Sep 22, 2022

gthao313
Member

@gthao313 gthao313 commented Aug 11, 2022

Issue number:
#107

Description of changes:

Author: Tianhao Geng <tianhg@amazon.com>
Date:   Thu Aug 11 19:32:12 2022 +0000

    cleanup brs when the brupop is removed from the node

    Currently we require customers to manually clean up the Custom Resources
    if they choose to delete only the updater-interface-version label. This
    change automatically cleans up the BRS when a customer removes the
    brupop label from the corresponding node.

commit b81eefc195f4669510146793acf64b3c5bc959e9
Author: Tianhao Geng <tianhg@amazon.com>
Date:   Thu Aug 11 18:36:49 2022 +0000

    Add deletion support to brs client

    In line with the BRS cleanup strategy, allow the BRS client to delete
    BRS objects.

Testing done:
Method: Run the brupop integration test, remove the brupop label from the nodes, and then confirm that the associated BRS has been removed.

Tested three scenarios:

  • [Waiting-for-update] The node is waiting for an update, and the customer removes the label.
  • [In-progress] The customer removes the label while brupop is performing an update on the node.
  • [Complete] The node has been updated, and the customer removes the label.
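
The behaviour these scenarios exercise can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `SimpleNode` is a simplified stand-in for the real `Node` type, and the label key and the `brs-` name prefix are assumptions made for the example. `find_unlabeled_nodes` and `brs_name_from_node_name` share names with helpers that appear in the diff, but the bodies here are hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Assumed label key that opts a node in to brupop (illustrative only).
const UPDATER_LABEL: &str = "bottlerocket.aws/updater-interface-version";

/// Simplified stand-in for a Kubernetes node: a name plus its labels.
struct SimpleNode {
    name: String,
    labels: BTreeMap<String, String>,
}

/// Hypothetical naming rule that maps a node name to its BRS name.
fn brs_name_from_node_name(node_name: &str) -> String {
    format!("brs-{}", node_name)
}

/// Returns the nodes that no longer carry the updater label.
fn find_unlabeled_nodes(nodes: &[SimpleNode]) -> Vec<&SimpleNode> {
    nodes
        .iter()
        .filter(|node| !node.labels.contains_key(UPDATER_LABEL))
        .collect()
}

/// Returns the names of existing BRS objects whose node has lost the label.
fn brs_to_clean_up(nodes: &[SimpleNode], existing_brs: &BTreeSet<String>) -> Vec<String> {
    find_unlabeled_nodes(nodes)
        .into_iter()
        .map(|node| brs_name_from_node_name(&node.name))
        .filter(|brs_name| existing_brs.contains(brs_name))
        .collect()
}
```

In this sketch the node's update state (waiting, in progress, complete) plays no part in the decision: only the presence of the label and the existence of a matching BRS matter.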

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

@gthao313 gthao313 force-pushed the brs-cleanup branch 2 times, most recently from 2cb2f08 to 00e817a on August 12, 2022 18:21
@gthao313 gthao313 marked this pull request as ready for review August 12, 2022 18:22
@gthao313
Member Author

The push above removes the changes to the apiserver. Since the controller has the privilege to delete cluster resources, we don't need to add deletion to the apiserver.

controller/src/lib.rs (outdated)
},
_ = node_drainer => {
event!(Level::ERROR, "node reflector drained");
return Err(controller_error::Error::KubernetesWatcherFailed {object: "node".to_string()});
Contributor

Is there a reason to return Err here? The Result<()> in main is not actually consumed by any other handler. If this is for logging purposes, would it make sense to log in the event! call rather than return Err?

@@ -2,12 +2,16 @@ use super::{
metrics::{BrupopControllerMetrics, BrupopHostsData},
statemachine::determine_next_node_spec,
};
use k8s_openapi::api::core::v1::Node;
Contributor

nit: Would you mind sorting the external imports in alphabetical order?

.map(|arc_brs| (**arc_brs).clone())
.collect()
}

/// Returns the set of BottlerocketShadow objects which is currently being acted upon.
///
/// Nodes are being acted upon if they are not in the `WaitingForUpdate` state, or if their desired state does
/// not match their current state.
#[instrument(skip(self))]
fn active_node_set(&self) -> BTreeMap<String, BottlerocketShadow> {
Contributor

Maybe active_brs_set would be clearer?

@@ -111,7 +133,7 @@ impl<T: BottlerocketShadowClient> BrupopController<T> {
/// set during the next iteration of the controller's event loop.
#[instrument(skip(self))]
async fn find_and_update_ready_node(&self) -> Result<Option<BottlerocketShadow>> {
Contributor

Same here, maybe find_and_update_ready_brs?

@@ -205,6 +264,11 @@ impl<T: BottlerocketShadowClient> BrupopController<T> {
}
}

// Cleanup BRS when the operator is removed from a node
Contributor

I was wondering whether it would be better to add more explanation here, to make clear that even if the steps above moved an invalid BRS to active, the cleanup here would catch it later.

controller/src/controller.rs (outdated)

for unlabeled_node in unlabeled_nodes.drain(..) {
let associated_bottlerocketshadow = brs_name_from_node_name(&unlabeled_node.name());
if brss
Contributor

I don't see the BottlerocketShadow object itself being used when deleting the shadow. Would it make sense to collect the BottlerocketShadow names into a HashSet, to avoid iterating over all of the brss for each unlabeled_node?
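
A rough sketch of the HashSet idea in this comment, using plain strings in place of the real `Node` and `BottlerocketShadow` types. The function `brs_names_to_delete` and the `brs-` prefix inside `brs_name_from_node_name` are hypothetical and only serve the illustration; the point is that the set is built once, so each unlabeled node costs one average constant-time lookup instead of a scan over every BRS.

```rust
use std::collections::HashSet;

/// Hypothetical naming rule that maps a node name to its BRS name.
fn brs_name_from_node_name(node_name: &str) -> String {
    format!("brs-{}", node_name)
}

/// Returns the BRS names to delete for the given unlabeled nodes.
///
/// The set of existing BRS names is built once up front, so the loop over
/// unlabeled nodes does a hash lookup per node rather than a linear scan.
fn brs_names_to_delete(unlabeled_node_names: &[String], brs_names: &[String]) -> Vec<String> {
    let existing: HashSet<&str> = brs_names.iter().map(String::as_str).collect();

    unlabeled_node_names
        .iter()
        .map(|node_name| brs_name_from_node_name(node_name))
        .filter(|brs_name| existing.contains(brs_name.as_str()))
        .collect()
}
```

With N unlabeled nodes and M shadows, this is roughly O(N + M) work, compared with O(N × M) for a nested scan.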

) -> Result<()> {
let mut unlabeled_nodes = find_unlabeled_nodes(nodes);

for unlabeled_node in unlabeled_nodes.drain(..) {
Contributor

I am not sure I understand correctly: drain() actually takes ownership of the elements of unlabeled_nodes, but in this use case we only need a reference for unlabeled_node.name(). Would it make sense to iterate over unlabeled_nodes by reference? Or is there a reason to use drain here?
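
To illustrate the difference this comment points at, here is a small sketch with plain `String` node names; both function names and the `brs-` prefix are hypothetical. `Vec::drain(..)` needs a mutable borrow and moves every element out, leaving the vector empty, while iterating by reference leaves it untouched.

```rust
/// Builds BRS names while only borrowing the node names.
/// The caller's vector is unchanged afterwards.
fn brs_names_by_reference(unlabeled_nodes: &[String]) -> Vec<String> {
    unlabeled_nodes
        .iter()
        .map(|node_name| format!("brs-{}", node_name))
        .collect()
}

/// Builds BRS names by draining the vector.
/// Each element is moved out, so the caller's vector is empty afterwards.
fn brs_names_by_drain(unlabeled_nodes: &mut Vec<String>) -> Vec<String> {
    unlabeled_nodes
        .drain(..)
        .map(|node_name| format!("brs-{}", node_name))
        .collect()
}
```

Both produce the same names; the borrowing version is enough when the loop body only reads each node's name.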

Currently we require users to manually clean up the Custom Resources
if they choose to delete only the updater-interface-version label. This
change automatically cleans up the BRS when a user removes the
brupop label from the corresponding node.
@gthao313 gthao313 requested a review from jpmcb September 21, 2022 17:13
self.brs_reader
.state()
.iter()
.map(|arc_brs| (**arc_brs).clone())
.collect()
}

/// Returns a list of all bottlerocket nodes in the cluster.

Nit: s/bottlerocket/Bottlerocket

&DeleteParams::default(),
)
.await
.context(controllerclient_error::DeleteNode)?;

Do you need to log an event after the node is deleted?

@gthao313 gthao313 merged commit 744dd8f into bottlerocket-os:develop Sep 22, 2022
3 participants