operator: Add the option to immediately decommission ghosts #13298

Merged 2 commits on Sep 8, 2023

@joejulian joejulian commented Sep 6, 2023

When a broker's storage is deleted, it comes back with a new broker id. There's no indication as to why the data is gone or whether it can be recovered. Since the Cluster resource is deprecated and the cloud team would just like to have a method to delete it regardless, the consensus is that we should assume it's not recoverable and decommission the old broker id immediately.

This PR adds a hidden flag, `unsafe-decommission-failed-brokers`, that enables this behavior when set to true.

This is dangerous and shouldn't be used. There are circumstances in which brokers with valid data can be decommissioned and even the potential for all the brokers to be decommissioned.
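
To illustrate the idea, here is a minimal sketch of how ghost broker IDs might be identified, assuming the controller already has the list of broker IDs registered in the cluster and the broker ID currently held by each pod. The names `ghostBrokers` and `safeToDecommission`, the pod names, and the guard itself are hypothetical and are not taken from this PR. The guard only illustrates the risk described above, where every broker could end up decommissioned.

```go
package main

import (
	"fmt"
	"sort"
)

// ghostBrokers returns the IDs that the cluster still lists as members but
// that no pod currently claims. Both inputs are hypothetical: registered is
// the set of broker IDs known to the cluster, podBrokerIDs maps a pod name
// to the broker ID that pod is running with.
func ghostBrokers(registered []int, podBrokerIDs map[string]int) []int {
	live := make(map[int]struct{}, len(podBrokerIDs))
	for _, id := range podBrokerIDs {
		live[id] = struct{}{}
	}
	var ghosts []int
	for _, id := range registered {
		if _, ok := live[id]; !ok {
			ghosts = append(ghosts, id)
		}
	}
	sort.Ints(ghosts)
	return ghosts
}

// safeToDecommission is an assumed guard, not part of this PR: it refuses to
// proceed when every registered broker looks like a ghost, for example when
// no pods were reported at all.
func safeToDecommission(registered, ghosts []int) bool {
	return len(ghosts) > 0 && len(ghosts) < len(registered)
}

func main() {
	// Pod rp-2 lost its storage and came back with the new ID 3,
	// so the old ID 2 is left behind as a ghost.
	registered := []int{0, 1, 2, 3}
	pods := map[string]int{"rp-0": 0, "rp-1": 1, "rp-2": 3}

	ghosts := ghostBrokers(registered, pods)
	fmt.Println(ghosts, safeToDecommission(registered, ghosts)) // prints "[2] true"
}
```

Without a guard of this kind, an empty or stale pod list would make every registered broker look like a ghost.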

Fixes #13132

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.2.x
  • v23.1.x
  • v22.3.x

Release Notes

  • none

@joejulian joejulian force-pushed the boo_a_ghost branch 4 times, most recently from fabca06 to 51ff50c Compare September 6, 2023 22:56
@joejulian joejulian changed the title from "Add the option to immediately decommission ghosts" to "operator: Add the option to immediately decommission ghosts" Sep 6, 2023
alejandroEsc
alejandroEsc previously approved these changes Sep 7, 2023

@alejandroEsc alejandroEsc left a comment

A few questions and one suggestion, nothing blocking, leaving it up to you.

Two review threads on src/go/k8s/controllers/redpanda/cluster_controller.go (outdated, resolved)
alejandroEsc
alejandroEsc previously approved these changes Sep 7, 2023

@alejandroEsc alejandroEsc left a comment

lgtm, thanks!

Comment on lines +136 to +137
flag.BoolVar(&ghostbuster, "unsafe-decommission-failed-brokers", false, "Set to enable decommissioning a failed broker that is configured but does not exist in the StatefulSet (ghost broker). This may result in invalidating valid data")
_ = flag.CommandLine.MarkHidden("unsafe-decommission-failed-brokers")

Can this property be moved to the soon-to-be-deprecated Cluster custom resource? I don't see the value in having to deprecate flags along with the Cluster custom resource.

@joejulian joejulian merged commit f18303a into redpanda-data:dev Sep 8, 2023
21 checks passed
@vbotbuildovich commented:

/backport v23.2.x

RafalKorepta added a commit to redpanda-data/redpanda-operator that referenced this pull request Jun 19, 2024
Calling decommission when a Pod annotation changes might not be
possible if the Pod was removed along with the annotation where its
previous Redpanda ID was stored. There is a dedicated function to handle
ghost brokers.

Reference

redpanda-data/redpanda#9750

redpanda-data/redpanda#13298
redpanda-data/redpanda#13132

redpanda-data/helm-charts#253
redpanda-data/redpanda#12847
RafalKorepta added commits with the same message to redpanda-data/redpanda-operator that referenced this pull request on Jun 21, 2024, Jun 28, 2024, and twice on Jul 2, 2024.
Development

Successfully merging this pull request may close these issues.

Decommission ghosted brokers using the Cluster controller
4 participants