
use sequence_staged for alter cluster ddl #27797

Merged · 1 commit · Jun 24, 2024

Conversation

@jubrad (Contributor) commented Jun 21, 2024

Motivation

This refactor should have no functional change, but it prepares us to implement graceful reconfiguration using sequence_staged.

Graceful reconfiguration will require sleeps or long-running checks inside the ALTER CLUSTER DDL. We want these waits to be cancelable and moved off the main coordinator thread, which is exactly what sequence_staged is designed to handle.


@jubrad requested a review from a team as a code owner (June 21, 2024 15:02)
@jubrad requested a review from ParkMyCar (June 21, 2024 15:02)
@jubrad marked this pull request as draft (June 21, 2024 15:02)
@jubrad force-pushed the cluster-sequence-staged branch 4 times, most recently from 13d0e8c to c39227a (June 21, 2024 17:50)
@jubrad marked this pull request as ready for review (June 21, 2024 19:57)
let validity = PlanValidity {
transient_revision: self.catalog().transient_revision(),
dependency_ids: BTreeSet::new(),
cluster_id: None,
Member commented:

We should add the cluster_id from the plan here. That way, if the cluster is concurrently altered in some other way, we bail.
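The suggestion above can be sketched as follows. This is a minimal, self-contained illustration: the struct is a hypothetical stand-in mirroring the snippet's field names (the real PlanValidity lives in Materialize's adapter crate, and the id types are simplified to integers).

```rust
use std::collections::BTreeSet;

// Hypothetical stand-in for the coordinator's PlanValidity;
// field names follow the snippet above, id types are simplified.
#[derive(Debug)]
struct PlanValidity {
    transient_revision: u64,
    dependency_ids: BTreeSet<u64>,
    cluster_id: Option<u64>,
}

fn validity_for(plan_cluster_id: u64, transient_revision: u64) -> PlanValidity {
    PlanValidity {
        transient_revision,
        dependency_ids: BTreeSet::new(),
        // Record the cluster being altered so that a concurrent
        // alteration of the same cluster invalidates this staged plan.
        cluster_id: Some(plan_cluster_id),
    }
}

fn main() {
    let v = validity_for(42, 7);
    assert_eq!(v.cluster_id, Some(42));
}
```

Populating cluster_id is what lets the staged-execution machinery detect that the cluster changed underneath the plan between stages.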

Comment on lines +216 to +234
self.sequence_alter_cluster_managed_to_managed(
Some(session),
cluster_id,
config,
new_config,
ReplicaCreateDropReason::Manual,
)
.await?;
}
(Unmanaged, Managed(new_config)) => {
self.sequence_alter_cluster_unmanaged_to_managed(
session, cluster_id, new_config, options,
)
.await?;
}
(Managed(_), Unmanaged) => {
self.sequence_alter_cluster_managed_to_unmanaged(session, cluster_id)
.await?;
}
Member commented:

Right now these all still happen on the main coordinator thread. We could easily move them off-thread now by spawning them in a tokio task and returning a handle to that task. Is that desirable?

Contributor (author) replied:
Right now, I think no?

I believe we only want to do that for alter_managed_to_managed when performing a "graceful" reconfiguration, although we could potentially kick all of these off on a thread and return with StageResult::Handle or HandleRetire.

Member replied:

If these do any long-running async work, like reaching out to Kubernetes and waiting for it to create resources, then it might be nice to move them off the coord thread. Definitely not blocking for this PR, though, since we're maintaining the status quo here.

Contributor replied:

That requires refactoring the orchestrator/compute controller, which few people own or understand. Definitely the right call to merge this, which puts us in a position to incrementally improve it later on.

@ParkMyCar (Member) left a review:

Sweet!

@jubrad merged commit 0c65d7e into MaterializeInc:main on Jun 24, 2024 · 76 checks passed
3 participants