sql: Implement SHOW CREATE for managed clusters #27801
This commit implements SHOW CREATE for managed clusters. It also gives us a blueprint for how to convert catalog objects back to SQL statements if we don't explicitly save the SQL statement. The process looks roughly like:
1. Convert the catalog object to a plan.
2. Convert the plan to a statement.
3. Pretty print the statement to a string.
This process is designed to exactly mirror the process of converting SQL to a catalog object.
Works towards resolving MaterializeInc#15435
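The three steps described in the commit message can be sketched with toy types. This is a minimal illustration only: `Cluster`, `CreateClusterPlan`, `CreateClusterStatement`, `cluster_to_plan`, `unplan`, and `show_create` below are hypothetical stand-ins, not the types or functions in this PR, and the option set is reduced to two fields.

```rust
use std::fmt;

/// Hypothetical stand-in for a managed cluster stored in the catalog.
#[derive(Debug, Clone, PartialEq)]
pub struct Cluster {
    pub name: String,
    pub size: String,
    pub replication_factor: u32,
}

/// Hypothetical stand-in for the plan produced from `CREATE CLUSTER`.
#[derive(Debug, Clone, PartialEq)]
pub struct CreateClusterPlan {
    pub name: String,
    pub size: String,
    pub replication_factor: u32,
}

/// Hypothetical stand-in for the AST statement: a name plus option pairs.
#[derive(Debug, Clone, PartialEq)]
pub struct CreateClusterStatement {
    pub name: String,
    pub options: Vec<(String, String)>,
}

/// Step 1: convert the catalog object to a plan.
pub fn cluster_to_plan(cluster: &Cluster) -> CreateClusterPlan {
    CreateClusterPlan {
        name: cluster.name.clone(),
        size: cluster.size.clone(),
        replication_factor: cluster.replication_factor,
    }
}

/// Step 2: convert the plan to a statement.
pub fn unplan(plan: &CreateClusterPlan) -> CreateClusterStatement {
    CreateClusterStatement {
        name: plan.name.clone(),
        options: vec![
            ("SIZE".to_string(), format!("'{}'", plan.size)),
            (
                "REPLICATION FACTOR".to_string(),
                plan.replication_factor.to_string(),
            ),
        ],
    }
}

/// Step 3: pretty print the statement to a string.
impl fmt::Display for CreateClusterStatement {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let opts: Vec<String> = self
            .options
            .iter()
            .map(|(k, v)| format!("{k} = {v}"))
            .collect();
        write!(f, "CREATE CLUSTER {} ({})", self.name, opts.join(", "))
    }
}

/// Chains the three steps, mirroring SQL -> statement -> plan -> object in reverse.
pub fn show_create(cluster: &Cluster) -> String {
    unplan(&cluster_to_plan(cluster)).to_string()
}
```

Keeping each step as its own function is what lets the reverse direction line up one-to-one with parsing, planning, and sequencing.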
Mitigations: Completing required mitigations increases Resilience Coverage.
Risk Summary: The pull request presents a high risk with a score of 80, indicating a significant chance of introducing a bug. This assessment is driven by predictors like the sum of bug reports of files affected and the change in executable lines of code. Historically, pull requests with these characteristics have been 109% more likely to cause a bug compared to the repository's baseline. Additionally, there are two files modified in this pull request that have seen a recent increase in bug fixes. While the repository's observed bug trend remains steady, the predicted trend for bugs is on the rise.
Note: The risk score is not based on semantic analysis but on historical predictors of bug occurrence in the repository. The attributes above were deemed the strongest predictors based on that history. Predictors and the score may change as the PR evolves in code, time, and review activity.
Sweet! So glad to finally have this knocked out. LGTM modulo testing.
One tricky part will be to ensure that both directions of transformations (i.e. `Statement` -> `Plan` -> `Object` and `Object` -> `Plan` -> `Statement`) remain in sync going forward.
Worth getting @MaterializeInc/testing to weigh in here. The best answer in a situation like this is randomized automated testing. I wonder if we could rig up testdrive or SLT to automatically sniff out `CREATE CLUSTER` commands and run `SHOW CREATE CLUSTER` against them immediately after their creation and verify that the results match, or something like that. Would protect against new options getting added in the future that don't unplan correctly.
LGTM from a SQL council perspective too.
Thanks for writing the docs, @jkosh44! +1 from me, as a SQL Council member.
@def-, what do you think?
I'll work on it.
We should teach CREATE/ALTER CLUSTER sequencing to do a round trip through SHOW always and panic (dev) or return an error to the user and abort the catalog txn (prod) if it fails. Then we don't have to modify testing at all and we have certainty that we've never committed catalog state that we can't print correctly.
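A rough sketch of the dev-versus-prod behavior proposed above, assuming a hypothetical `Mode` enum and `enforce_roundtrip` helper (neither is part of this PR). It only models the decision; in a real system the `Err` would have to be returned before the catalog transaction commits.

```rust
/// Hypothetical switch between development and production behavior.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Mode {
    Dev,
    Prod,
}

/// Takes the outcome of a roundtrip check and decides how to surface a failure:
/// panic loudly in development, return an error to the user in production.
pub fn enforce_roundtrip(check: Result<(), String>, mode: Mode) -> Result<(), String> {
    match (check, mode) {
        (Ok(()), _) => Ok(()),
        (Err(e), Mode::Dev) => panic!("SHOW CREATE roundtrip failed: {e}"),
        (Err(e), Mode::Prod) => Err(format!("internal error: cannot print cluster: {e}")),
    }
}
```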
Seems reasonable. Easy to do for |
The method doesn't need to compare the printed strings. It parses the printed string and asserts that the internal data structures are the same.
Ah, I see, you'd actually run the printed string through the parser and planner (and sequencer??) and make sure that you get the same cluster object out. Makes sense.
Could we generalize that for every query we serialize and verify that it roundtrips successfully? Would save us a lot of effort on manually thinking of test cases or hoping that we have all the relevant grammar for randomized testing. And most importantly this shows the user a nice error immediately when running the query instead of having problems the next time envd happens to restart.
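To illustrate the idea of comparing data structures rather than printed strings, here is a small self-contained sketch. `ToyPlan`, `print_plan`, `parse_plan`, and `check_roundtrip` are hypothetical names, and the one-line grammar is made up for the example; it is not the Materialize parser or planner.

```rust
use std::fmt::Debug;

/// Hypothetical, heavily simplified plan.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyPlan {
    pub name: String,
    pub replicas: u32,
}

/// Prints the plan in a made-up one-line syntax.
pub fn print_plan(p: &ToyPlan) -> String {
    format!("CREATE CLUSTER {} REPLICAS {}", p.name, p.replicas)
}

/// Parses the made-up syntax back into a plan.
pub fn parse_plan(sql: &str) -> Result<ToyPlan, String> {
    let parts: Vec<&str> = sql.split_whitespace().collect();
    match parts.as_slice() {
        ["CREATE", "CLUSTER", name, "REPLICAS", n] => Ok(ToyPlan {
            name: name.to_string(),
            replicas: n.parse::<u32>().map_err(|e| e.to_string())?,
        }),
        _ => Err(format!("unrecognized statement: {sql}")),
    }
}

/// Prints the plan, parses the result, and compares the structures (not the strings).
pub fn check_roundtrip<P, F, G>(plan: &P, print: F, parse: G) -> Result<(), String>
where
    P: PartialEq + Debug,
    F: Fn(&P) -> String,
    G: Fn(&str) -> Result<P, String>,
{
    let sql = print(plan);
    let replan = parse(&sql)?;
    if *plan != replan {
        return Err(format!(
            "replan does not match: plan={plan:?}, replan={replan:?}"
        ));
    }
    Ok(())
}
```

Because the comparison is on the reparsed structure, cosmetic changes to the printer (whitespace, option order) would not cause false failures, while a printer that drops or alters an option is caught.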
We can't do exactly this, because we only have the actual |
I did this in planning because it was much more straightforward.
// Roundtrip through unplan and make sure that we end up with the same plan.
if let CreateClusterVariant::Managed(_) = &plan.variant {
    let stmt = unplan_create_cluster(scx, plan.clone())
        .map_err(|e| PlanError::Replan(e.to_string()))?;
    let create_sql = stmt.to_ast_string_stable();
    let stmt = parse::parse(&create_sql)
        .map_err(|e| PlanError::Replan(e.to_string()))?
        .into_element()
        .ast;
    let (stmt, _resolved_ids) =
        names::resolve(scx.catalog, stmt).map_err(|e| PlanError::Replan(e.to_string()))?;
    let stmt = match stmt {
        Statement::CreateCluster(stmt) => stmt,
        stmt => {
            return Err(PlanError::Replan(format!(
                "replan does not match: plan={plan:?}, create_sql={create_sql:?}, stmt={stmt:?}"
            )))
        }
    };
    let replan =
        plan_create_cluster_inner(scx, stmt).map_err(|e| PlanError::Replan(e.to_string()))?;
    if plan != replan {
        return Err(PlanError::Replan(format!(
            "replan does not match: plan={plan:?}, replan={replan:?}"
        )));
    }
}
Once we start to have more `unplan`s, this could be generalized to look something like this:
// Sketch only: `scx` is threaded through because the body below uses `scx.catalog`.
fn plan(scx: &StatementContext, stmt: Statement<Aug>) -> Result<Plan, PlanError> {
    let plan = plan_inner(scx, stmt)?;
    // `unplan` returns `Ok(None)` for plans that cannot be unplanned.
    if let Some(stmt) = unplan(scx, plan.clone())? {
        let create_sql = stmt.to_ast_string_stable();
        let stmt = parse::parse(&create_sql)
            .map_err(|e| PlanError::Replan(e.to_string()))?
            .into_element()
            .ast;
        let (stmt, _resolved_ids) =
            names::resolve(scx.catalog, stmt).map_err(|e| PlanError::Replan(e.to_string()))?;
        let replan = plan_inner(scx, stmt).map_err(|e| PlanError::Replan(e.to_string()))?;
        if plan != replan {
            return Err(PlanError::Replan(format!(
                "replan does not match: plan={plan:?}, replan={replan:?}"
            )));
        }
    }
    Ok(plan)
}

fn plan_inner(scx: &StatementContext, stmt: Statement<Aug>) -> Result<Plan, PlanError> {
    ...
}

fn unplan(scx: &StatementContext, plan: Plan) -> Result<Option<Statement<Aug>>, PlanError> {
    match plan {
        Plan::CreateClusterPlan(plan) => Ok(Some(unplan_create_cluster(scx, plan)?)),
        ...
        // These plans cannot be unplanned.
        _ => Ok(None),
    }
}
Nice! I took a quick look and it looks solid, but deferring (re-)approval to someone with more bandwidth.
This uncovered an issue. You cannot specify the |
Bootstrap planning uses the "turn on all the flags" mode. Should we use that here too?
    vec![value]
);
for value in values {
    if let Some(value) = value.try_into_value(catalog) {
I don't quite understand what's going on here but: why is it ok to ignore values that fail `try_into_value` and not error or something?
I don't blame you, this macro was not easy for me to figure out and update. I'll try and explain it here and in the process maybe I'll come up with a good comment to add that also explains it.
Before my PR this macro allowed you to convert a list of options into some struct that had a field for each option. So for example when processing `CREATE CLUSTER` options, you would start with a `Vec<ClusterOption<Aug>>`.
/// An option in a `CREATE CLUSTER` statement.
pub struct ClusterOption<T: AstInfo> {
    pub name: ClusterOptionName,
    pub value: Option<WithOptionValue<T>>,
}
You would convert that into a struct that looks like the following (the struct is generated by the macro, so there's nowhere in the code to look for this exact struct):
struct ClusterOptionExtracted {
    availability_zones: Option<Vec<String>>,
    disk: Option<bool>,
    managed: Option<bool>,
    ...
}
The `Vec` only contains options that were explicitly specified. So a field in `ClusterOptionExtracted` would be `Some(_)` if the option was specified and `None` if it was not.
Now this commit adds the reverse process: it takes a `ClusterOptionExtracted` and converts it back into a `Vec<ClusterOption<Aug>>`. So if the field is `Some(_)` then we need to generate a `ClusterOption<Aug>` and add it to the `Vec`. However, if the field is `None`, then there was no option specified and we have nothing to add back to the list. That's why we ignore the `None` values here: they are not errors, they just don't exist as specified options.
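A hand-written sketch of what the explanation above describes, for readers who want to see both directions side by side. All of the types and the `extract`/`unextract` functions here are simplified, hypothetical stand-ins; the real struct is generated by the macro and the real option values are `WithOptionValue<T>`.

```rust
/// Hypothetical, reduced set of option names.
#[derive(Debug, Clone, PartialEq)]
pub enum OptionName {
    Disk,
    Managed,
    Size,
}

/// Hypothetical, reduced set of option values.
#[derive(Debug, Clone, PartialEq)]
pub enum OptionValue {
    Bool(bool),
    Text(String),
}

/// One explicitly specified option, as it would appear in the statement.
#[derive(Debug, Clone, PartialEq)]
pub struct ClusterOption {
    pub name: OptionName,
    pub value: OptionValue,
}

/// One field per option; `None` means the option was not specified.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Extracted {
    pub disk: Option<bool>,
    pub managed: Option<bool>,
    pub size: Option<String>,
}

/// Forward direction: list of specified options -> struct of optional fields.
pub fn extract(options: Vec<ClusterOption>) -> Result<Extracted, String> {
    let mut out = Extracted::default();
    for opt in options {
        match (opt.name, opt.value) {
            (OptionName::Disk, OptionValue::Bool(b)) => out.disk = Some(b),
            (OptionName::Managed, OptionValue::Bool(b)) => out.managed = Some(b),
            (OptionName::Size, OptionValue::Text(s)) => out.size = Some(s),
            (name, value) => {
                return Err(format!("invalid value {value:?} for option {name:?}"))
            }
        }
    }
    Ok(out)
}

/// Reverse direction: only `Some(_)` fields become options again.
/// `None` fields are skipped because they were never specified; that is not an error.
pub fn unextract(extracted: Extracted) -> Vec<ClusterOption> {
    let mut options = Vec::new();
    if let Some(b) = extracted.disk {
        options.push(ClusterOption { name: OptionName::Disk, value: OptionValue::Bool(b) });
    }
    if let Some(b) = extracted.managed {
        options.push(ClusterOption { name: OptionName::Managed, value: OptionValue::Bool(b) });
    }
    if let Some(s) = extracted.size {
        options.push(ClusterOption { name: OptionName::Size, value: OptionValue::Text(s) });
    }
    options
}
```

Note that in this sketch `unextract` emits options in field order, so the resulting list matches the input list only when the input was written in that order; the extracted struct is what round-trips exactly.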
};
let replan =
    plan_create_cluster_inner(scx, stmt).map_err(|e| PlanError::Replan(e.to_string()))?;
if plan != replan {
This should also be in `ALTER CLUSTER`.
What comparison should we do for `ALTER CLUSTER`? We don't have a `CreateClusterPlan` to compare to.
For that I think we'd have to move the check into sequencing and do the full roundtrip of `Cluster -> CreateClusterPlan -> CreateClusterStatement -> String -> CreateClusterStatement -> CreateClusterPlan -> Cluster`.
Hmm yeah. ALTER has an updated catalog so it could produce a sql string from that to plan. But we need to compare it to the `Cluster` object in the catalog, which is the sequenced plan, not the plan? I'm ok merging as is, but I do very much think we should complete the thought because `ALTER` seems like a likely place where we make a small mistake and either corrupt or have silent mutation to some object.
It sounds like this commit doesn't add any new hazards to `ALTER CLUSTER`, but this feature could be used as a tool to protect against existing or future hazards in `ALTER CLUSTER`? If I'm understanding that correctly, then I'm inclined to merge as is and let someone else add that to `ALTER CLUSTER` as a follow up.
I don't think that would quite work. This would cause the re-planning to succeed, but generate invalid SQL. For example we'd end up with something like the following:
It is true that cluster
Here, we've omitted the I did come up with an idea last night. We update the feature flag requirement so that it only fires when the user specifies the |
I should add a test that unmanaged clusters return an error.
This commit implements SHOW CREATE for managed clusters. It also gives us a blueprint for how to convert catalog objects back to SQL statements if we don't explicitly save the SQL statement. The process looks roughly like:
1. Convert the catalog object to a plan.
2. Convert the plan to a statement.
3. Pretty print the statement to a string.
This process is designed to exactly mirror the process of converting SQL to a catalog object.
Works towards resolving #15435
Motivation
This PR adds a known-desirable feature.
Checklist
`$T ⇔ Proto$T` mapping (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label.
`SHOW CREATE` for managed clusters.