-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
catalog,tabledesc: add & adopt Constraint subtypes #92289
Conversation
3e44552
to
6d6ba24
Compare
This is ready for review. It's a big one, which I apologize for. It's hard to break down this change into smaller parts. I'll add annotations to help the reviewing process. I advise reviewers to take a look at the interface definitions in |
for k := range info { | ||
ckBuilder.MarkNameInUse(k) | ||
for _, c := range n.tableDesc.AllConstraints() { | ||
ckBuilder.MarkNameInUse(c.GetName()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of these instances where we're more careful to not re-use a name. Things can get quite confusing otherwise when rolling back a schema change involving a name collision.
"constraint %q in the middle of being added, try again later", t.Constraint) | ||
case descpb.ConstraintValidity_Dropping: | ||
return pgerror.Newf(pgcode.ObjectNotInPrerequisiteState, | ||
"constraint %q in the middle of being dropped", t.Constraint) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of these instances where the adding and dropping states are handled more uniformly, to no adverse effect from what I can tell.
details, ok := info[string(t.Constraint)] | ||
if !ok { | ||
constraint, _ := n.tableDesc.FindConstraintWithName(string(t.Constraint)) | ||
if constraint == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So many calls to GetConstraintInfo
were really about finding a constraint by name. Though, notably, GetConstraintInfo
would not return the full set of constraints (no dropping unique indexes for instance). In this case here I don't think that mattered but I'm willing to bet this caused some subtle undetected bugs at some other GetConstraintInfo
call sites.
for _, uwoi := range tableDesc.EnforcedUniqueConstraintsWithoutIndex() { | ||
if uwoi.Dropped() || !uwoi.CollectKeyColumnIDs().Contains(colToDrop.GetID()) { | ||
continue | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think (but I'm not sure) that omitting dropped unique-without-indexes was a bug.
err = p.tryRemoveFKBackReferences( | ||
ctx, | ||
tableDesc, | ||
tableDesc.GetPrimaryIndex(), | ||
tree.DropRestrict, | ||
remainingIndexes) | ||
withSearchForReplacement, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See function definition for details but we no longer pass slices of descpb.UniqueConstraint
interfaces, because we got rid of that interface.
|
||
// GetName returns the constraint name. | ||
GetName() string | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been superseded by catalog.UniqueConstraint
, in effect.
@@ -719,11 +702,12 @@ func (desc *wrapper) ValidateSelf(vea catalog.ValidationErrorAccumulator) { | |||
// Only validate column families, constraints, and indexes if this is | |||
// actually a table, not if it's just a view. | |||
if desc.IsPhysicalTable() { | |||
desc.validateConstraintNamesAndIDs(vea) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving this further ahead was necessary because we got rid of CheckUniqueConstraints
, but the price to pay was lots of noise in validate_test.go
.
if index.IsHelpfulOriginIndex(fk.OriginColumnIDs) { | ||
return notices | ||
} | ||
if index.IsPartial() || index.NumKeyColumns() == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
index.NumKeyColumns() == 0
can legitimately happen for the primary index of a virtual table. Of course, such tables don't have primary indexes at all, so I'm not sure how much I like the fact that we pretend that we do, but this is not the time or place to deal with this.
alter table const_tbl drop constraint "steve"; | ||
alter table const_tbl add constraint "steve" CHECK (a > 100); | ||
COMMIT; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just don't think that this should succeed. Use another name.
pkg/sql/catalog/table_elements.go
Outdated
// AsCheck returns the corresponding CheckConstraint if the | ||
// mutation is on a check constraint, nil otherwise. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Is the comment stale? Why do we say "...if the mutation is on a check constraint..."?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, thanks
6d6ba24
to
8933bd8
Compare
allChecks []catalog.Constraint | ||
allActiveAndInactiveChecks []catalog.Constraint | ||
allActiveChecks []catalog.Constraint | ||
// AsCheck implements the catalog.ConstraintProvider interface. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest we always have a line like var _ catalog.ConstraintProvider = constraintBase{}
to enforce implementation of an interface in compile time. Same is true for all other structs that we claim to have implemented some interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I see you are doing this for the specific constraint type (like checkConstraint). Great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks but yeah it's a good point. IIRC I try to have the assertions on the leaves of the type hierarchy.
// interface. | ||
func (c checkConstraint) GetReferencedColumnID(columnOrdinal int) descpb.ColumnID { | ||
if columnOrdinal < 0 || columnOrdinal >= len(c.desc.ColumnIDs) { | ||
return 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we panic with assertionFailedf here instead of returning 0? Or it's the caller's responsibility to check whether return is 0 and handles it appropriately if so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your comment made me look what we actually do in catalog.Index
's implementation and I'd always assumed we did bound checks but it turns out that we don't.
I'll do what catalog.Index
does and remove these checks. I remember that back when that was implemented, this matter had been discussed. Evidently the consensus that came out of that was that we preferred triggering a runtime panic when going out of bounds. I'm OK with that.
|
||
// IsEnforced implements the catalog.Constraint interface. | ||
func (c checkConstraint) IsEnforced() bool { | ||
return !c.IsMutation() || c.WriteAndDeleteOnly() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, the legacy code did a bad job about the state machine of constraints while it's still on the mutation slice. I think it's better to check the constraint mutation's validity instead of the MutationDescriptor_State, so I suggest we instead just do return c.desc.validity != dropping
. But this again iterates one of my comments above that maybe we can get rid of this method and just expose GetConstraintValidity
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To illustrate what I mean above a bit more: consider adding a check constraint. When we enqueue the check constraint onto the mutation slice, its MutationDescriptor_State will be (unfortunately) DELETE_ONLY but its validity is VALIDATING, which should be considered as Enforced but this method now will incorrectly returns false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A constraint's validity and whether it's enforced are two different things:
- the validity concerns the data that was present before the constraint is added;
- a constraint being enforced or not concerns the data that appears after the constraint is added.
|
||
// AsUniqueWithoutIndex implements the catalog.ConstraintProvider interface. | ||
func (c uniqueWithoutIndexConstraint) AsUniqueWithoutIndex() catalog.UniqueWithoutIndexConstraint { | ||
return &c |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why &c
, not c
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In go, an interface is basically a pair of pointers, so it makes sense for the underlying type of what the interface is wrapping to also be a pointer type.
I'm not a huge fan of go's pointer magic in general, and hardly an expert.
) *constraintCache { | ||
capUWIs := len(indexes.all) | ||
numEnforcedChecks := len(desc.Checks) | ||
capChecks := numEnforcedChecks + len(mutations.checks) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the naming seems to suggest all desc.Checks
are enforced checks and all mutations.checks
are not enforced checks, which isn't the case, right?
A check on the mutation with VALIDATING validity is considered enforced, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
}, | ||
// Populate with index-backed unique constraints. | ||
for _, idx := range indexes.all { | ||
if uwi := idx.AsUniqueWithIndex(); uwi != nil && uwi.NumKeyColumns() > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we check uwi.NumKeyColumns()
additionally? In what scenario will we have a non-nil uwi but 0 key columns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Virtual tables. It's dumb, in those cases we should have GetPrimaryIndex()
return nil
, yet we don't. Yet more tech debt.
// Construct a table with the following constraints: | ||
// - Primary Index: [ID_1:validated] | ||
// - Indexes: [a non-unique index, ID_4:validated] | ||
// - Checks: [ID_2:validated], [ID_3:unvalidated] | ||
// - OutboundFKs: [ID_6:validated] | ||
// - InboundFKs: [ID_5:validated] | ||
// - UniqueWithoutIndexConstraints: [ID_7:dropping] | ||
// - mutation slice: [ID_7:dropping:UniqueWithoutIndex, ID_8:validating:Check, ID_9:validating:UniqueIndex, a non-unique index] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still feel like a comment block like this that summarizes all the constraints we put on this contrived table is helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I'll add it back.
// CheckConstraintColumns returns the slice of columns referenced by a check | ||
// constraint. | ||
CheckConstraintColumns(ck CheckConstraint) []Column | ||
// ForeignKeyReferencedColumns returns the slice of columns referenced by an | ||
// inbound foreign key. | ||
ForeignKeyReferencedColumns(fk ForeignKeyConstraint) []Column | ||
// ForeignKeyOriginColumns returns the slice of columns originating in this | ||
// table for an outbound foreign key. | ||
ForeignKeyOriginColumns(fk ForeignKeyConstraint) []Column | ||
// UniqueWithoutIndexColumns returns the slice of columns which are | ||
// defined as unique by a non-index-backed constraint. | ||
UniqueWithoutIndexColumns(uwoi UniqueWithoutIndexConstraint) []Column |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you think those methods belong here? It seems to me more appropriately to define them just as function that takes in a (ck|fk|uwoi) and return some property of that input constraint. Do we need anything from the table to serve those questions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. The table is required to map the column ID to a catalog.Column
, and I really prefer using the latter instead of IDs whenever possible.
See the similar []Column
-returning methods taking indexes. These can be quite powerful and removing this layer of indirection of the column ID makes for some more expressive code at the call sites. See informationSchemaConstraintColumnUsageTable` for an example. Arguably some of them are never called, but if they're there it's because they used to, and perhaps will be again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at the changes made to the new constraint interfaces, their implementations, constructing constraintCache, mutationCache, and TableDescriptor interface. All are welcoming and pleasant changes.
The only concern I have the how we implement isEnforced
. I'm convinced that it's convenient/helpful to have this method though.
These new interfaces help encapsulate the descriptor protobufs for table constraints: - CheckConstraint, - ForeignKeyConstraint, - UniqueWithoutIndexConstraint, - UniqueWithIndexConstraint. These are the leaves of the Constraint type tree, and Constraint is the interface at the root of it. The changes in this commit mimic what was done for Column & Index, which respectively wrap descpb.ColumnDescriptor and descpb.IndexDescriptor. These new constraint interfaces are adopted liberally throughout the code base, but not completely: many references to descpb-protobufs still need to be replaced. This commit also removes the somewhat confusing concept of "active" versus "inactive" constraint. Not only was it confusing to me, evidently it was also confusing to other contributors considering how several mild bugs and inconsistencies made their way into the code. This commit replaces this concept with that of an "enforced" constraint: - A constraint is enforced if it applies to data written to the table, regardless of whether it has been validated on the data already in the table prior to the constraint coming into existence. The implementation is somewhat awkward for constraints in that the same constraint descriptor can be featured twice inside a descpb.TableDescriptor, in its active constraint slice (such as Checks, etc.) and also in the Mutations slice. In these cases the interface wraps the non-mutation constraint descriptor protobuf, with no observed changes to the database's behavior. This commit required alterations to the table descriptor's post-deserialization changes. The change which assigns constraint IDs to table descriptors which don't have themhad some subtle bugs which went unnoticed until this commit forced a heavier reliance on these fields. This commit also adds a new change which ensures that a check constraint's ColumnIDs slice is populated based on the columns referenced in its expression, replacing ancient code which predates the concept of post-deserialization changes and which would compute this lazily. As a more general observation: it appears that the handling of adding and dropping constraints following other schema changes was mildly inconsistent and this commit tries to improve this. For instance, it was possible to drop a column which had been set as NOT NULL in the same transaction, but not do the same for any other kind of constraint. Also it was possible to reuse the name of a dropping constraint, despite this constraint still being enforced. The new behavior is more strict, yet this should be barely noticeable by the user in most practical cases: it's rare that one wants to add a constraint referencing a column and also drop that column in the same transaction, for instance. Fixes cockroachdb#91918. Release note: None
3b1ecd9
to
98ec204
Compare
Thanks for taking a look. I've addressed or responded to your comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM and thanks for doing this!
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @postamar)
pkg/sql/catalog/table_elements.go
line 425 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
Dropping constraints are enforced. At least, that's the case of unique indexes. You can see this for yourself by setting a pausepoint on the schema changer execution, dropping a unique index, and INSERTing two identical values.
Thanks, that's a TIL!
pkg/sql/catalog/table_elements.go
line 474 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
Eh, it's already in
catalog.Index
.
Thanks, another TIL: we can define uniqueness on a column with a predicate (which makes it a partial unique constraint) with or without a backing index
pkg/sql/catalog/tabledesc/constraint.go
line 130 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
A constraint's validity and whether it's enforced are two different things:
- the validity concerns the data that was present before the constraint is added;
- a constraint being enforced or not concerns the data that appears after the constraint is added.
This is enlighting! I always thought when it comes to constraint, their mutation state is irrelevant anymore. Thank you! This also means I need to make a one-line change to our adding path when adding a constraint where after I enqueue a constraint to the mutation slice, I will need to fast-forward its state to WRITE_ONLY.
Thanks for the review! bors r+ |
Build failed: |
The previous merge failed due to a flaky test; merging again bors r+ |
Build failed (retrying...): |
Build succeeded: |
This test checks the functionality for the following sequence of events: 1. A txn adds a constraint to a column on a table. 2. A separate txn drops the column. However, this interaction between the two txns has been made explicit by the PR cockroachdb#92289. Since this PR, step (2) will fail if the constraint in step (1) is in the process of being added. As a result, the elaborate set up and sequence of events being tested in TestDropColumnAfterMutations is no longer necessary. Release note: none Epic: none Fixes: cockroachdb#76843
107474: cli/zip: emit SQL table data using TSV by default r=abarganier a=knz Fixes #107473. Epic: CRDB-28893 This is a partial revert of 35738d4. It changes the default value of the `--format` flag back from JSON to TSV. Release note (backward-incompatible change): THIS RELEASE NOTE CANCELS THE CORRESPONDING PREVIOUS BACKWARD-INCOMPATIBLE CHANGE. New behavior, compatible with previous versions of CockroachDB: the command `cockroach debug zip` stores data retrieved from SQL tables in the remote cluster using the TSV format by default. Release note (cli change): The default value of the `--format` parameter to `cockroach debug zip` is `tsv`, like other CLI commands that can extract SQL data. 107489: sql: Delete invalid TestDropColumnAfterMutations test r=rafiss a=rimadeodhar This test checks the functionality for the following sequence of events: 1. A txn adds a constraint to a column on a table. 2. A separate txn drops the column. However, this interaction between the two txns has been made explicit by the PR #92289. Since this PR, step (2) will fail if the constraint in step (1) is in the process of being added. As a result, the elaborate set up and sequence of events being tested in TestDropColumnAfterMutations is no longer necessary. Release note: none Epic: none Fixes: #76843 107664: authors: add Angela Dietz to authors r=angeladietz a=angeladietz Release note: None Epic: None 107666: server: fix a race in tenant creation r=knz a=lidorcarmel Previously, scanTenantsForRunnableServices() was not holding the mutex when SELECTing for the existing tenant names, which means that the following may happen: - scanTenantsForRunnableServices() sees that only the system tenant exists - createServerEntryLocked() then adds another tenant while holding the mutex - scanTenantsForRunnableServices() takes the lock and stops the tenant that was just created because only the system tenant should be alive (which is wrong) This patch changes scanTenantsForRunnableServices() to take the mutex before SELECTing for the existing tenants in order to avoid the race. Epic: none Fixes: #107434 Fixes: #107343 Fixes: #107154 Release note: None 107673: opt: remove Metadata.AllUserDefinedFunctions r=mgartner a=mgartner The metadata method `AllUserDefinedFunctions` has been replaced with a new function `HasUserDefinedFunctions` which provides a simpler API without exposing the underlying UDF dependency map. The map is still available outside of the opt package via the `TestingUDFDeps` method which is designed for testing use only. Epic: None Release note: None 107714: roachtest: add warning to redacted github issue r=mgartner a=mgartner Epic: None Release note: None 107716: ui: extend search logic on insights page r=koorosh a=koorosh This change extends the number of fields where search is applied (instead of single transaction/ statement execution ID field). It makes possible to search for any available ID in Txn or statement insight. Release note (ui change): search is performed on all ID fields of transaction and statement insights. Resolves: #107253 Demo: https://github.com/cockroachdb/cockroach/assets/3106437/7fb56720-3ab2-4be4-9500-457707f6f01d Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net> Co-authored-by: rimadeodhar <rima@cockroachlabs.com> Co-authored-by: Angela Dietz <dietz@cockroachlabs.com> Co-authored-by: Lidor Carmel <lidor@cockroachlabs.com> Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com> Co-authored-by: Andrii Vorobiov <and.vorobiov@gmail.com>
This test checks the functionality for the following sequence of events: 1. A txn adds a constraint to a column on a table. 2. A separate txn drops the column. However, this interaction between the two txns has been made explicit by the PR #92289. Since this PR, step (2) will fail if the constraint in step (1) is in the process of being added. As a result, the elaborate set up and sequence of events being tested in TestDropColumnAfterMutations is no longer necessary. Release note: none Epic: none Fixes: #76843
These new interfaces help encapsulate the descriptor protobufs for table
constraints:
These are the leaves of the Constraint type tree, and Constraint is the
interface at the root of it. The changes in this commit mimic what was
done for Column & Index, which respectively wrap descpb.ColumnDescriptor
and descpb.IndexDescriptor.
These new constraint interfaces are adopted liberally throughout the
code base, but not completely: many references to descpb-protobufs still
need to be replaced.
This commit also removes the somewhat confusing concept of "active"
versus "inactive" constraint. Not only was it confusing to me, evidently
it was also confusing to other contributors considering how several mild
bugs and inconsistencies made their way into the code. This commit
replaces this concept with that of an "enforced" constraint:
regardless of whether it has been validated on the data already in
the table prior to the constraint coming into existence.
The implementation is somewhat awkward for constraints in that the same
constraint descriptor can be featured twice inside
a descpb.TableDescriptor, in its active constraint slice (such as
Checks, etc.) and also in the Mutations slice. In these cases the
interface wraps the non-mutation constraint descriptor protobuf, with no
observed changes to the database's behavior.
This commit required alterations to the table descriptor's
post-deserialization changes. The change which assigns constraint IDs
to table descriptors which don't have themhad some subtle bugs which went
unnoticed until this commit forced a heavier reliance on these fields.
This commit also adds a new change which ensures that a check
constraint's ColumnIDs slice is populated based on the columns
referenced in its expression, replacing ancient code which predates the
concept of post-deserialization changes and which would compute this
lazily.
As a more general observation: it appears that the handling of adding
and dropping constraints following other schema changes was mildly
inconsistent and this commit tries to improve this. For instance, it was
possible to drop a column which had been set as NOT NULL in the same
transaction, but not do the same for any other kind of constraint. Also
it was possible to reuse the name of a dropping constraint, despite this
constraint still being enforced. The new behavior is more strict, yet
this should be barely noticeable by the user in most practical cases:
it's rare that one wants to add a constraint referencing a column and
also drop that column in the same transaction, for instance.
Fixes #91918.
Release note: None