-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql/schemachanger: enable declarative schema changer to read comments from db #80380
sql/schemachanger: enable declarative schema changer to read comments from db #80380
Conversation
3fc96d9
to
3016208
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for breaking this out. I think you did the right thing, there's a lot here, and it will make the COMMENT ON
pull request a lot more straightforward to review.
I have a few objections, nothing too deep so don't wait for me to get back from vacation to merge this after addressing them.
Reviewed 33 of 40 files at r1, 9 of 9 files at r2, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @chengxiong-ruan)
pkg/sql/descmetadata/comment_cache.go, line 35 at r2 (raw file):
ie sqlutil.InternalExecutor comments map[scdecomp.CommentKey]string objIDsChecked map[catid.DescID]struct{}
nit: consider using a catalog.DescriptorIDSet
instead?
pkg/sql/descmetadata/comment_cache.go, line 110 at r2 (raw file):
"mf-get-table-comments", mf.txn, sessiondata.InternalExecutorOverride{User: security.RootUserName()},
Just curious, why do we need to query this table as root? Is this how it's done in the legacy schema changer? (I haven't checked)
pkg/sql/schemachanger/scbuild/builder_state.go, line 432 at r2 (raw file):
} case catalog.Schema: if err := b.commentCache.LoadCommentsForObjects(b.ctx, catalog.Table, c.backrefs.Ordered()); err != nil {
Schemas also contain types, not just tables. Why does LoadCommentsForObjects
require a descriptor type anyway? I don't understand the purpose of this filtering. Let's just warm the cache for all the back-referenced things, no?
pkg/sql/schemachanger/scdecomp/decomp.go, line 63 at r2 (raw file):
desc catalog.Descriptor, lookupFn func(id catid.DescID) catalog.Descriptor, ev ElementVisitor,
Consider moving the ElementVisitor
definition to this new dependencies.go
file, it seems like it would belong there.
pkg/sql/schemachanger/scdecomp/dependencies.go, line 26 at r2 (raw file):
type CommentKey struct { ObjectID catid.DescID SubID descpb.ID
Shouldn't this also be a catid.DescID
?
pkg/sql/schemachanger/scdecomp/dependencies.go, line 28 at r2 (raw file):
SubID descpb.ID CommentType keys.CommentType }
Why is this type defined here in scdecomp
at all?
pkg/sql/schemachanger/scdecomp/dependencies.go, line 32 at r2 (raw file):
// DescriptorCommentCache represent an interface to fetch metadata like // comment for descriptors. type DescriptorCommentCache interface {
The descriptor decomposition logic does not care that this is a cache, as far as it's concerned that's purely an implementation detail. I'm aware that's not the case for scbuild
though.
What do you think of having an interface here that just does Get
s and rename it CommentGetter
or something. Then in the builder define another named CommentCache
with the LoadCommentsForObjects
method and which embeds CommentGetter
?
pkg/sql/schemachanger/scdecomp/dependencies.go, line 35 at r2 (raw file):
// Get returns comment for a (objID, subID, commentType) tuple if exists. "ok" // returned indicates if the comment actually exists or not. Get(ctx context.Context, objID catid.DescID, subID descpb.ID, commentType keys.CommentType) (comment string, ok bool, err error)
How about replacing this one method with several, one for each comment type?GetDatabaseComment(ctx, dbID)
etc.
No need to have the system.comment
table schema leak everywhere.
pkg/sql/schemachanger/scdeps/sctestdeps/database_state.go, line 200 at r2 (raw file):
} return comments }
This is fine but have you considered instead having a comment cache constructor in descmetadata
which warms the cache by doing a SELECT * FROM system.comments LIMIT 1000
or something like that?
This way you could re-use it in the test deps (no way we'd ever have more than 1k comments) plus you would amortize a lot of round trips in prod for the common case where there are at most very few comments.
pkg/sql/schemachanger/scplan/plan_test.go, line 60 at r2 (raw file):
run := func(t *testing.T, d *datadriven.TestData) string { // TODO (Chengxiong): make this switch block to have only "build" and // "ops/deps" sections.
I suggest you either follow up with a PR shortly or file an issue to track this, otherwise we'll forget about it.
pkg/sql/schemachanger/scplan/plan_test.go, line 106 at r2 (raw file):
default: t.Fatal("not a supported descriptor metadata statement") }
Why have a switch block at all? Instead perhaps just tdb.Exec(t, stmt.SQL)
?
Same comment applies elsewhere in this PR.
6f3818c
to
fec7edb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @postamar)
pkg/sql/descmetadata/comment_cache.go, line 35 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
nit: consider using a
catalog.DescriptorIDSet
instead?
nice, How could I forget that!
pkg/sql/descmetadata/comment_cache.go, line 110 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
Just curious, why do we need to query this table as root? Is this how it's done in the legacy schema changer? (I haven't checked)
Yeah, metadata updater and show create table
also use root user. I remember I tried something else and got permission denied.
pkg/sql/schemachanger/scbuild/builder_state.go, line 432 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
Schemas also contain types, not just tables. Why does
LoadCommentsForObjects
require a descriptor type anyway? I don't understand the purpose of this filtering. Let's just warm the cache for all the back-referenced things, no?
Good question...my reasoning was that
- there's not comment on types, so I'm just being lazy here to treat everything as table. I agree it's confusing and I can add comment for that.
- the comment table is indexed starting with
commentType
.... so I think the better practice is to respect the type first. But this table is probably very small as you said. so type filtering here is not necessary, especially after we add theselect limit 1000
thing you suggested :)
pkg/sql/schemachanger/scdecomp/decomp.go, line 63 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
Consider moving the
ElementVisitor
definition to this newdependencies.go
file, it seems like it would belong there.
good point, done.
pkg/sql/schemachanger/scdecomp/dependencies.go, line 26 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
Shouldn't this also be a
catid.DescID
?
oops, we actually shouldn't use either since SubID
is not for a descriptor. Using a uint32 instead...
pkg/sql/schemachanger/scdecomp/dependencies.go, line 28 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
Why is this type defined here in
scdecomp
at all?
moved to the implementation in descmetadata
.
pkg/sql/schemachanger/scdecomp/dependencies.go, line 32 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
The descriptor decomposition logic does not care that this is a cache, as far as it's concerned that's purely an implementation detail. I'm aware that's not the case for
scbuild
though.What do you think of having an interface here that just does
Get
s and rename itCommentGetter
or something. Then in the builder define another namedCommentCache
with theLoadCommentsForObjects
method and which embedsCommentGetter
?
yeah, that's brilliant~
pkg/sql/schemachanger/scdecomp/dependencies.go, line 35 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
How about replacing this one method with several, one for each comment type?
GetDatabaseComment(ctx, dbID)
etc.No need to have the
system.comment
table schema leak everywhere.
Good idea to have these higher level methods. refactored as suggested.
pkg/sql/schemachanger/scdeps/sctestdeps/database_state.go, line 200 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
This is fine but have you considered instead having a comment cache constructor in
descmetadata
which warms the cache by doing aSELECT * FROM system.comments LIMIT 1000
or something like that?This way you could re-use it in the test deps (no way we'd ever have more than 1k comments) plus you would amortize a lot of round trips in prod for the common case where there are at most very few comments.
good idea actually. but here in the test we use the test db runner instead of internal executor, so probably the only thin in common is the query.
But I do like the idea of warming the cache with a 1000 when it's constructed. added that.
pkg/sql/schemachanger/scplan/plan_test.go, line 60 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
I suggest you either follow up with a PR shortly or file an issue to track this, otherwise we'll forget about it.
Yeah... I'm an old school and wrote on my notebook with an actual pen.
pkg/sql/schemachanger/scplan/plan_test.go, line 106 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
Why have a switch block at all? Instead perhaps just
tdb.Exec(t, stmt.SQL)
?Same comment applies elsewhere in this PR.
that's true...removed.
cb52379
to
9b03137
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only very minor comments nothing blocking
Reviewed 15 of 17 files at r3, 7 of 7 files at r4, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @chengxiong-ruan and @postamar)
pkg/sql/descmetadata/comment_cache.go
line 119 at r4 (raw file):
} var uncheckedObjIDs []catid.DescID
Probably won't be a problem, but you can do uncheckedObjIDs := objIDs[:0], which will keep the same storage length so that you won't have growth if there are more than a 1000 comments.
pkg/sql/descmetadata/comment_cache.go
line 199 at r4 (raw file):
// to avoid boundary cut by `LIMIT 1000`. We might lose 1 object here, but not // a big deal in most cases. mf.maxWarmedObjID = maxObjID - 1
objIDSChecked isn't updated above? Also, I don't get why we have to subtract 1 here? The check can always do >= maxWarmedObjID?
… from db This commit adds a metadata fetcher interface to read comments from db during descriptor elements walking. An implementation of the interface is also added and plumbed into the descriptor decomposition. Release note: None
9b03137
to
92a78d0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @fqazi and @postamar)
pkg/sql/descmetadata/comment_cache.go
line 119 at r4 (raw file):
Previously, fqazi (Faizan Qazi) wrote…
Probably won't be a problem, but you can do uncheckedObjIDs := objIDs[:0], which will keep the same storage length so that you won't have growth if there are more than a 1000 comments.
ah nice trick, changed it to make([]descid.DescID, 0, len(objIDs))
since looks like uncheckedObjIDs := objIDs[:0]
would make them share the underlying storage.
pkg/sql/descmetadata/comment_cache.go
line 199 at r4 (raw file):
objIDSChecked isn't updated above?
oh, we don't need to update objIDsChecked
here because of this logic:
func (mf *metadataCache) shouldLoadFromDB(objID catid.DescID) bool {
return objID > mf.maxWarmedObjID && !mf.objIDsChecked.Contains(objID)
}
I just didn't want to put a 1000 ids by hand in it lol.
Also, I don't get why we have to subtract 1 here? The check can always do >= maxWarmedObjID?
Yeah, we may do >= maxWarmedObjID
as well. My brain model just worked that way somehow haha. The reason to subtract 1 here is just my laziness to handle edge cases where a table object_id has say 10 comments on different sub_ids, but because this table is that last obj we read from db, and we only read 5 comments of it, so we can't say this object_id is fully checked. Yes, it's just me lazy ha.
Build succeeded: |
This commit adds a metadata fetcher interface to read comments
from db during descriptor elements walking. An implementation
of the interface is also added and plumbed into the descriptor
decomposition.
Release note: None