-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
server,roachpb: introduce span config rpcs #69047
Conversation
Haven't pushed out the tests yet, but if you want an early look. |
eb7a261
to
584f602
Compare
RFAL. |
5723cf4
to
a8bc32c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a first pass.
Some of the schema design stuff is scary to me. It's really not okay for any of these queries to be O(span configs)
and, it's not great to have so many synchronous round trips per entry in the batch.
My other concerns are around the API semantics. How do the two sets interact. Please spell this out in excruciating detail in the api proto.
Reviewed 6 of 37 files at r6, 90 of 90 files at r7, 34 of 34 files at r8, 31 of 31 files at r9, 2 of 39 files at r10.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @arulajmani, @irfansharif, and @nvanbenschoten)
pkg/ccl/kvccl/kvtenantccl/connector.go, line 533 at r10 (raw file):
// interface. func (c *Connector) UpdateSpanConfigEntries( ctx context.Context, update []roachpb.SpanConfigEntry, delete []roachpb.Span,
nit: I don't like shadowing of the delete
builtin function. Maybe toDelete
and toUpdate
? Also, my sense is the deletes conceptually precede the updates, can you put them first in the arg list?
pkg/config/zonepb/zone.go, line 1161 at r6 (raw file):
// fully hydrated. A fully hydrated zone configuration must have all required // fields set, which are RangeMaxBytes, RangeMinBytes, GC, and NumReplicas. func (z *ZoneConfig) EnsureFullyHydrated() error {
nit: you could accumulate all of the unset fields to make for a better error.nnn
pkg/config/zonepb/zone.go, line 1181 at r6 (raw file):
// SpanConfig. It fatals if the zone config hasn't been fully hydrated (fields // are expected to have been cascaded through parent zone configs). func (z *ZoneConfig) AsSpanConfig() roachpb.SpanConfig {
this is only used in testing, how about TestingAsSpanConfig
?
pkg/migration/migrations/migrations.go, line 119 at r10 (raw file):
retryJobsWithExponentialBackoff), migration.NewTenantMigration( "add the system.span_configurations table",
maybe add to system tenant
?
pkg/roachpb/api.go, line 1633 at r10 (raw file):
// Unnest is a convenience method to return the slice-of-slices form of // GetSpanConfigsResponse.
nit: Note here that this is not a copy of the underlying []SpanConfigEntry
slices
pkg/roachpb/api.go, line 1635 at r10 (raw file):
// GetSpanConfigsResponse. func (r *GetSpanConfigsResponse) Unnest() [][]SpanConfigEntry { var ret [][]SpanConfigEntry
nit: allocate this using make
so it's the right size
pkg/roachpb/api.go, line 1668 at r10 (raw file):
} for i := range list {
the runtime of this is unfortunate. Given these things really should not overlap, can we just make this an O(n log n)
thing by using O(n)
extra memory? At least leave a TODO
pkg/roachpb/api.go, line 1675 at r10 (raw file):
if list[i].Overlaps(list[j]) { return errors.AssertionFailedf("overlapping spans %s and %s in same list",
I sort of envisioned that if you've got an existing span that you want to split then we'd delete the old one and upsert the new ones.
pkg/roachpb/api.proto, line 2452 at r10 (raw file):
Value content = 2 [(gogoproto.nullable) = false]; // Which pattern does this gossip information match? string pattern_matched = 3; // If non-nil, the other fields will be empty and this will be the final event
Accidental removal of a newline?
pkg/roachpb/api.proto, line 2569 at r10 (raw file):
// SpanConfig holds the configuration that applies to a given keyspan. It // parallels the definition found in zonepb/zone.proto. message SpanConfig {
Can you move all of this to a new proto file in this package?
pkg/roachpb/api.proto, line 2661 at r10 (raw file):
// SpanConfigsToUpdate lists out the spans we want to update and the configs // we want to update with. repeated SpanConfigEntry span_configs_to_update = 1 [(gogoproto.nullable) = false];
nit: I'd just call this to_update
(or to_upsert
assuming the semantics are what I think they are) as the type makes it clear what is being updated.
pkg/roachpb/api.proto, line 2661 at r10 (raw file):
// SpanConfigsToUpdate lists out the spans we want to update and the configs // we want to update with. repeated SpanConfigEntry span_configs_to_update = 1 [(gogoproto.nullable) = false];
Is it an update or is it an insert? Maybe we should call this span_configs_to_upsert
?
Can you describe the semantics of what happens in terms of overlap between entries in these slices and existing state in the store?
pkg/server/node.go, line 1487 at r10 (raw file):
sessiondata.InternalExecutorOverride{User: security.RootUserName()}, `SELECT start_key, end_key, config FROM system.span_configurations WHERE $1 < end_key AND start_key < $2`,
Can you look at this query plan? The query planner won't know anything about our non-overlapping invariant. It would be bad if this is a table scan.
Honestly, it's not good that this is so many round-trips anyway.
pkg/server/node.go, line 1577 at r10 (raw file):
datums, err := n.sqlExec.QueryRowEx(ctx, "validate-span-cfgs", txn, sessiondata.InternalExecutorOverride{User: security.RootUserName()}, `SELECT count(*) FROM system.span_configurations WHERE $1 < end_key AND start_key < $2`,
What's the query plan on this? This is going to be a table scan, no? That seems unfortunate and sort of not okay, it would mean that updating a zone configs is O(N^2) where N is the number of span configs in the entire cluster.
pkg/spanconfig/spanconfig.go, line 45 at r10 (raw file):
// callers are to issue a delete for the previous span and updates for the // new ones. UpdateSpanConfigEntries(ctx context.Context, update []roachpb.SpanConfigEntry, delete []roachpb.Span) error
same nit on the name here, please don't call it delete
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @arulajmani, @irfansharif, and @nvanbenschoten)
pkg/server/node.go, line 1442 at r12 (raw file):
// GetSpanConfigEntriesFor implements the spanconfig.KVAccessor interface. It's a // convenience wrapper around the RPC implementation. func (n *Node) GetSpanConfigEntriesFor(
It feels more or less insane to me that this functionality is sitting right here directly on Node
. Can you please implement these methods on an interface that Node
constructs?
Another thing I'll say is that I've found that it's useful to allow that interface to take a table name or, alternatively, not specify a database when writing out the sql queries. Then, without starting up a new server, you can trivially test operations and not have to worry about using the real table.
Like, I'd expect a:
package spanconfig
type Reader interface {
GetOverlapping(ctx context.Context, overlapping []roachpb.Span) ([][]roachpb.SpanConfigEntry, error)
}
type Writer interface {
Update(/*...*/) /*(...)*/
}
type Storage interface {
Reader
Writer
}
// then something that takes a txn and gives you storage
pkg/sql/catalog/systemschema/system.go, line 538 at r12 (raw file):
CREATE TABLE system.span_configurations ( start_key BYTES NOT NULL PRIMARY KEY, end_key BYTES NOT NULL,
How do you feel about CHECK end_key >= start_key
?
a31ae0b
to
e8d69b6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable is down for so can't respond inline to comments, but addressed everything; PTAL. Thanks for helping shepherd this through -- the queries no longer have round trips == # of spans in request and all use constrained index scans. I suspect this KVAccessor interface could benefit with some form of randomized testing that I haven't typed out -- perhaps something we can get to during stability.
Build succeeded: |
i'm glad you managed to pass it on haha, i'm on 4 now! |
Part of cockroachdb#67679. We'll hide `config.SystemConfig` behind the `spanconfig.StoreReader` interface, and use that instead in the various queues that need access to the system config span. In future PRs we'll introduce a data structure that maintains a mapping between spans and configs that implements this same interface, except it will be powered by a view over `system.span_configurations` that was introduced in \cockroachdb#69047. When we do make that switch, i.e. have KV consult this new thing for splits, merges, GC and replication, instead of the gossip backed system config span, ideally it'd be as easy as swapping the source. This PR helps pave the way for just that. In cockroachdb#66348 we described how `zonepb.ZoneConfigs` going forward were going to be an exclusively SQL-level construct. Consequently we purge[*] all usages of it in KV, storing on each replica a `roachpb.SpanConfig` instead. [*]: The only remaining use is what powers our replication reports, which does not extend well to multi-tenancy in its current form and needs replacing. Release note: None Release justification: low risk, high benefit changes to existing functionality
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 7 files at r21, 5 of 11 files at r22, 1 of 1 files at r24, 7 of 7 files at r25, 1 of 1 files at r26, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/roachpb/span_config.proto, line 19 at r20 (raw file):
Previously, irfansharif (irfan sharif) wrote…
I'll send a PR for this shortly, I tried appending it here but it got a bit hairy so I'd like to decouple it for now.
👍
Part of cockroachdb#67679. We'll hide `config.SystemConfig` behind the `spanconfig.StoreReader` interface, and use that instead in the various queues that need access to the system config span. In future PRs we'll introduce a data structure that maintains a mapping between spans and configs that implements this same interface, except it will be powered by a view over `system.span_configurations` that was introduced in \cockroachdb#69047. When we do make that switch, i.e. have KV consult this new thing for splits, merges, GC and replication, instead of the gossip backed system config span, ideally it'd be as easy as swapping the source. This PR helps pave the way for just that. In cockroachdb#66348 we described how `zonepb.ZoneConfigs` going forward were going to be an exclusively SQL-level construct. Consequently we purge[*] all usages of it in KV, storing on each replica a `roachpb.SpanConfig` instead. [*]: The only remaining use is what powers our replication reports, which does not extend well to multi-tenancy in its current form and needs replacing. Release note: None Release justification: low risk, high benefit changes to existing functionality
Part of cockroachdb#67679. We'll hide `config.SystemConfig` behind the `spanconfig.StoreReader` interface, and use that instead in the various queues that need access to the system config span. In future PRs we'll introduce a data structure that maintains a mapping between spans and configs that implements this same interface, except it will be powered by a view over `system.span_configurations` that was introduced in \cockroachdb#69047. When we do make that switch, i.e. have KV consult this new thing for splits, merges, GC and replication, instead of the gossip backed system config span, ideally it'd be as easy as swapping the source. This PR helps pave the way for just that. In cockroachdb#66348 we described how `zonepb.ZoneConfigs` going forward were going to be an exclusively SQL-level construct. Consequently we purge[*] all usages of it in KV, storing on each replica a `roachpb.SpanConfig` instead. [*]: The only remaining use is what powers our replication reports, which does not extend well to multi-tenancy in its current form and needs replacing. Release note: None Release justification: low risk, high benefit changes to existing functionality
Part of cockroachdb#67679. We'll hide `config.SystemConfig` behind the `spanconfig.StoreReader` interface, and use that instead in the various queues that need access to the system config span. In future PRs we'll introduce a data structure that maintains a mapping between spans and configs that implements this same interface, except it will be powered by a view over `system.span_configurations` that was introduced in \cockroachdb#69047. When we do make that switch, i.e. have KV consult this new thing for splits, merges, GC and replication, instead of the gossip backed system config span, ideally it'd be as easy as swapping the source. This PR helps pave the way for just that. In cockroachdb#66348 we described how `zonepb.ZoneConfigs` going forward were going to be an exclusively SQL-level construct. Consequently we purge[*] all usages of it in KV, storing on each replica a `roachpb.SpanConfig` instead. [*]: The only remaining use is what powers our replication reports, which does not extend well to multi-tenancy in its current form and needs replacing. Release note: None Release justification: low risk, high benefit changes to existing functionality
69172: kvserver: abstract away system config span usage in kv r=irfansharif a=irfansharif Part of #67679. We'll hide config.SystemConfig behind the spanconfig.StoreReader interface, and use that instead in the various queues that need access to the system config span. In future PRs we'll introduce a data structure that maintains a mapping between spans and configs that implements this same interface. This will be powered by a view of `system.span_configurations`, following the ideas described in \#66348. When we do make that switch, i.e. have KV consult the new thing for splits, merges, GC and replication, instead of the gossip backed system config span, ideally it'd be as easy as swapping the source. This PR helps pave the way for just that. In \#66348 we described how zonepb.ZoneConfigs going forward were going to be an exclusively SQL-level construct. Consequently we purge[*] all usages of it in KV, storing on each replica a roachpb.SpanConfig instead. [*]: The only remaining use is what powers our replication reports, which does not extend well to multi-tenancy and needs replacing. Release note: None (First two commits are from #69047; ignore here) Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>
69172: kvserver: abstract away system config span usage in kv r=irfansharif a=irfansharif Part of #67679. We'll hide `config.SystemConfig` behind the `spanconfig.StoreReader` interface, and use that instead in the various queues that need access to the system config span. In future PRs we'll introduce a data structure that maintains a mapping between spans and configs that implements this same interface, except it will be powered by a view over `system.span_configurations` that was introduced in \#69047. When we do make that switch, i.e. have KV consult this new thing for splits, merges, GC and replication, instead of the gossip backed system config span, ideally it'd be as easy as swapping the source. This PR helps pave the way for just that. In #66348 we described how `zonepb.ZoneConfigs` going forward were going to be an exclusively SQL-level construct. Consequently we purge[*] all usages of it in KV, storing on each replica a `roachpb.SpanConfig` instead. [*]: The only remaining use is what powers our replication reports, which does not extend well to multi-tenancy in its current form and needs replacing. Release note: None Release justification: low risk, high benefit changes to existing functionality 69515: kvserver: minor test improvement r=andreimatei a=andreimatei Remove an unncecessary clock. Release note: None Release justification: test only Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com> Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
Cluster settings are too easy a switch to reach for to enable the new span configs machinery. Let's gate it behind a necessary envvar as well and use cluster settings to selectively toggle individual components. This commit also fixes a mighty silly bug introduced in cockroachdb#69047; for the two methods we intended to gate use `spanconfig.experimental_kvaccessor.enabled`, we were checking the opposite condition or not checking it at all. Oops. Release note: None Release justification: non-production code changes
69658: spanconfig: disable infrastructure unless envvar is set r=irfansharif a=irfansharif Cluster settings are too easy a switch to reach for to enable the new span configs machinery. Let's gate it behind a necessary envvar as well and use cluster settings to selectively toggle individual components. This commit also fixes a mighty silly bug introduced in #69047; for the two methods we intended to gate use `spanconfig.experimental_kvaccessor.enabled`, we were checking the opposite condition or not checking it at all. Oops. Release note: None Release justification: non-production code changes 69809: kv/kvserver: use proper formatting when debug printing intents r=AlexTalks a=AlexTalks This commit changes the formatting used when printing intents via the CLI debug command from the default generated Protobuf formatter to our custom `MVCCMetadata` formatter implementation. Additionally, the `MergeTimestamp` and `TxnDidNotUpdateMetadata` fields have been added to the output. This changes the debug formatting from the former example: ``` 0,0 /Local/RangeID/203/r/RangePriorReadSummary (0x0169f6cb727270727300): {Txn:<nil> Timestamp:0,0 Deleted:false KeyBytes:0 ValBytes:0 RawBytes:[230 123 85 161 3 10 12 10 10 8 146 229 195 204 139 135 186 208 22 18 12 10 10 8 146 229 195 204 139 135 186 208 22] IntentHistory:[] Me rgeTimestamp:<nil> TxnDidNotUpdateMeta:<nil>} /Local/Lock/Intent/Table/56/1/1319/6/3055/0 0361fea07d3f0d40ba8f44dc4ee46cbdc2 (0x017a6b12c089f705278ef70bef880001000361fea07d3f0d40ba8f44dc4ee46cbdc212): 1630559399.176502568,0 {Txn:id=61fea07d key=/Table/57/1/1319/6/0 pri=0.01718258 epo=0 ts=1630559399.176502568,0 min=1630559399.176502568,0 seq=4 Timestamp:1630559399.176502568,0 Deleted:false KeyBytes:12 ValBytes:5 RawBytes:[] IntentHistory:[] MergeTimestamp:<nil> TxnDidNotUpdateMeta:0xc0016059b0} ``` to the following example: ``` 0,0 /Local/RangeID/203/r/RangePriorReadSummary (0x0169f6cb727270727300): txn={<nil>} ts=0,0 del=false klen=0 vlen=0 raw=/BYTES/0x0a0c0a0a0892e5c3cc8b87bad016120c0a0a0892e5c3cc8b87bad016 mergeTs=<nil> txnDidNotUpdateMeta=false /Local/Lock/Intent/Table/56/1/1319/6/3055/0 0361fea07d3f0d40ba8f44dc4ee46cbdc2 (0x017a6b12c089f705278ef70bef880001000361fea07d3f0d40ba8f44dc4ee46cbdc212): 1630559399.176502568,0 txn={id=61fea07d key=/Table/57/1/1319/6/0 pri=0.01718258 epo=0 ts=1630559399.176502568,0 min=1630559399.176502568,0 seq=4} ts=1630559399.176502568,0 del=false klen=12 vlen=5 mergeTs=<nil> txnDidNotUpdateMeta=true ``` Related to #69414 Release justification: Bug fix Release note: None Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com> Co-authored-by: Alex Sarkesian <sarkesian@cockroachlabs.com>
Cluster settings are too easy a switch to reach for to enable the new span configs machinery. Let's gate it behind a necessary envvar as well and use cluster settings to selectively toggle individual components. This commit also fixes a mighty silly bug introduced in #69047; for the two methods we intended to gate use `spanconfig.experimental_kvaccessor.enabled`, we were checking the opposite condition or not checking it at all. Oops. Release note: None Release justification: non-production code changes
In cockroachdb#69172 we introduced a spanconfig.StoreReader interface to abstract away the gossiped system config span. We motivated that PR by teasing a future implementation of the same interface, an in-memory data structure to maintain a mapping between between spans and configs (powered through a view over system.span_configurations introduced in \cockroachdb#69047). This PR introduces just that. Intended (future) usages: - cockroachdb#69614 introduces the KVWatcher interface, listening in on system.span_configurations. The updates generated by it will be used to populate per-store instantiations of this data structure, with an eye towards providing a "drop-in" replacement of the gossiped system config span (conveniently implementing the sibling spanconfig.StoreReader interface). - cockroachdb#69661 introduces the SQLWatcher interface, listening in on changes to system.{descriptor,zones} and generating denormalized span config updates for every descriptor/zone config change. These updates will need to be diffed against a spanconfig.StoreWriter populated with the existing contents of KVAccessor to generate the "targeted" diffs KVAccessor expects. Release note: None
In cockroachdb#69172 we introduced a spanconfig.StoreReader interface to abstract away the gossiped system config span. We motivated that PR by teasing a future implementation of the same interface, an in-memory data structure to maintain a mapping between between spans and configs (powered through a view over system.span_configurations introduced in \cockroachdb#69047). This PR introduces just that. Intended (future) usages: - cockroachdb#69614 introduces the KVWatcher interface, listening in on system.span_configurations. The updates generated by it will be used to populate per-store instantiations of this data structure, with an eye towards providing a "drop-in" replacement of the gossiped system config span (conveniently implementing the sibling spanconfig.StoreReader interface). - cockroachdb#69661 introduces the SQLWatcher interface, listening in on changes to system.{descriptor,zones} and generating denormalized span config updates for every descriptor/zone config change. These updates will need to be diffed against a spanconfig.StoreWriter populated with the existing contents of KVAccessor to generate the "targeted" diffs KVAccessor expects. Release note: None
Cluster settings are too easy a switch to reach for to enable the new span configs machinery. Let's gate it behind a necessary envvar as well and use cluster settings to selectively toggle individual components. This commit also fixes a mighty silly bug introduced in cockroachdb#69047; for the two methods we intended to gate use `spanconfig.experimental_kvaccessor.enabled`, we were checking the opposite condition or not checking it at all. Oops. Release note: None Release justification: non-production code changes
In cockroachdb#69172 we introduced a spanconfig.StoreReader interface to abstract away the gossiped system config span. We motivated that PR by teasing a future implementation of the same interface, an in-memory data structure to maintain a mapping between between spans and configs (powered through a view over system.span_configurations introduced in \cockroachdb#69047). This PR introduces just that. Intended (future) usages: - cockroachdb#69614 introduces the KVWatcher interface, listening in on system.span_configurations. The updates generated by it will be used to populate per-store instantiations of this data structure, with an eye towards providing a "drop-in" replacement of the gossiped system config span (conveniently implementing the sibling spanconfig.StoreReader interface). - cockroachdb#69661 introduces the SQLWatcher interface, listening in on changes to system.{descriptor,zones} and generating denormalized span config updates for every descriptor/zone config change. These updates will need to be diffed against a spanconfig.StoreWriter populated with the existing contents of KVAccessor to generate the "targeted" diffs KVAccessor expects. Release note: None
70287: spanconfig: introduce spanconfig.StoreWriter (and its impl) r=irfansharif a=irfansharif In #69172 we introduced a spanconfig.StoreReader interface to abstract away the gossiped system config span. We motivated that PR by teasing a future implementation of the same interface, an in-memory data structure to maintain a mapping between between spans and configs (powered through a view over system.span_configurations introduced in [#69047](69047)). This PR introduces just that. Intended (future) usages: - [#69614](69614) introduces the KVWatcher interface, listening in on system.span_configurations. The updates generated by it will be used to populate per-store instantiations of this data structure, with an eye towards providing a "drop-in" replacement of the gossiped system config span (conveniently implementing the sibling spanconfig.StoreReader interface). - [#69661](69661) introduces the SQLWatcher interface, listening in on changes to system.{descriptor,zones} and generating denormalized span config updates for every descriptor/zone config change. These updates will need to be diffed against a spanconfig.StoreWriter populated with the existing contents of KVAccessor to generate the "targeted" diffs KVAccessor expects. Release note: None Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>
Part of #67679; these RPCs are exposed through the
kvtenant.Connector
interface for tenants and also sit on
pkg/server.(*Node)
for the hosttenant. The basic type in these RPCs is the
SpanConfig
proto, which isthe same as our existing
ZoneConfig
proto but without any inheritancebusiness.
The RPCs are backed by a new host-tenant only
system.span_configurations
table. Future PRs will wire a view of thistable into KV with an eye towards replacing our use of
config.SystemConfig
.While here, we also introduce a
crdb_internal.pretty_span
builtin tohelp with the readability of this table. In future PRs we'll make use of
this built-in for datadriven tests asserting on the state of the table.
Release note (sql change): We've added a
system.span_configurations
table. This will later be used to store authoritative span configs that
KV has decided to apply.
Release justification: non-production code changes