kv/kvclient: introduce new tenant Proxy #50520
Conversation
Force-pushed from 81cbdc2 to 019b51d
Very nice. Some comments on how the proxy dials but overall this looks great.
Reviewed 2 of 2 files at r1, 16 of 16 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/kv/kvclient/kvtenant/proxy.go, line 67 at r2 (raw file):
// NewProxy creates a new Proxy.
func NewProxy(
rpc.Context has a naked Stopper as of #50438, in case you're interested in using that instead.
pkg/kv/kvclient/kvtenant/proxy.go, line 102 at r2 (raw file):
client, err := p.dial(ctx)
if err != nil {
	log.Warningf(ctx, "error dialing tenant KV address: %v", err)
If you switched to the singleton-dialing approach I advocated for, you wouldn't have to log anything in this method. Instead, you would log in the centralized loop - would be clear that if you can't contact KV, not much else will work.
pkg/kv/kvclient/kvtenant/proxy.go, line 131 at r2 (raw file):
nodeDescs := make(map[roachpb.NodeID]*roachpb.NodeDescriptor, len(resp.Descriptors))
for _, desc := range resp.Descriptors {
	nodeDescs[desc.NodeID] = desc
Intuitively I would have expected the stream to contain updates for individual nodes (i.e. delta-based), but it always sends a whole view of the cluster nodes.
I think this is just because it's easier and that's fine, perhaps add a comment.
pkg/kv/kvclient/kvtenant/proxy.go, line 187 at r2 (raw file):
// FirstRange implements the kvcoord.RangeDescriptorDB interface.
func (p *Proxy) FirstRange() (*roachpb.RangeDescriptor, error) {
	return nil, errors.New("kvtenant.Proxy does not have access to FirstRange")
Any plans for getting rid of this wart? The only call is
desc, err := rdc.db.FirstRange()
Seems a little annoying to get rid of, just wanted to know what your plans were, if any.
pkg/kv/kvclient/kvtenant/proxy.go, line 190 at r2 (raw file):
}

func (p *Proxy) dial(ctx context.Context) (roachpb.InternalClient, error) {
nit: the methods in this file had lined up nicely for review, but feels like dial ought to be further up, below where it's first used (with dialAddr following suit).
pkg/kv/kvclient/kvtenant/proxy.go, line 203 at r2 (raw file):
// expect the proxy to be pointed at a load balancer which will perform
// internal health checks and prioritize healthy destinations, so maybe
// that's not worth it.
The one question I have for you is whether the conn should be sticky. Right now, it seems like we're picking a random node for each call to dial. I would've thought we hold a *rpc.Connection around and dial just calls conn.Connect(ctx) and, on error, makes a new round-robin connection:
const key = "dial"
v, err, _ := p.singleFlight.Do(key, func() (interface{}, error) {
	// TODO put retry loop in here.
	conn, err := p.dialAddr(ctx, p.randomAddr())
	if err != nil {
		return nil, err
	}
	_, err = conn.Connect(ctx) // TODO use new ctx with timeout
	if err != nil {
		return nil, err
	}
	return conn, nil
})
if err != nil {
	return nil, err
}
cc, err := v.(*rpc.Connection).Connect(ctx)
if err != nil {
	if ctx.Err() == nil {
		// TODO: multiple callers may end up here so one of the Forget
		// calls may evict a newer, healthy, conn. Should make it such
		// that only the first caller to get an error gets to call Forget.
		// May not be worth using singleflight once we do that.
		p.singleFlight.Forget(key)
	}
	return nil, err
}
return cc, nil
}
As is, if we point tenants at a KV-wide LB, won't we end up with each tenant connecting to all of the KV nodes? That seems a little silly, should "just" pick one and stick with it until it goes away.
This also answers the breaker question - by limiting to a single dial at any given point in time as in my pseudo-Go above, we won't log spam. Also, due to the sticky connection, once we have a healthy node we won't try the unhealthy node any more, so it should self-stabilize quite well.
pkg/roachpb/api.proto, line 2101 at r2 (raw file):
// SQL-only processes to have access to StoreDescriptors is going away
// with #49997 and the tenant ZoneConfig updates might be better serviced
// with a polling mechanism, so I specialized the request. I'm curious
Why do you consider polling a better option for zone config updates? Zones are and will be distributed via Gossip internally in KV, right? If it folds in seamlessly with this RPC, I think we might as well do it.
pkg/server/node.go, line 954 at r2 (raw file):
	args *roachpb.RangeFeedRequest, stream roachpb.Internal_RangeFeedServer,
) error {
	growstack.Grow()
drive-by?
pkg/server/server_sql.go, line 131 at r2 (raw file):
externalStorageFromURI cloud.ExternalStorageFromURIFactory
// TODO(nvanbenschoten): Should this move to a second "optional" args
I don't have a strong urge to change this (other than the cleanups you've identified above and getting rid of some dependencies we don't want to have), but am open to it if you feel that it would help keep this more understandable.
pkg/server/testserver.go, line 498 at r2 (raw file):
nodeDialer := nodedialer.New(
	rpcContext,
	gossip.AddressResolver(g), // TODO(nvb): break gossip dep
🎆
@@ -0,0 +1,228 @@
// Copyright 2020 The Cockroach Authors.
//
// Use of this software is governed by the Business Source License
So close to being able to make this CCL, but not worth it right now.
Force-pushed from 2d954d6 to 1d3c66d
TFTR! The big change since you last saw this is that I've replaced the NodeInfo RPC with a more general GossipSubscription stream. This should resolve the discussion we were having below. PTAL.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)
pkg/kv/kvclient/kvtenant/proxy.go, line 3 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
So close to being able to make this CCL, but not worth it right now.
Could you remind me again why we can't do this now? Just because we'd need to pull this into a pkg/ccl/kvccl/kvtenant package?
pkg/kv/kvclient/kvtenant/proxy.go, line 67 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
rpc.Context has a naked Stopper as of #50438, in case you're interested in using that instead.
Done.
pkg/kv/kvclient/kvtenant/proxy.go, line 102 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
If you switched to the singleton-dialing approach I advocated for, you wouldn't have to log anything in this method. Instead, you would log in the centralized loop - would be clear that if you can't contact KV, not much else will work.
Done.
pkg/kv/kvclient/kvtenant/proxy.go, line 131 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Intuitively I would have expected the stream to contain updates for individual nodes (i.e. delta-based), but it always sends a whole view of the cluster nodes.
I think this is just because it's easier and that's fine, perhaps add a comment.
Yes, this was because it was easier to implement. A marshaled NodeDescriptor is on the order of 100 bytes (depending on the size of variable-length strings), so with a 512 node cluster, that's only about 50KB whenever a node joins or leaves the cluster.
But I've reworked this (see the discussion below), which naturally allowed this to become delta-based. So done.
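To make the delta-based shape concrete, here is a rough sketch of folding a single per-node gossip event into the cached descriptor map (stand-in types and an assumed "node:<id>" key format; not the actual implementation):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// NodeID and NodeDescriptor are stand-ins for the roachpb types; the real
// proxy unmarshals a roachpb.NodeDescriptor from the gossip value instead.
type NodeID int32

type NodeDescriptor struct {
	NodeID  NodeID
	Address string
}

// applyNodeEvent folds a single "node:<id>" gossip event into the cached
// descriptor map. This is what makes the protocol delta-based: each event
// only touches the one node it describes.
func applyNodeEvent(nodeDescs map[NodeID]*NodeDescriptor, key string, desc *NodeDescriptor) error {
	const prefix = "node:"
	if !strings.HasPrefix(key, prefix) {
		return fmt.Errorf("unexpected gossip key %q", key)
	}
	id, err := strconv.Atoi(strings.TrimPrefix(key, prefix))
	if err != nil {
		return fmt.Errorf("malformed node key %q: %v", key, err)
	}
	nodeDescs[NodeID(id)] = desc
	return nil
}

func main() {
	descs := make(map[NodeID]*NodeDescriptor)
	_ = applyNodeEvent(descs, "node:1", &NodeDescriptor{NodeID: 1, Address: "10.0.0.1:26257"})
	fmt.Printf("tracking %d node(s)\n", len(descs)) // tracking 1 node(s)
}
```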
pkg/kv/kvclient/kvtenant/proxy.go, line 187 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Any plans for getting rid of this wart? The only call is
desc, err := rdc.db.FirstRange()
Seems a little annoying to get rid of, just wanted to know what your plans were, if any.
No, I didn't have any plans to get rid of it. The RangeCache should never be requesting the FirstRange in a tenant process because it will only need to if it has already recursed up the meta range tree, which wouldn't be possible with this implementation of RangeDescriptorDB.
I guess that indicates that we should pull FirstRange out of the RangeDescriptorDB interface entirely and push the keys.RangeMetaKey(key).Equal(roachpb.RKeyMin) conditional logic into DistSender's implementation of RangeDescriptorDB. What do you think?
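To make that suggestion concrete, the refactor might look roughly like the following (a sketch with simplified stand-in types, not the real kvcoord interfaces):

```go
package sketch

// Rough sketch of the proposed refactor; the types are simplified stand-ins
// for the real kvcoord/roachpb definitions.

type RKey []byte

type RangeDescriptor struct{ StartKey, EndKey RKey }

// RangeDescriptorDB without FirstRange: callers only ever ask for the
// descriptors addressing a given key.
type RangeDescriptorDB interface {
	RangeLookup(key RKey) ([]RangeDescriptor, error)
}

// distSenderDB is DistSender's implementation. It hides the special case
// that a lookup which has recursed all the way up to meta1 must be served
// from the (gossiped) first-range descriptor rather than by another lookup.
type distSenderDB struct {
	firstRange func() (*RangeDescriptor, error)          // e.g. sourced from gossip
	scanMeta   func(key RKey) ([]RangeDescriptor, error) // a regular meta-range scan
}

func (db distSenderDB) RangeLookup(key RKey) ([]RangeDescriptor, error) {
	// Stands in for the keys.RangeMetaKey(key).Equal(roachpb.RKeyMin) check.
	if isMeta1RangeKey(key) {
		desc, err := db.firstRange()
		if err != nil {
			return nil, err
		}
		return []RangeDescriptor{*desc}, nil
	}
	return db.scanMeta(key)
}

func isMeta1RangeKey(key RKey) bool { return len(key) == 0 } // placeholder predicate
```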
pkg/kv/kvclient/kvtenant/proxy.go, line 190 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
nit: the methods in this file had lined up nicely for review, but feels like dial ought to be further up, below where it's first used (with dialAddr following suit).
Hm, not totally sure I agree. I think it's nice to separate the business logic from the networking logic.
pkg/kv/kvclient/kvtenant/proxy.go, line 203 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
The one question I have for you is whether the conn should be sticky. Right now, it seems like we're picking a random node for each call to dial. I would've thought we hold a *rpc.Connection around and dial just calls conn.Connect(ctx) and, on error, makes a new round-robin connection:
const key = "dial"
v, err, _ := p.singleFlight.Do(key, func() (interface{}, error) {
	// TODO put retry loop in here.
	conn, err := p.dialAddr(ctx, p.randomAddr())
	if err != nil {
		return nil, err
	}
	_, err = conn.Connect(ctx) // TODO use new ctx with timeout
	if err != nil {
		return nil, err
	}
	return conn, nil
})
if err != nil {
	return nil, err
}
cc, err := v.(*rpc.Connection).Connect(ctx)
if err != nil {
	if ctx.Err() == nil {
		// TODO: multiple callers may end up here so one of the Forget
		// calls may evict a newer, healthy, conn. Should make it such
		// that only the first caller to get an error gets to call Forget.
		// May not be worth using singleflight once we do that.
		p.singleFlight.Forget(key)
	}
	return nil, err
}
return cc, nil
}
As is, if we point tenants at a KV-wide LB, won't we end up with each tenant connecting to all of the KV nodes? That seems a little silly, should "just" pick one and stick with it until it goes away.
This also answers the breaker question - by limiting to a single dial at any given point in time as in my pseudo-Go above, we won't log spam. Also, due to the sticky connection, once we have a healthy node we won't try the unhealthy node any more, so it should self-stabilize quite well.
Very good point. We can't use a singleFlight group quite like your example wants because the group doesn't maintain state after the request completes, but I updated the code with this as inspiration. We now cache a single InternalClient and use it as long as possible across all clients of the proxy. PTAL.
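For reference, the shape of the cached-client approach is roughly the following (a simplified sketch with illustrative names, not the actual kvtenant.Proxy code; the real version also retries and re-dials on error):

```go
package sketch

import (
	"context"
	"sync"
)

// InternalClient stands in for roachpb.InternalClient.
type InternalClient interface{}

// proxy sketches caching a single client across all callers; field and
// method names are illustrative, not the actual kvtenant.Proxy fields.
type proxy struct {
	dialOne func(ctx context.Context) (InternalClient, error) // dials one (random) address

	mu struct {
		sync.Mutex
		client InternalClient // nil until the first successful dial
	}
}

func (p *proxy) getClient(ctx context.Context) (InternalClient, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.mu.client != nil {
		return p.mu.client, nil // sticky: reuse as long as possible
	}
	c, err := p.dialOne(ctx)
	if err != nil {
		return nil, err
	}
	p.mu.client = c
	return c, nil
}

// invalidate is called when an RPC observes a connection error, so that the
// next caller dials a (possibly different) node.
func (p *proxy) invalidate() {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.mu.client = nil
}
```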
pkg/roachpb/api.proto, line 2101 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Why do you consider polling a better option for zone config updates? Zones are and will be distributed via Gossip internally in KV, right? If it folds in seamlessly with this RPC, I think we might as well do it.
I was struggling to find the right abstraction here for the packaging of node descriptors and a single specific zone config over a shared stream. I've decided to generalize this interface significantly as a way to make it more useful and also more easily applicable here.
The result is replacing NodeInfo with a more general GossipSubscription RPC, which is defined as:
// GossipSubscriptionRequest initiates a game of telephone. It establishes an
// indefinite stream that proxies gossip information overheard by the recipient
// node back to the caller. Gossip information is filtered down to just those
// identified by a key matching any of the specified patterns.
//
// Upon establishment of the stream, all existing information that matches one
// or more of the patterns is returned. After this point, only new information
// matching the patterns is returned.
message GossipSubscriptionRequest {
repeated string patterns = 1;
}
// GossipSubscriptionEvent is a single piece of proxied gossip information.
message GossipSubscriptionEvent {
string key = 1;
Value content = 2 [(gogoproto.nullable) = false];
// Which pattern does this gossip information match?
string pattern_matched = 3;
// If non-nil, the other fields will be empty and this will be the final event
// send on the stream before it is terminated.
Error error = 4;
}
The idea is that we then put access control policies on the specific patterns that a tenant process is able to request and then let clients request whichever patterns they want. So instead of creating a separate RPC for each subset of the gossip network information that we export, we can have clients request multiple patterns and validate that the patterns requested are permitted. I expect that the first two patterns allowed will be:
- "node:.*"
- "system-db:zones/1/tenants"
One nice thing that falls out of this is that I was able to remove all of the gossip changes I had added in before and use the standard RegisterCallback interface instead. Another nice fallout from this is that we naturally get a delta-based protocol instead of the original bulk replacement protocol.
A wart in the plan is that the SystemConfig is stored under a single "system-db" key that contains a struct{vals []roachpb.KeyValue} value. My plan is to add some notion of indexing/filtering into the interfaces that gossip provides so that callers can specify key prefixes that they want to hear updates about and resulting SystemConfig structs would be filtered down to just these keys. Effectively, virtualizing the existence of multiple gossip keys from the perspective of the callback. This is essentially what we're already doing with the SystemConfigDeltaFilter in places like:
cockroach/pkg/sql/gcjob/refresh_statuses.go
Lines 276 to 279 in e813aee
k := execCfg.Codec.IndexPrefix(uint32(keys.ZonesTableID), uint32(keys.ZonesTablePrimaryIndexID))
zoneCfgFilter := gossip.MakeSystemConfigDeltaFilter(k)
gossipUpdateC := execCfg.Gossip.DeprecatedRegisterSystemConfigChannel(47150)
return zoneCfgFilter, gossipUpdateC
But we could now push this complexity into Gossip itself. We'd then add a pattern like "system-db:zones/1/tenants" to the allowlist of patterns that a tenant can request in a gossip subscription.
What do you think?
pkg/roachpb/api.proto, line 2053 at r4 (raw file):
repeated RangeDescriptor prefetched_descriptors = 2 [(gogoproto.nullable) = false];
// If non-nil, the other fields will be empty.
Error error = 3;
I almost made the mistake of conflating RPC errors and logical errors, which made error handling in the proxy much more complicated. This is now fixed.
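Roughly, the separation now looks like this (a sketch with stand-in types; the real proto uses roachpb.Error and the proxy's retry logic is more involved): a stream/RPC error means the connection is suspect and should be replaced, while a structured in-band error is simply returned to the caller.

```go
package sketch

import "context"

// Types below stand in for the real roachpb-generated ones; this is a
// sketch of the error separation, not the actual proxy code.
type RangeDescriptor struct{}

type RangeLookupResponse struct {
	Descriptors []RangeDescriptor
	Err         error // in-band, logical error (roachpb.Error in the real proto)
}

type proxy struct {
	sendRangeLookup func(ctx context.Context, key []byte) (*RangeLookupResponse, error)
	invalidate      func() // drop the cached connection so the next call re-dials
}

func (p *proxy) rangeLookup(ctx context.Context, key []byte) ([]RangeDescriptor, error) {
	resp, err := p.sendRangeLookup(ctx, key)
	if err != nil {
		// RPC/stream failure: the connection is suspect, so invalidate it and
		// let the caller (or a retry loop) dial a new node.
		p.invalidate()
		return nil, err
	}
	if resp.Err != nil {
		// Logical failure: the KV node was reachable and answered; surface
		// the error without touching the cached connection.
		return nil, resp.Err
	}
	return resp.Descriptors, nil
}
```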
pkg/server/node.go, line 954 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
drive-by?
Yeah, this wasn't needed.
pkg/server/server_sql.go, line 131 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
I don't have a strong urge to change this (other than the cleanups you've identified above and getting rid of some dependencies we don't want to have), but am open to it if you feel that it would help keep this more understandable.
I'll keep the TODO (without the questions) but won't address it now.
Force-pushed from 1d3c66d to 82267f2
@andy-kimball and I talked about this late last week and we decided that it would be a good idea to move the proxy implementation under the CCL license while we still can. Doing so was very straightforward, so I made the change here. PTAL.
Caught up on the diffs, now to review the proxy connection code again (diff wasn't visible because of the move to CCL)
Reviewed 15 of 15 files at r5, 18 of 18 files at r6, 4 of 4 files at r7.
Dismissed @nvanbenschoten from a discussion.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @tbg)
pkg/kv/kvclient/kvtenant/proxy.go, line 187 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
No, I didn't have any plans to get rid of it. The RangeCache should never be requesting the FirstRange in a tenant process because it will only need to if it has already recursed up the meta range tree, which wouldn't be possible with this implementation of RangeDescriptorDB.
I guess that indicates that we should pull FirstRange out of the RangeDescriptorDB interface entirely and push the keys.RangeMetaKey(key).Equal(roachpb.RKeyMin) conditional logic into DistSender's implementation of RangeDescriptorDB. What do you think?
Yes, this was the thought I also had. Let's leave it for a follow-up though, I'm already a bit lost in this one.
pkg/roachpb/api.proto, line 2101 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I was struggling to find the right abstraction here for the packaging of node descriptors and a single specific zone config over a shared stream. I've decided to generalize this interface significantly as a way to make it more useful and also more easily applicable here.
The result is replacing NodeInfo with a more general GossipSubscription RPC, which is defined as:
// GossipSubscriptionRequest initiates a game of telephone. It establishes an
// indefinite stream that proxies gossip information overheard by the recipient
// node back to the caller. Gossip information is filtered down to just those
// identified by a key matching any of the specified patterns.
//
// Upon establishment of the stream, all existing information that matches one
// or more of the patterns is returned. After this point, only new information
// matching the patterns is returned.
message GossipSubscriptionRequest {
repeated string patterns = 1;
}
// GossipSubscriptionEvent is a single piece of proxied gossip information.
message GossipSubscriptionEvent {
string key = 1;
Value content = 2 [(gogoproto.nullable) = false];
// Which pattern does this gossip information match?
string pattern_matched = 3;
// If non-nil, the other fields will be empty and this will be the final event
// send on the stream before it is terminated.
Error error = 4;
}
The idea is that we then put access control policies on the specific patterns that a tenant process is able to request and then let clients request whichever patterns they want. So instead of creating a separate RPC for each subset of the gossip network information that we export, we can have clients request multiple patterns and validate that the patterns requested are permitted. I expect that the first two patterns allowed will be:
- "node:.*"
- "system-db:zones/1/tenants"
One nice thing that falls out of this is that I was able to remove all of the gossip changes I had added in before and use the standard RegisterCallback interface instead. Another nice fallout from this is that we naturally get a delta-based protocol instead of the original bulk replacement protocol.
A wart in the plan is that the SystemConfig is stored under a single "system-db" key that contains a struct{vals []roachpb.KeyValue} value. My plan is to add some notion of indexing/filtering into the interfaces that gossip provides so that callers can specify key prefixes that they want to hear updates about and resulting SystemConfig structs would be filtered down to just these keys. Effectively, virtualizing the existence of multiple gossip keys from the perspective of the callback. This is essentially what we're already doing with the SystemConfigDeltaFilter in places like:
in places like:
cockroach/pkg/sql/gcjob/refresh_statuses.go
Lines 276 to 279 in e813aee
k := execCfg.Codec.IndexPrefix(uint32(keys.ZonesTableID), uint32(keys.ZonesTablePrimaryIndexID))
zoneCfgFilter := gossip.MakeSystemConfigDeltaFilter(k)
gossipUpdateC := execCfg.Gossip.DeprecatedRegisterSystemConfigChannel(47150)
return zoneCfgFilter, gossipUpdateC
But we could now push this complexity into Gossip itself. We'd then add a pattern like "system-db:zones/1/tenants" to the allowlist of patterns that a tenant can request in a gossip subscription. What do you think?
Nice.
pkg/server/node.go, line 1027 at r6 (raw file):
// RegisterSystemConfigChannel for any SystemConfig patterns.
// NB: all callbacks are run on a single goroutine.
Unless this is a well-documented feature of gossip callbacks, I would rather not rely on this and instead close over a loop-local mutex that ensures sequential execution. Reading this makes me feel obligated to go and check whether it's (still) true.
pkg/server/node.go, line 1058 at r6 (raw file):
// The consumer was not keeping up with gossip updates, so its
// subscription was terminated to avoid blocking gossip.
err := roachpb.NewErrorf("subscription terminated due to slow consumption")
Is this worth logging? It will be logged at the SQL node, right?
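For context, the server side of the subscription boils down to roughly the following (a sketch under assumed names, not the actual Node.GossipSubscription code). It shows both points under discussion here: a loop-local mutex serializing the gossip callback, and termination of the subscription when the consumer falls behind rather than blocking gossip.

```go
package sketch

import (
	"context"
	"errors"
	"sync"
)

// event stands in for roachpb.GossipSubscriptionEvent.
type event struct {
	Key     string
	Content []byte
}

// serveSubscription sketches the handler loop: a gossip callback (which would
// be registered for each requested pattern) pushes matching events into a
// buffered channel, serialized by a local mutex rather than by relying on
// gossip running all callbacks on a single goroutine.
func serveSubscription(ctx context.Context, send func(event) error) error {
	var mu sync.Mutex
	closed := false
	entC := make(chan event, 256)

	callback := func(key string, content []byte) {
		mu.Lock()
		defer mu.Unlock()
		if closed {
			return
		}
		select {
		case entC <- event{Key: key, Content: content}:
		default:
			// The consumer is not keeping up; close the channel so the loop
			// below tears down the subscription instead of blocking gossip.
			closed = true
			close(entC)
		}
	}
	_ = callback // would be passed to gossip.RegisterCallback for each pattern

	for {
		select {
		case ev, ok := <-entC:
			if !ok {
				return errors.New("subscription terminated due to slow consumption")
			}
			if err := send(ev); err != nil {
				return err
			}
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}
```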
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
Force-pushed from 82267f2 to 24a804e
TFTR!
I had to move the 3node-tenant logic test configuration to ccl/logictestccl in order to support the proxy movement, but I think it actually came out nicely.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)
pkg/kv/kvclient/kvtenant/proxy.go, line 187 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Yes, this was the thought I also had. Let's leave it for a follow-up though, I'm already a bit lost in this one.
Added a TODO.
pkg/server/node.go, line 1027 at r6 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Unless this is a well-documented feature of gossip callbacks, I would rather not rely on this and instead close over a loop-local mutex that ensures sequential execution. Reading this makes me feel obligated to go and check whether it's (still) true.
Done.
pkg/server/node.go, line 1058 at r6 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Is this worth logging? It will be logged at the SQL node, right?
Done. It will also be logged at the SQL node.
Force-pushed from 24a804e to 22d86b2
Force-pushed from 5d359e7 to 4280339
Fixes cockroachdb#47909.

This commit starts by adding two RPCs to the Internal service:
```
service Internal {
  ...
  rpc RangeFeed          (RangeFeedRequest)          returns (stream RangeFeedEvent)          {}
  rpc GossipSubscription (GossipSubscriptionRequest) returns (stream GossipSubscriptionEvent) {}
}

// RangeLookupRequest is a request to proxy a RangeLookup through a Tenant
// service. Its fields correspond to a subset of the args of kv.RangeLookup.
message RangeLookupRequest {
  ...
}

// GossipSubscriptionRequest initiates a game of telephone. It establishes an
// indefinite stream that proxies gossip information overheard by the recipient
// node back to the caller. Gossip information is filtered down to just those
// identified by a key matching any of the specified patterns.
//
// Upon establishment of the stream, all existing information that matches one
// or more of the patterns is returned. After this point, only new information
// matching the patterns is returned.
message GossipSubscriptionRequest {
  ...
}
```

The commit then introduces a new `kvtenant.Proxy` object. Proxy mediates the communication of cluster-wide state to sandboxed SQL-only tenant processes through a restricted interface. A Proxy is seeded with a set of one or more network addresses that reference existing KV nodes in the cluster (or a load-balancer which fans out to some/all KV nodes). On startup, it establishes contact with one of these nodes to learn about the topology of the cluster and bootstrap the rest of SQL <-> KV network communication.

Proxy has two main roles:

First, Proxy is capable of providing information on each of the KV nodes in the cluster in the form of NodeDescriptors. This obviates the need for SQL-only tenant processes to join the cluster-wide gossip network. In doing so, it satisfies the `NodeDescStore` interface and can be used as an `AddressResolver` with a small adapter.

Second, Proxy is capable of providing Range addressing information in the form of RangeDescriptors through delegated RangeLookup requests. This is necessary because SQL-only tenants are restricted from reading Range Metadata keys directly. Instead, the RangeLookup requests are proxied through existing KV nodes while being subject to additional validation (e.g. is the Range being requested owned by the requesting tenant?). In doing so, it satisfies the `RangeDescriptorDB` interface and can be used to delegate all DistSender/RangeCache descriptor lookups to KV nodes.

With this commit, we can mostly run a SQL-only tenant process without joining the KV cluster's gossip network. This works if I comment out a few of the uses of gossip due to cockroachdb#49692 and cockroachdb#47150 in SQL. Notably, with the call to `DeprecatedRegisterSystemConfigChannel` in `sql.Server.Start` removed, I can remove `Gossip` from `makeSQLServerArgs` entirely and things "just work".
Package movement.
Force-pushed from 4280339 to b19794d
bors r+
Build succeeded
52034: kvtenant: implement SystemConfigProvider, remove Gossip from SQL-only server r=nvanbenschoten a=nvanbenschoten

Fixes #49445.

This commit introduces a new SystemConfigProvider abstraction, which is capable of providing the SystemConfig, as well as notifying clients of updates to the SystemConfig. Gossip already implements this interface.

The commit then updates SQL to use this new dependency in place of Gossip whenever it needs to access the SystemConfig. After making this change, it then updates the kvtenant.Proxy to implement the new SystemConfigProvider interface. This is powered by the GossipSubscription RPC that was added in #50520. The commit updates the subscription to also match on the "system-db" gossip key, and just like that, it can provide the SystemConfig to SQL [*].

Finally, with the kvtenant.Proxy serving the role of a SystemConfigProvider to SQL when applicable, we're able to remove gossip entirely from StartTenant. SQL-only servers will no longer join the gossip network, which is a nice milestone for all of this work.

[*] there are a few remaining questions about how exactly we want to enforce an access control policy on the system config gossip pattern. See the updated comment in `Node.GossipSubscription`. For now, we're just returning the entire SystemConfig to the subscription.

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
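The abstraction referenced above boils down to an interface of roughly this shape (approximate, with stand-in types; the concrete definition in the codebase may differ in detail), which both Gossip and kvtenant.Proxy can satisfy:

```go
package sketch

// KeyValue and SystemConfig stand in for the roachpb/config types.
type KeyValue struct {
	Key   []byte
	Value []byte
}

type SystemConfig struct {
	Values []KeyValue // the flattened contents of the system config span
}

// SystemConfigProvider hands out the current SystemConfig and a channel that
// is signaled whenever a new SystemConfig is available.
type SystemConfigProvider interface {
	GetSystemConfig() *SystemConfig
	RegisterSystemConfigChannel() <-chan struct{}
}
```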
Closes #47909.
This commit starts by adding two RPCs to the Internal service:
The commit then introduces a new kvtenant.Proxy object. Proxy mediates the communication of cluster-wide state to sandboxed SQL-only tenant processes through a restricted interface. A Proxy is seeded with a set of one or more network addresses that reference existing KV nodes in the cluster (or a load-balancer which fans out to some/all KV nodes). On startup, it establishes contact with one of these nodes to learn about the topology of the cluster and bootstrap the rest of SQL <-> KV network communication.

Proxy has two main roles:

First, Proxy is capable of providing information on each of the KV nodes in the cluster in the form of NodeDescriptors. This obviates the need for SQL-only tenant processes to join the cluster-wide gossip network. In doing so, it satisfies the NodeDescStore interface and can be used as an AddressResolver with a small adapter.

Second, Proxy is capable of providing Range addressing information in the form of RangeDescriptors through delegated RangeLookup requests. This is necessary because SQL-only tenants are restricted from reading Range Metadata keys directly. Instead, the RangeLookup requests are proxied through existing KV nodes while being subject to additional validation (e.g. is the Range being requested owned by the requesting tenant?). In doing so, it satisfies the RangeDescriptorDB interface and can be used to delegate all DistSender/RangeCache descriptor lookups to KV nodes.

With this commit, we can mostly run a SQL-only tenant process without joining the KV cluster's gossip network. This works if I comment out a few of the uses of gossip due to #49692 and #47150 in SQL. Notably, with the call to DeprecatedRegisterSystemConfigChannel in sql.Server.Start removed, I can remove Gossip from makeSQLServerArgs entirely and things "just work".
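To illustrate the "small adapter" between NodeDescStore and AddressResolver mentioned above, a rough sketch (simplified stand-in types, not the real roachpb/nodedialer signatures):

```go
package sketch

import (
	"fmt"
	"net"
)

// NodeID, NodeDescriptor, and the interfaces below are simplified stand-ins
// for the real roachpb / nodedialer definitions.
type NodeID int32

type NodeDescriptor struct {
	NodeID  NodeID
	Address string // host:port
}

// NodeDescStore is the role the Proxy fills: node ID -> node descriptor.
type NodeDescStore interface {
	GetNodeDescriptor(NodeID) (*NodeDescriptor, error)
}

// AddressResolver is the shape the node dialer wants: node ID -> net.Addr.
type AddressResolver func(NodeID) (net.Addr, error)

// MakeAddressResolver is the "small adapter": it turns any NodeDescStore
// (such as the tenant Proxy) into an AddressResolver.
func MakeAddressResolver(store NodeDescStore) AddressResolver {
	return func(id NodeID) (net.Addr, error) {
		desc, err := store.GetNodeDescriptor(id)
		if err != nil {
			return nil, fmt.Errorf("unable to look up descriptor for n%d: %v", id, err)
		}
		return net.ResolveTCPAddr("tcp", desc.Address)
	}
}
```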