-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvtenant: implement SystemConfigProvider, remove Gossip from SQL-only server #52034
kvtenant: implement SystemConfigProvider, remove Gossip from SQL-only server #52034
Conversation
bc879ac
to
34993f8
Compare
34993f8
to
0806f49
Compare
Fixes cockroachdb#47898. Rebased on cockroachdb#51503 and cockroachdb#52034. Ignore all but the last 3 commits. This commit adds a collection of access control policies for the newly exposed tenant RPC server. These authorization policies ensure that an authenticated tenant is only able to access keys within its keyspace and that no tenant is able to access data from another tenant's keyspace through the tenant RPC server. This is a major step in providing crypto-backed logical isolation between tenants in a multi-tenant cluster. The existing auth mechanism is retained on the standard RPC server, which means that the system tenant is still able to access any key in the system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. As far as I'm concerned, this is ready to go when the fixups I will be pushing shortly are squashed.
Reviewed 1 of 1 files at r1, 15 of 15 files at r2, 21 of 21 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/ccl/kvccl/kvtenantccl/proxy.go, line 242 at r2 (raw file):
// GetSystemConfig implements the config.SystemConfigProvider interface. func (p *Proxy) GetSystemConfig() *config.SystemConfig { // TODO DURING REVIEW: how does this work with Gossip when the SystemConfig
Yes, I think the current code plays fast and loose here. We typically have Gossip connected by the time SQL needs a system config and when gossip connects, I suppose the system config is among the first things that are immediately received.
In the Connector (i.e. tenant.Proxy), I think we want to wait here:
cockroach/pkg/ccl/kvccl/kvtenantccl/proxy.go
Line 162 in 6e1a56b
close(p.startupC) |
until we've populated the system config. Since we call .Start()
first thing when starting the SQL server, that effectively means that the system config is always available. Even in this code now, though, we're roughly in the same place as before: we're not guaranteed to have the zone config, but since we can't reach this code before a subscription has received data, it's quite likely that we do have it already.
I tried something new - I made a fixup commit for you and am pushing it to your branch past the review. Let me know if that's something we should do more often.
PS: gh
makes this kind of thing quite easy.
pkg/ccl/kvccl/kvtenantccl/proxy.go, line 255 at r2 (raw file):
func (p *Proxy) RegisterSystemConfigChannel() <-chan struct{} { // Create channel that receives new system config notifications. // The channel has a size of 1 to prevent proxy from blocking on it.
nit: "having to block on it" is more precise, right? We never blocking-send into this thing.
Added fixup commit as well.
4159d2c
to
93b694f
Compare
I actually went back and squashed the fixups myself. That way you can get a green CI when you open this hopefully. Again, let me know if you think this is a good way to go about it, I have a feeling it can avoid time typing things into review and copying ideas from reviewable over to code. |
Only the entries are used, so it's confusing to be manipulating a full *SystemConfig. This also saves an allocation per update, though that's hardly important.
… server Fixes cockroachdb#49445. This commit introduces a new SystemConfigProvider abstraction, which is capable of providing the SystemConfig, as well as notifying clients of updates to the SystemConfig. Gossip already implements this interface. The commit then updates SQL to use this new dependency in place of Gossip whenever it needs to access the SystemConfig. After making this change, it then updates the kvtenant.Proxy to implement the new SystemConfigProvider interface. This is powered by the GossipSubscription RPC that was added in cockroachdb#50520. The commit updates the subscription to also match on the "system-db" gossip key, and just like that, it can provide the SystemConfig to SQL [*]. Finally, with the kvtenant.Proxy serving the role of a SystemConfigProvider to SQL when applicable, we're able to remove gossip entirely from StartTenant. SQL-only servers will no longer join the gossip network, which is a nice milestone for all of this work. [*] there are a few remaining questions about how exactly we want to enforce an access control policy on the system config gossip pattern. See the updated comment in `Node.GossipSubscription`. For now, we're just returning the entire SystemConfig to the subscription.
Addresses a TODO and simplifies code.
93b694f
to
6c1a90b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)
pkg/ccl/kvccl/kvtenantccl/proxy.go, line 242 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Yes, I think the current code plays fast and loose here. We typically have Gossip connected by the time SQL needs a system config and when gossip connects, I suppose the system config is among the first things that are immediately received.
In the Connector (i.e. tenant.Proxy), I think we want to wait here:
cockroach/pkg/ccl/kvccl/kvtenantccl/proxy.go
Line 162 in 6e1a56b
close(p.startupC) until we've populated the system config. Since we call
.Start()
first thing when starting the SQL server, that effectively means that the system config is always available. Even in this code now, though, we're roughly in the same place as before: we're not guaranteed to have the zone config, but since we can't reach this code before a subscription has received data, it's quite likely that we do have it already.I tried something new - I made a fixup commit for you and am pushing it to your branch past the review. Let me know if that's something we should do more often.
PS:
gh
makes this kind of thing quite easy.
I'm mostly fine with this approach, though it does make things confusing on the PR author's side because now my remote has a newer revision than my local branch and I need to force pull it down. In this case, that's fine, but I often have PRs where I allow my remote branch to lag behind my local branch (mainly due to origin/master rebases), so this kind of thing would cause the two to completely diverge, which would end up creating more work than the alternative. Maybe gh
makes this all more seamless, but I don't know if it's really worth the hassle.
It also makes it a bit awkward to determine what actually changed. I'd say if we're going to do this, we should keep the fixups in a separate commit and let the PR author squash them in when they want to. I don't mind the extra CI round on large PRs like this.
pkg/ccl/kvccl/kvtenantccl/proxy.go, line 255 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
nit: "having to block on it" is more precise, right? We never blocking-send into this thing.
Added fixup commit as well.
👍 Made the same change in Gossip.RegisterSystemConfigChannel
.
Build failed (retrying...): |
Build failed: |
bors r+ |
Build succeeded: |
Fixes cockroachdb#47898. Rebased on cockroachdb#51503 and cockroachdb#52034. Ignore all but the last 3 commits. This commit adds a collection of access control policies for the newly exposed tenant RPC server. These authorization policies ensure that an authenticated tenant is only able to access keys within its keyspace and that no tenant is able to access data from another tenant's keyspace through the tenant RPC server. This is a major step in providing crypto-backed logical isolation between tenants in a multi-tenant cluster. The existing auth mechanism is retained on the standard RPC server, which means that the system tenant is still able to access any key in the system.
51803: cmd/docgen: add HTTP extractor r=mjibson a=mjibson Add a way to extract docs from the status.proto HTTP endpoint. These can be imported into the docs project as needed. Release note: None 52083: roachtest: small misc r=andreimatei a=andreimatei See individual commits. 52094: rpc: implement tenant access control policies at KV RPC boundary r=nvanbenschoten a=nvanbenschoten Fixes #47898. Rebased on #51503 and #52034. Ignore all but the last 3 commits. This commit adds a collection of access control policies for the newly exposed tenant RPC server. These authorization policies ensure that an authenticated tenant is only able to access keys within its keyspace and that no tenant is able to access data from another tenant's keyspace through the tenant RPC server. This is a major step in providing crypto-backed logical isolation between tenants in a multi-tenant cluster. The existing auth mechanism is retained on the standard RPC server, which means that the system tenant is still able to access any key in the system. 52352: sql/pgwire: add regression test for varchar OIDs in RowDescription r=jordanlewis a=rafiss See issue #51360. The bug described in it was fixed somewhat accidentally, so this test will verify that we don't regress again. Release note: None 52386: opt: add SerializingProject exec primitive r=RaduBerinde a=RaduBerinde The top-level projection of a query has a special property - it can project away columns that we want an ordering on (e.g. `SELECT a FROM t ORDER BY b`). The distsql physical planner was designed to tolerate such cases, as they were much more common with the heuristic planner. But the new distsql exec factory does not; it currently relies on a hack: it detects this case by checking if the required output ordering is `nil`. This is fragile and doesn't work in all cases. This change adds a `SerializingProject` primitive which is like a SimpleProject but it forces serialization of all parallel streams into one. The new primitive is used to enforce the final query presentation. We only need to pass column names for the presentation, so we remove `RenameColumns` and remove the column names argument from `SimpleProject` (simplifying some execbuilder code). We also fix a bug in `ConstructSimpleProject` where we weren't taking the `PlanToStreamColMap` into account when building the projection. Release note: None Co-authored-by: Matt Jibson <matt.jibson@gmail.com> Co-authored-by: Andrei Matei <andrei@cockroachlabs.com> Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com> Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com> Co-authored-by: Radu Berinde <radu@cockroachlabs.com>
Fixes #49445.
This commit introduces a new SystemConfigProvider abstraction, which is capable of providing the SystemConfig, as well as notifying clients of updates to the SystemConfig. Gossip already implements this interface. The commit then updates SQL to use this new dependency in place of Gossip whenever it needs to access the SystemConfig.
After making this change, it then updates the kvtenant.Proxy to implement the new SystemConfigProvider interface. This is powered by the GossipSubscription RPC that was added in #50520. The commit updates the subscription to also match on the "system-db" gossip key, and just like that, it can provide the SystemConfig to SQL [*].
Finally, with the kvtenant.Proxy serving the role of a SystemConfigProvider to SQL when applicable, we're able to remove gossip entirely from StartTenant. SQL-only servers will no longer join the gossip network, which is a nice milestone for all of this work.
[*] there are a few remaining questions about how exactly we want to enforce an access control policy on the system config gossip pattern. See the updated comment in
Node.GossipSubscription
. For now, we're just returning the entire SystemConfig to the subscription.