-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
server: hook up tenant capabilities subsystem on startup #96390
Conversation
7a761a8
to
31014b6
Compare
31014b6
to
b48c0ea
Compare
b48c0ea
to
55e9f5b
Compare
1edb01e
to
08e901a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 7 of 38 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @ecwall, and @knz)
pkg/server/server.go
line 1925 at r1 (raw file):
return errors.Wrap(err, "failed to initialize the tenant settings watcher") } if err := s.node.tenantCapabilitiesWatcher.Start(ctx); err != nil {
I don't understand what the benefit is of having the capabilities watcher attached to node
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dhartunian, @ecwall, and @knz)
pkg/server/server.go
line 1925 at r1 (raw file):
Previously, dhartunian (David Hartunian) wrote…
I don't understand what the benefit is of having the capabilities watcher attached to
node
.
I was following the pattern of creating a Watcher
during NewServer
and starting it here. node
felt like the right place to put the Watcher
, given it runs on KV nodes. Is there a better alternative than node
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is pretty good.
Reviewed 38 of 38 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @dhartunian, and @ecwall)
pkg/multitenant/tenantcapabilities/capabilities.go
line 47 at r1 (raw file):
// is not allowed to execute the supplied batch request given the capabilities // it possesses. HasCapabilityForBatch(context.Context, roachpb.TenantID, *roachpb.BatchRequest) error
IMHO we should separate "ok/fail" (a boolean) from an error here. They say different stories.
We've separated them too in e.g. (*server.authenticationServer).verifySession()
.
pkg/multitenant/tenantcapabilities/capabilities.go
line 52 at r1 (raw file):
// Authorizer post-creation. The Authorizer uses the Reader to consult the // global tenant capability state to authorize incoming requests. This // function cannot be used to update the Reader; it may be called at-most
I don't fully understand the restriction about "at most once". What bad things happen if you call it multiple times?
pkg/multitenant/tenantcapabilities/tenantcapabilitiesauthorizer/authorizer.go
line 46 at r1 (raw file):
} if a.capabilitiesReader == nil { log.Fatal(ctx, "trying to perform capability check when no Reader exists")
return errors.AssertionFailedf
and if you feel fancy logcrash.ReportOrPanic
. log.Fatal is out of style.
pkg/multitenant/tenantcapabilities/tenantcapabilitiesauthorizer/authorizer.go
line 74 at r1 (raw file):
func (a *Authorizer) BindReader(ctx context.Context, reader tenantcapabilities.Reader) { if a.capabilitiesReader != nil { log.Fatal(ctx, "cannot bind a tenant capabilities reader more than once")
see my previous comment
pkg/multitenant/tenantcapabilities/tenantcapabilitiesauthorizer/noop.go
line 37 at r1 (raw file):
// BindReader implements the tenantcapabilities.Authorizer interface. func (n *NoopAuthorizer) BindReader(context.Context, tenantcapabilities.Reader) { panic("disallowed")
I don't understand this -why not?
pkg/rpc/auth_tenant.go
line 174 at r1 (raw file):
// contained in here, there should be a corresponding capability. Once that's // done, we can get rid of this loop entirely and perform all checks inside // the capabilities Authorizer above.
I don't understand this comment. To me, this entire check should be done inside HasCapabilityForBatch()
:
- request types that have a capability for them get a check.
- others are allowed/denied depending on the chosen default behavior.
pkg/rpc/context.go
line 503 at r1 (raw file):
// Authorizer provides a handle into the tenantcapabilities subsystem. It // allows KV nodes to perform capability checks for incoming tenant requests.
Let's call it TenantRPCAuthorizer
. Authorizer
is a bit too generic.
pkg/server/server.go
line 1925 at r1 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
I was following the pattern of creating a
Watcher
duringNewServer
and starting it here.node
felt like the right place to put theWatcher
, given it runs on KV nodes. Is there a better alternative thannode
?
Server
(variable s
in this scope)
pkg/ccl/multitenantccl/tenantcapabilitiesccl/testdata/can_admin_split
line 25 at r1 (raw file):
SELECT pretty FROM [ALTER TABLE t SPLIT AT VALUES (0)] ---- pq: at or near "alter": syntax error
what's going on here - what's this syntax error about?
pkg/ccl/multitenantccl/tenantcapabilitiesccl/capabilities_test.go
line 94 at r1 (raw file):
// Create a tenant; we also want to allow test writers to issue // ALTER TABLE ... SPLIT statements, so configure the settings as such.
I think it would feel more natural to me if we saw the ALTER TENANT ALL SET CLUSTER SETTING ... = true
inside the testdata with query-sql-system
instead of as an override here.
Ditto for the others we'll be checking afterwards.
pkg/rpc/auth_test.go
line 201 at r1 (raw file):
defer leaktest.AfterTest(t)() tenID := roachpb.MustMakeTenantID(10) makeDisallowedAdminReq := func(key string) roachpb.Request {
I'm not going to object to you extracting makeSpan
into a top level function, but then call it makeSpanShared
then alias it here with
makeSpan := func(key) { return makeSpanShared(t, key) }
to remove the rest of the diff below. It makes the test harder to read/maintain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @dhartunian, and @ecwall)
pkg/ccl/multitenantccl/tenantcapabilitiesccl/capabilities_test.go
line 94 at r1 (raw file):
Previously, knz (Raphael 'kena' Poss) wrote…
I think it would feel more natural to me if we saw the
ALTER TENANT ALL SET CLUSTER SETTING ... = true
inside the testdata withquery-sql-system
instead of as an override here.
Ditto for the others we'll be checking afterwards.
Discussed separately: Please link to #96512 in a comment
When that issue is resolved, we add a new datadriven directive that delays the initialization of the secondary tenant until some system tenant SQL has completed executing.
08e901a
to
8bd0c81
Compare
7235021
to
f3b62f9
Compare
8d8251f
to
948bfa7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
bors r=knz
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dhartunian, @ecwall, @knz, and @msbutler)
pkg/rpc/auth_tenant.go
line 174 at r1 (raw file):
Previously, knz (Raphael 'kena' Poss) wrote…
Please spell out your thinking/vision about the organization of this code in the commit message.
Good call, added some words.
948bfa7
to
46adf8b
Compare
Canceled. |
bors r+ |
Build failed (retrying...): |
Build failed (retrying...): |
Build failed (retrying...): |
Build failed (retrying...): |
Build failed (retrying...): |
We might be checking in something flakey here (build), or already did in a previous PR: |
bors r- |
Canceled. |
This patch wires up the tenant capabilities subsystem during server startup. This includes both starting the subsystem and adding a handle to the Authorizer so that it can be used by GRPC interceptors to perform authorization checks. While working through this patch, I realized we can't instantiate an Authorizer with a handle to the tenant capability state (tenantcapabilities.Reader). This is because the afformentioned GRPC setup happens early on during the Server startup process, and at that point we do not have access to the dependencies required to setup the capabilities Watcher (which is what provides the Reader interface to the Authorizer). To break this dependency cycle, we end up with an approach to lazily bind the Reader to the Authorizer. With the Authorizer wired up, we can now start using it to perform capability checks for incoming tenant requests. Currently, this is limited to batch requests. Note that the Authorizer is only responsible for performing capability checks -- other authorization checks, such as bounds checks, continue to happen outside of Authorizer. This patch also adds a datadriven framework to test tenant capabilities end to end. The nice thing about it is it hides the asynchronous nature of capability checks from test writers. The hope is that we'll be able to extend this as we add more capabilities. Informs cockroachdb#94643 Release note: None
46adf8b
to
724598b
Compare
Wasn't a flake -- this skewed with #96659. Thanks for catching. bors r+ |
Build succeeded: |
This patch wires up the tenant capabilities subsystem during server
startup. This includes both starting the subsystem and adding a handle
to the Authorizer so that it can be used by GRPC interceptors to perform
authorization checks.
While working through this patch, I realized we can't instantiate an
Authorizer with a handle to the tenant capability state
(tenantcapabilities.Reader). This is because the afformentioned GRPC
setup happens early on during the Server startup process, and at that
point we do not have access to the dependencies required to setup the
capabilities Watcher (which is what provides the Reader interface to
the Authorizer). To break this dependency cycle, we end up with an
approach to lazily bind the Reader to the Authorizer.
This patch also adds a datadriven framework to test tenant capabilities
end to end. The nice thing about it is it hides the asynchronous nature
of capability checks from test writers. The hope is that we'll be able
to extend this as we add more capabilities.
Informs #94643
Release note: None