-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cli/flags,config: new flag for tenant KV listen addr #50503
cli/flags,config: new flag for tenant KV listen addr #50503
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Let's also write the listener files for the tenant port (if the flag was changed):
https://github.com/cockroachdb/cockroach/blob/master/pkg/server/server.go#L769-L775
Is it OK for me to start working off this branch or do you anticipate more changes to be piled on?
Reviewed 21 of 21 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/cli/cliflags/flags.go, line 534 at r1 (raw file):
To actually use separate bindings, it is recommended to specify both flags and use a different port number via --listen-addr, for example --tenant-addr=:26257 --listen-addr=:26258. Ensure that --join is set
To make our life a little easier, I was thinking to suggest 36257 for the tenant.
pkg/cli/cliflags/flags.go, line 545 at r1 (raw file):
TenantAdvertiseAddr = FlagInfo{ Name: "advertise-tenant-addr",
nit: you write --advertise-tenant-addr
in a few places, can those use "--"+TenantAdvertiseAddr.Name
pkg/roachpb/metadata.go, line 483 at r1 (raw file):
} // CheckedTenandAddress returns the value of TenantAddress if set. If not,
Tenant here and in method name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PS this is still marked as draft, not sure it's intentional
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
Test failure for your convenience:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's also write the listener files for the tenant port (if the flag was changed):
Yeah, we're writing these. This is tested by TestListenerFileCreation
.
Test failure for your convenience:
Fixed.
PS this is still marked as draft, not sure it's intentional
Yes, that was intentional. It's no longer a draft and gets all the way to starting a second gRPC server.
I updated this with a pair of follow on commits that changes the tenant proxy to route tenant KV traffic to this new tenant KV address when a node is configured with one. We now have unit tests and roachtests that exercise this logic.
PTAL.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)
pkg/cli/cliflags/flags.go, line 534 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
To make our life a little easier, I was thinking to suggest 36257 for the tenant.
I like the idea of using 36257 in our deployments, but making that the default port iff --tenant-addr
is specified but its port is omitted is tricky and not obviously worth the complexity, so I'll leave this as-is for now.
pkg/cli/cliflags/flags.go, line 545 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
nit: you write
--advertise-tenant-addr
in a few places, can those use"--"+TenantAdvertiseAddr.Name
I only see one instance of this outside of comments:
return errors.Wrap(err, "invalid --advertise-tenant-addr")
Is that what you're referring to?
pkg/roachpb/metadata.go, line 483 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Tenant here and in method name
Done.
4655943
to
2fb707e
Compare
Huh, Github isn't picking up the changes on this branch. Looks like there's an ongoing incident: https://www.githubstatus.com/. |
557be8d
to
db74c96
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding this:
The
--tenant-addr
and--advertise-tenant-addr
flags are inspired by the--sql-addr
and--advertise-sql-addr
flags. The two pairs have similar fallback behavior, as is discussed in depth in 6cb71bf.
-
We found out that
--advertise-sql-addr
is not so useful after all. In contrast with the RPC addresses it so happens that crdb does not need to self-connect over public/private network boundaries, so that distinction was over-engineered (and is turning out unnecesary). -
I wonder here if
--advertise-tenant-addr
is actually necessary. This would only make sense if you had the following combination of design requirements:
- a tenant SQL server needs to run in a separate DC / internet region than the KV server, so it needs to know the public address of the KV server; and
- a tenant SQL server must discover the address of the KV server automatically by using some advertisement information from the KV server. (If there is no need for auto-discovery, there's no need to distinguish addr vs advertised addr)
The first requirement was not on planning for CC free tier, as far as I know; but even then, the second requirement is completely non-sensical with the KV/SQL disctinction, and therefore I think it makes no sense whatsoever to try to develop an "advertised addr".
Unless I misunderstood something?
Reviewed 1 of 21 files at r1, 3 of 16 files at r2, 1 of 5 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)
Good point, I hear what you're saying. I added the flag primarily to maintain consistency between this pair of flags and the the
You are correct that the first requirement is not applicable to CC free tier. As far as I know, we don't plan to separate the public address of these KV servers from their private address. What I'm less sure about is whether there will ever be reason to make such distinctions. We do have aspirations to provide this infrastructure to others, so I don't think we should limit ourselves entirely to just how we plan to deploy the thing. As for the second requirement, tenant SQL servers do need to discover the address of the KV servers automatically. This is why these advertised addrs are stored on NodeDescriptors that are propagated to SQL servers through the tenant proxy. I might be misunderstanding what you mean with this point. |
489abb1
to
13bb1d7
Compare
That is ... odd. Are you telling me that a node descriptor arrives to the SQL server from the client-side proxy? How does the proxy even learn of the node descriptor then? |
13bb1d7
to
961cec5
Compare
SQL processes are started with a set of seed KV addresses, passed in through the |
I also find the "proxy" naming for this subsystem a little unfortunate due to the ambiguity. It's not really a proxy, anyway - no stand-alone component and it's not like it's passing anything "through" really, it's merely subscribing to updates from KV. It is more of a "connector" or "gateway" IMO. For example, we could rename the proxy to |
As of this commit, tenants would use their proper tenant client certs if it weren't for a manual override that was added. This override exists because the KV layer can not yet authenticate tenant client certs (this will change soon, in a follow-up to cockroachdb#50503). However, uncommenting both the override and the hack in `pkg/security/securitytest/test_certs/regenerate.sh` to make the tenant client certs match those used by the KV nodes gives early validation that this "will work" once the KV side plays ball. Touches cockroachdb#47898. Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I (locally) rebased #51498 on top of this and will see about adding the right certs on the server side.
Reviewed 13 of 16 files at r2, 6 of 6 files at r4, 5 of 5 files at r5, 1 of 1 files at r6.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/cli/cliflags/flags.go, line 534 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I like the idea of using 36257 in our deployments, but making that the default port iff
--tenant-addr
is specified but its port is omitted is tricky and not obviously worth the complexity, so I'll leave this as-is for now.
How about updating this to --listen-addr=:26257
and --tenant-addr=:36257
to match what we expect to be the canonical use?
pkg/cli/cliflags/flags.go, line 545 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I only see one instance of this outside of comments:
return errors.Wrap(err, "invalid --advertise-tenant-addr")
Is that what you're referring to?
Looks like it (based on my on grepping, the original comment is too long ago to remember)
pkg/cmd/roachtest/multitenant.go, line 28 at r6 (raw file):
// Don't bind to external interfaces when running locally. tenantAddr := ifLocal("127.0.0.1", "0.0.0.0") + ":0" c.Start(ctx, t, c.All(), startArgs("--args=--tenant-addr="+tenantAddr))
Mind adding a comment here that we're starting a tenant server on another port,
pkg/cmd/roachtest/multitenant.go, line 33 at r6 (raw file):
require.NoError(t, err) kvAddrs := c.ExternalAddr(ctx, c.All())
Shouldn't this be the tenant server address? I think as written the proxy will connect to the "wrong" port on KV (which will still work because of lack of the real auth).
pkg/server/testserver.go, line 793 at r5 (raw file):
} // ServingTenantAddr returns the server's Tenant address. Should be used by clients.
"by clients"?
I went to #50520 to educate myself. In light of that, I retract my comment on the "advertise flag being nonsensical". Then, for consistency with the other flags, you can keep it. So the code LGTM. That said I agree with Tobias that the word "proxy" is problematic. I like the proposals for "connector" and "discovery". |
This patch introduces the ability to split off the tenant KV server into a separate port, using the new command-line flag `--tenant-addr`. This is motivated primarily by security concerns. We plan to use different TLS configs for KV<->KV kv/replication/gossip traffic and SQL<->KV tenant kv traffic. The easiest way to facilitate this is to listen to tenant KV traffic on a separate port and separate gRPC server. The `--tenant-addr` and `--advertise-tenant-addr` flags are inspired by the `--sql-addr` and `--advertise-sql-addr` flags. The two pairs have similar fallback behavior, as is discussed in depth in 6cb71bf.
This commit hooks up the tenant proxy to route all KV traffic from a tenant SQL process to a node's tenant KV address, if on is configured. As of the previous commit, TestServer now listens to tenant KV traffic on a separate port (see `makeTestConfigFromParams`) and sets a TenantAddress field in its NodeDescriptor, so with this change, any logic test that uses `StartTenant` will now test this behavior. As a sanity check, if we don't set up a second gRPC server in `startListenRPCAndSQL`, all 3node-tenant logictests fail.
This commit configures the KV cluster in the `acceptance/multitenant` roachtest to use a separate tenant address, which instructs the nodes in the cluster to split off a second gRPC server on port 36257.
Removes some room for inconsistency.
961cec5
to
eaa01fd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
That said I agree with Tobias that the word "proxy" is problematic. I like the proposals for "connector" and "discovery".
I empathize with the confusion. I'll rename to "connector" in a follow on PR.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz and @tbg)
pkg/cli/cliflags/flags.go, line 534 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
How about updating this to
--listen-addr=:26257
and--tenant-addr=:36257
to match what we expect to be the canonical use?
Done.
pkg/cli/cliflags/flags.go, line 545 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Looks like it (based on my on grepping, the original comment is too long ago to remember)
I don't feel strongly for error strings, but done throughout addr_validation.go
.
pkg/cmd/roachtest/multitenant.go, line 28 at r6 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Mind adding a comment here that we're starting a tenant server on another port,
Done.
pkg/cmd/roachtest/multitenant.go, line 33 at r6 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Shouldn't this be the tenant server address? I think as written the proxy will connect to the "wrong" port on KV (which will still work because of lack of the real auth).
Good point, done.
pkg/server/testserver.go, line 793 at r5 (raw file):
Previously, tbg (Tobias Grieger) wrote…
"by clients"?
Done.
bors r+ |
Build failed (retrying...) |
Build succeeded |
As of this commit, tenants would use their proper tenant client certs if it weren't for a manual override that was added. This override exists because the KV layer can not yet authenticate tenant client certs (this will change soon, in a follow-up to cockroachdb#50503). However, uncommenting both the override and the hack in `pkg/security/securitytest/test_certs/regenerate.sh` to make the tenant client certs match those used by the KV nodes gives early validation that this "will work" once the KV side plays ball. Touches cockroachdb#47898. Release note: None
As of this commit, tenants would use their proper tenant client certs if it weren't for a manual override that was added. This override exists because the KV layer can not yet authenticate tenant client certs (this will change soon, in a follow-up to cockroachdb#50503). However, uncommenting both the override and the hack in `pkg/security/securitytest/test_certs/regenerate.sh` to make the tenant client certs match those used by the KV nodes gives early validation that this "will work" once the KV side plays ball. Touches cockroachdb#47898. Release note: None
As of this commit, tenants would use their proper tenant client certs if it weren't for a manual override that was added. This override exists because the KV layer can not yet authenticate tenant client certs (this will change soon, in a follow-up to cockroachdb#50503). However, uncommenting both the override and the hack in `pkg/security/securitytest/test_certs/regenerate.sh` to make the tenant client certs match those used by the KV nodes gives early validation that this "will work" once the KV side plays ball. Touches cockroachdb#47898. Release note: None
As of this commit, tenants would use their proper tenant client certs if it weren't for a manual override that was added. This override exists because the KV layer can not yet authenticate tenant client certs (this will change soon, in a follow-up to cockroachdb#50503). However, uncommenting both the override and the hack in `pkg/security/securitytest/test_certs/regenerate.sh` to make the tenant client certs match those used by the KV nodes gives early validation that this "will work" once the KV side plays ball. Touches cockroachdb#47898. Release note: None
As of this commit, tenants would use their proper tenant client certs if it weren't for a manual override that was added. This override exists because the KV layer can not yet authenticate tenant client certs (this will change soon, in a follow-up to cockroachdb#50503). However, uncommenting both the override and the hack in `pkg/security/securitytest/test_certs/regenerate.sh` to make the tenant client certs match those used by the KV nodes gives early validation that this "will work" once the KV side plays ball. Touches cockroachdb#47898. Release note: None
See discussion in cockroachdb#50503. We found "proxy" confusing, because the object was in the same process as the rest of the SQL-only process instead of run as a stand-alone component. "Connector" doesn't create the same confusion.
52599: kv/kvtenant: rename Proxy to Connector r=nvanbenschoten a=nvanbenschoten See discussion in #50503. We found "proxy" confusing, because the object was in the same process as the rest of the SQL-only process instead of run as a stand-alone component. "Connector" doesn't create the same confusion. Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
See discussion in cockroachdb#50503. We found "proxy" confusing, because the object was in the same process as the rest of the SQL-only process instead of run as a stand-alone component. "Connector" doesn't create the same confusion.
Fixes #51464.
This patch introduces the ability to split off the tenant KV server into a
separate port, using the new command-line flag
--tenant-addr
.This is motivated primarily by security concerns. We plan to use different
TLS configs for KV<->KV kv/replication/gossip traffic and SQL<->KV tenant
kv traffic. The easiest way to facilitate this is to listen to tenant KV
traffic on a separate port and separate gRPC server.
The
--tenant-addr
and--advertise-tenant-addr
flags are inspired by the--sql-addr
and--advertise-sql-addr
flags. The two pairs have similarfallback behavior, as is discussed in depth in 6cb71bf.