-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cli,security: add --cert-principal-map #45819
cli,security: add --cert-principal-map #45819
Conversation
@aaron-crl, @knz this works, though I'm not convinced of the use of a regexp for this purpose. The regexp is difficult to specify on the The This PR needs more testing, and an end-to-end test. I did verify manually this works end-to-end. |
460f398
to
e02c664
Compare
Another question: currently we "fail open" in the sense that if the regexp doesn't match, we return the full |
e02c664
to
ffcb62b
Compare
This seems incorrect. I'm glad the PR is out but I won't be able to look at it until tomorrow. also, Aaron is OOO until Monday. I see the wisdom in getting this done for 20.1 but I do not see wisdom in rushing it through the weekend. |
Looking forward to your sage guidance on this.
Completely agreed. The creation of this PR occupied my trans-atlantic flight. The review can wait until next week. I would not be disappointed if we discover a fatal flaw in this approach. We will have learned something. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
I just realized while reviewing this that you've kept the extraction working on CommonName exclusively. That's a problem: ACM does not enable customization of CN. The field we want to read the username from is SubjectAlternateName(s) - a list of strings. So it's not just a regexp that's needed, also a field name.
I'd argue for
--cert-name-pattern=<fieldname>:<regexp>
with a default ofCN:(.*)
. -
the rest of the change looks good modulo the previous discussion on the fallback. IMHO if the regexp doesn't match this should be rejected with an error., instead of producing the CN field silently. There are multiple reasons for that:
- it's more likely than not that the user made a mistake with their regexp (especially given your concern with escaping).
- by silently extending to the entire field's value, we may be extending the set of allowed users beyond what the DBA intends
- the need to match a prefix / sub-part of the string is even more important when working off SubjectAlternativeName.
Reviewed 9 of 15 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @knz, and @petermattis)
pkg/cli/cliflags/flags.go, line 638 at r1 (raw file):
must specify exactly one parenthesized subexpression which becomes the user if the regular expression matches a certificate's CommonName. If the regular expression does not match, the entire CommonName field is used for the user.
Something about which TLS field is looked at is needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 15 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @knz, and @petermattis)
I thought ACM did allow customization of CN in so far as you could specify any subdomain of a domain you owned. And doesn't ACM require that SubjectAlternateName is also a DNS name you own. If it allowed arbitrary bits in there that seems like it could be a problem. I'm not sure about restricting the pattern to CN or SAN. What if the cert contains multiple SubjectAlternateNames? Should we return the first SAN that matches the regexp? Perhaps we can have that behavior in general. Match the regexp against CN. If it matches, return the submatch. If it doesn't match, return the first match in the list of SAN(s). If none match, return an error.
Ack. I'll make this change once we reach agreement on the field matching above. |
Ok so first thing first I was mistaken in one area in my previous comment - ACM does indeed populate the main (first) target of the cert in CN. I had misread the output of my commands. This means that this PR as-is is already moving the needle forward.
The requirement on ownership is only mandated when you use the ACM itself as CA. When using a private CA, there's no check.
I agree but that's a red herring. If you define a cert for, say
(Also this is only needed for ACM's public CA, see above)
yes! This is good and a feature - I'd definitely like my cert to be valid for both
I suppose we could yes.
I like that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that while this solution may solve this particular need, I feel it will further complicate our certificate user story. We are overloading the use of the node certs in ways that place additional external configuration and development burden on users and I'd really like to get away from that. I may be missing some context on the current design pattern so please forgive me (and point me to the right docs) if I am. I'm working on an RFC for an improved certificate story presently, and would rather push changes to the certificate and trust management mechanisms out until we have a chance to look at it more comprehensively than make feature changes that may be hard to support in retrospect or become inconsistent with a more robust IAM approach.
I’ve shared my working notes on certificate stories with both of you internally and would welcome feedback.
@aaron-crl Let's touch base about this tomorrow. I might have invoked Cunningham's Law with this PR: "the best way to get the right answer on the internet is not to ask a question; it's to post the wrong answer." |
Besides, I have cross-checked this PR and the proposed doc and they seem in agreement with each other. Maybe the only step needed is to make the link and overlap more explicit, e.g. in the release note. |
@aaron-crl gently explained that the approach in this PR opens the door to privilege escalation in certain scenarios. Doing so does require an attacker to be able to mint a certificate, but that is feasible if the certs are being provided by a domain certificate authority. His broad advice (@aaron-crl please correct me if I'm misstating this) is that we shouldn't be abusing the certs created by CAs used for domain certs for node identity. Rather, we should be considering this node identity and authorization an internal crypto primitive handled by a cluster-local CA. (This is essentially option 2 in his doc). I need to think this through a bit more, and possibly come up with more bad ideas for @aaron-crl to shoot down. I do really like the cluster-local CA idea, but it isn't feasible until 20.2. |
Understood for the node identity. What of SQL client principals? |
This is potentially susceptible to the same type of broken auth; I would strongly discourage this pattern in general. For example: Normal user certs would be of the form An attacker that gained access to a dev environment and had the ability to create a service named This presents a difficult to contextualize risk for operators where they may not realize how certs, wildcards, and the database interact and inadvertently create an insecure deployment. |
Would this be safe if we always stripped off a fixed suffix? For example, what if we stripped off the suffix |
It would certainly halt the prefix attack though I could see cases where it would lead to more complex but successful patterns where attackers might still gain access to reserved accounts (happy to chat about these offline). This raises the cost for an attacker but also limits the ways in which a user may specify their cert principles. I would avoid approaches that extract AuthN primitives from CN and SAN substrings. It adds complexity and reinforces our overloaded certificate-use practice by forcing the security principle in the cert to present differently in different circumstances. This undermines the flexibility of the credential and strength of the issuing trust authority. |
So let me stretch your viewpoint in a tangential technical direction, to make sure I understand. Today a SQL client presents a TLS client cert and also a principal name in the first payload packet after the TLS handshake has completed. We verify 3 things:
My questions:
I guess what I'm trying to get from this convo is "what happens with our current cert-based authn story" and in particular what can we offer to users who prefer to NOT use passwords for authentication. |
We should continue to extract the principal name and check it unless the operator has disabled certificate AuthN. This approach should not change current function from a user AuthN perspective except that it permits/requires the user's CA public key be distinct from the database's internal communications and service ones. I'm advocating against extracting substrings from these principals only. Certificate specified security principals should match database security principals. |
This is the part I'm still struggling to internalize. If the cert says the principle is |
I'm not sure I understand here.
Where would we do this, would this be a flag for every potential node that would connect to a running node? 🤔 |
Yes, that is what I was imagining. The general form of this would be to have a table mapping cert principle name to DB principle name. Something like: |
If I'm understanding correctly, this moves away from parsing a value out of the field and instead moves to whitelisting the principal names for all nodes at node start time? |
Ah, good catch, this is one of the advantages of using SAN over CN (it's a typed value). Given our current intended use (node principal names for FQDN certs issued to nodes). It's probably safe to just use the strings from DNSNames. We can revisit if we change this usage pattern. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given our current intended use (node principal names for FQDN certs issued to nodes). It's probably safe to just use the strings from DNSNames. We can revisit if we change this usage pattern.
👍
I'm going to flesh out the testing. I'll ping this PR again when it is ready for another look.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @knz, and @petermattis)
c25278e
to
1693062
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fleshed out the testing, and added an end-to-end test. This is ready for another look.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @knz, and @petermattis)
pkg/cli/interactive_tests/test_cert_advisory_validation.tcl, line 65 at r5 (raw file):
eexpect "interrupted" expect $prompt end_test
I'm currently only checking that --cert-principal-map
works for the node cert, but there is a common code path used for other certs. Is it worth checking that it works for client certs as well?
pkg/security/certs.go, line 269 at r5 (raw file):
} nodeUser := envutil.EnvOrDefaultString("COCKROACH_CERT_NODE_USER", "node")
This was added to make testing easier. See cli/interactive_tests
.
If it's validating client certs, then it should be able to accept client CN and SAN patterns. This would only be gated by what client principle names we permit AND what field type the SAN carries that value in. My expectation would be that this could then include EmailAddress and IP but not URI but I don't have data to support this. It might be worth asking our customer facing teams if they have exemplars. If my assumption holds here, I would recommend string compares against those two fields too then wait until we separate cert usages before revisiting. |
e4fe2f6
to
2154391
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @knz, and @petermattis)
pkg/security/certificate_loader.go, line 451 at r5 (raw file):
// Check Subject Common Name. // // TODO(peter): Test this function when a certPrincipalMap is present.
I just noticed I still have to handle these TODOs. I'll be tackling them tomorrow.
2154391
to
ca18a25
Compare
ca18a25
to
2f163a0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides one comment with @knz regarding the flag help text, I think this is ready to go.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl and @knz)
pkg/cli/cliflags/flags.go, line 638 at r1 (raw file):
Previously, knz (kena) wrote…
Something about which TLS field is looked at is needed here.
I still need to address this.
pkg/security/certificate_loader.go, line 451 at r5 (raw file):
Previously, petermattis (Peter Mattis) wrote…
I just noticed I still have to handle these TODOs. I'll be tackling them tomorrow.
These have been tackled. See TestManagerWithPrincipalMap
.
2f163a0
to
472aaf8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is ready to go.
With some tech nits below, I agree.
Reviewed 4 of 13 files at r4, 7 of 7 files at r6.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz and @petermattis)
pkg/cli/context.go, line 120 at r6 (raw file):
startCtx.serverInsecure = baseCfg.Insecure startCtx.serverSSLCertsDir = base.DefaultCertsDirectory startCtx.serverListenAddr = ""
Add the default for the new field here too. (= nil
I think)
pkg/cli/interactive_tests/test_cert_advisory_validation.tcl, line 65 at r5 (raw file):
Previously, petermattis (Peter Mattis) wrote…
I'm currently only checking that
--cert-principal-map
works for the node cert, but there is a common code path used for other certs. Is it worth checking that it works for client certs as well?
I think this is OK.
pkg/rpc/context.go, line 131 at r6 (raw file):
// NodeUser. This is not a security concern, as RootUser has access to // read and write all data, merely good hygiene. For example, there is // no reason to permit the root user to send raw Raft RPCs.
@aaron-crl for bookkeeping please add this TODO as one of the items to work on in 20.2: "Separate the credentials needed for node-node RPCs from those needed for client-node RPCs."
pkg/rpc/context.go, line 132 at r6 (raw file):
// read and write all data, merely good hygiene. For example, there is // no reason to permit the root user to send raw Raft RPCs. var ok bool
@petermattis maybe share this loop and the check using a single exported function in the security
package, shared between here and the UserAuthCertHook
function.
pkg/security/auth.go, line 38 at r6 (raw file):
type UserAuthHook func(string, bool) error // SetCertPrincipalMap TODO(peter)
add docstring for the function.
pkg/security/auth.go, line 122 at r6 (raw file):
// except if the certificate user is NodeUser, which is allowed to // act on behalf of all other users. var ok bool
... here (see my comment above)
pkg/security/certificate_manager_test.go, line 77 at r6 (raw file):
defer func() { _ = security.SetCertPrincipalMap(nil) }() defer func() { _ = os.Setenv("COCKROACH_CERT_NODE_USER", security.NodeUser)
don't you need to also defer envutil.ClearEnvCache()
here? we're not re-reading the env var otherwise in other tests run in the same package.
Aside from @knz 's nits, this looks good to me. 👍 |
472aaf8
to
1dceca7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz)
pkg/cli/context.go, line 120 at r6 (raw file):
Previously, knz (kena) wrote…
Add the default for the new field here too. (
= nil
I think)
Done.
pkg/cli/cliflags/flags.go, line 638 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
I still need to address this.
Done.
pkg/rpc/context.go, line 132 at r6 (raw file):
Previously, knz (kena) wrote…
@petermattis maybe share this loop and the check using a single exported function in the
security
package, shared between here and theUserAuthCertHook
function.
I was wondering if that repetition would bother you too much. Done.
pkg/security/auth.go, line 38 at r6 (raw file):
Previously, knz (kena) wrote…
add docstring for the function.
Done.
pkg/security/auth.go, line 122 at r6 (raw file):
Previously, knz (kena) wrote…
... here (see my comment above)
Done.
pkg/security/certificate_manager_test.go, line 77 at r6 (raw file):
Previously, knz (kena) wrote…
don't you need to also defer
envutil.ClearEnvCache()
here? we're not re-reading the env var otherwise in other tests run in the same package.
Yep. That's why tests were failing. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 7 of 7 files at r7.
Reviewable status: complete! 1 of 0 LGTMs obtained
Add a `--cert-principal-map` flag to `start` which specifies a comma separated list of "cert-principal:db-principal" mappings which are used to map the principals found in certificates to DB principals. Renamed `security.GetCertificateUser` to `security.GetCertificateUsers` and changed it to return a list of principals in the servers. This list contains the Subject CommonName and all of the DNS SubjectAlternateNames. Changed the two users of this function to search the list for the user they are authenticating. Added `COCKROACH_CERT_NODE_USER` env var which controls the principal to place in node certs created by `cert create-node`. This is intended for testing purposes only. Fixes cockroachdb#28408 Release note (cli change): Add `--cert-principal-map` flag to `start` which specifies a comma separated list of "cert-principal:db-principal" mappings which are used to map the principals found in certificates to DB principals. This allows the usage of "node" and "root" certificates where the common name contains dots or a host name, which in turn allows such certificates to be generated by certificate authorities which place restrictions on the contents of the common name field. Release note (security update): Allow client and node certificates to specify the principal in either the Subject CommonName field or the SubjectAlternateNames field. Previously, the principal could only be specified in the Subject CommonName field. Release justification: Category 4. Low risk, high reward change to existing functionality. Should only affect users who specify `--cert-principal-map`, and the change itself is limited in scope.
1dceca7
to
f70f51a
Compare
TFTR! CI indicates there was a missing error check. Should be fixed now. |
bors r+ |
Build succeeded |
Add a
--cert-principal-map
flag tostart
which specifies a commaseparated list of "cert-principal:db-principal" mappings which are used
to map the principals found in certificates to DB principals.
Renamed
security.GetCertificateUser
tosecurity.GetCertificateUsers
and changed it to return a list of principals in the servers. This list
contains the Subject CommonName and all of the DNS
SubjectAlternateNames. Changed the two users of this function to search
the list for the user they are authenticating.
Added
COCKROACH_CERT_NODE_USER
env var which controls the principal toplace in node certs created by
cert create-node
. This is intended fortesting purposes only.
Fixes #28408
Release note (cli change): Add
--cert-principal-map
flag tostart
which specifies a comma separated list of "cert-principal:db-principal"
mappings which are used to map the principals found in certificates to
DB principals. This allows the usage of "node" and "root" certificates
where the common name contains dots or a host name, which in turn allows
such certificates to be generated by certificate authorities which place
restrictions on the contents of the common name field.
Release note (security update): Allow client and node certificates to
specify the principal in either the Subject CommonName field or the
SubjectAlternateNames field. Previously, the principal could only be
specified in the Subject CommonName field.
Release justification: Category 4. Low risk, high reward change to
existing functionality. Should only affect users who specify
--cert-principal-map
, and the change itself is limited in scope.