Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CORE-2440 Add config for CRL and plumb it into seastar #18708

Merged
merged 4 commits into from
May 31, 2024

Conversation

oleiman
Copy link
Member

@oleiman oleiman commented May 29, 2024

This PR adds the ability to configure Certificate Revocation Lists in redpanda along with a variety of tests and
test infrastructure (the bulk of the change.

Closes CORE-2440

TODO:

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.1.x
  • v23.3.x
  • v23.2.x

Release Notes

Improvements

  • Add an optional TLS config field for CRL (Certificate Revocation List)

@oleiman oleiman self-assigned this May 29, 2024
@oleiman oleiman changed the title Configure CRLs for various things and plumb them into seastar Add config for CRL and plumb it into seastar May 29, 2024
@oleiman oleiman force-pushed the fips/core-2440/crl-base branch from 2c13f31 to 8d30485 Compare May 29, 2024 22:18
@oleiman oleiman requested review from michael-redpanda, a team and aanthony-rp and removed request for a team and michael-redpanda May 29, 2024 22:19
And push it down to the creds builder as appropriate

NOTE: Today, the CRL is optional. Per FIPS regulations, the ability to
require a CRL (via additional config) will be added in a future PR.

Also fixes up some unit and fixture tests

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
@oleiman oleiman force-pushed the fips/core-2440/crl-base branch from 8d30485 to 494d19e Compare May 29, 2024 23:20
@vbotbuildovich
Copy link
Collaborator

vbotbuildovich commented May 30, 2024

new failures in https://buildkite.com/redpanda/redpanda/builds/49659#018fc6e4-5302-4341-ba06-b84f0d5f2da8:

"rptest.tests.tls_metrics_test.TLSMetricsTestChain.test_cert_chain_metrics"

new failures in https://buildkite.com/redpanda/redpanda/builds/49659#018fc6fd-b8b5-44de-989e-de752717dadc:

"rptest.tests.tls_metrics_test.TLSMetricsTestChain.test_cert_chain_metrics"

@vbotbuildovich
Copy link
Collaborator

oleiman added 3 commits May 29, 2024 21:29
Provides TLSCertManager interface for revoking certs along with various
plumbing. Notably, the CertificateAuthority tuple now looks like this:

namedtuple("CertificateAuthority", ["cfg", "key", "crt", "crl"])

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Note that the CRL is written out in a helper function. This allows
pushing new CRLs into a live cluster on a per-node basis. Very useful.

RedpandaService.write_crl_file(
    self, node: ClusterNode, ca: tls.CertificateAuthority)

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Tests effect of revoking certs for:

- kafka API mTLS
- internal kafka client mTLS for SR, implicitly PP, others
- listener TLS (i.e. https) for PP, implicitly covers SR, others
- internal RPC

Also tests (very loosely) the effect of pushing a non-cogent CRL to the
cluster. This breaks everything and we're not able to provide much context
at any API layer, so the test just confirms that some API is broken and
moves on. These behaviors will be fleshed out in a subsequent PR.

Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
@oleiman oleiman force-pushed the fips/core-2440/crl-base branch from 494d19e to 7d87dd1 Compare May 30, 2024 04:30
@oleiman
Copy link
Member Author

oleiman commented May 30, 2024

force push to fix up chain CA CRL logic (attach the CRL from the last signing CA in the chain)

@oleiman
Copy link
Member Author

oleiman commented May 30, 2024

/cdt

Copy link
Contributor

@michael-redpanda michael-redpanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥

Are the todos for this PR or for a future issue?

@@ -403,10 +424,20 @@ def create_cert(self,
self.certs[name] = cert
return cert

# TODO(oren): reasons enum
def revoke_cert(self, crt: Certificate, reason: str = "unspecified"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from https://www.openssl.org/docs/man3.0/man1/openssl-ca.html:

  • unspecified
  • keyCompromise
  • CACompromise
  • affiliationChanged
  • superseded
  • cessationOfOperation
  • certificateHold
  • removeFromCRL

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, ty. Assuming my CDT run passes, I'll add these in a follow-on PR.

Comment on lines +435 to +438
self._exec(
f'openssl crl -inform PEM -text -noout -in {crl} -out {revoked}')
with open(revoked, 'r') as f:
self._logger.debug(f"\n{f.read()}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nifty, so this will just list the certs that were revoked? Like their serial numbers?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's actually quite nice. Like this:

Certificate Revocation List (CRL):
        Version 2 (0x1)
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: O = Redpanda, CN = Redpanda Test CA
        Last Update: May 30 18:58:48 2024 GMT
        Next Update: May 30 18:58:48 2025 GMT
        CRL extensions:
            X509v3 CRL Number:
                1
Revoked Certificates:
    Serial Number: 01
        Revocation Date: May 30 18:58:48 2024 GMT
        CRL entry extensions:
            X509v3 CRL Reason Code:
                Unspecified
    Signature Algorithm: sha256WithRSAEncryption
    Signature Value:
        70:ce:93:90:be:20:3e:de:48:0f:e7:e0:b0:1a:b8:c2:3b:95:
        f6:4a:a2:47:6f:ff:8c:5e:1e:de:8a:7e:50:47:e2:69:f4:0b:
        7b:4f:96:36:80:83:9b:3e:29:12:67:2e:db:37:93:04:c2:7a:
        ac:e9:57:6b:db:b4:f0:89:cb:56:cc:5a:cd:06:5a:1f:4f:86:
        73:14:23:bf:88:0b:7b:2e:91:2d:3b:aa:02:d7:2c:aa:52:06:
        15:87:de:a0:41:eb:b8:26:6d:16:c2:53:88:26:89:f9:5a:0d:
        3a:c2:91:a5:9b:5a:ab:c7:4e:5f:0a:14:b2:76:64:68:47:b9:
        b4:4b:45:73:38:65:1a:2d:93:8e:1a:d3:06:46:51:a5:69:e6:
        48:05:d1:a7:8a:d3:e3:c6:ee:d7:38:a6:cb:28:5c:e2:01:31:
        2d:81:d8:62:d3:8c:67:5e:cf:df:b7:58:4e:69:8b:23:64:cd:
        da:e6:00:e6:d3:98:8e:0e:e0:d5:8c:d4:ce:97:a3:19:0a:3d:
        c1:e3:6f:89:b3:e5:c8:0c:9b:c3:e3:9b:58:5d:88:ca:a2:68:
        fc:1d:d3:3d:f7:5e:55:33:be:6e:b1:6a:ae:b6:ee:20:12:2e:
        8d:2f:6c:53:54:c6:86:f0:1c:71:d3:af:1c:34:50:b6:5b:c1:
        de:b4:51:3a

tests/rptest/services/tls.py Show resolved Hide resolved
Comment on lines +276 to +279
try:
self.rpk.list_schemas()
except:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm interested in knowing how it broke everything =)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I will document it on the ticket I think. Nothing worth reporting in tests since we'll address it all next week.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know what, I think this is basically all we get

DEBUG 2024-05-30 21:02:59,542 [shard 0:main] rpc - transport.cc:400 - could not parse header from server: {host: docker-rp-3, port: 33145}
DEBUG 2024-05-30 21:02:59,542 [shard 0:main] rpc - transport.cc:400 - could not parse header from server: {host: docker-rp-3, port: 33145}
DEBUG 2024-05-30 21:02:59,542 [shard 0:main] rpc - transport.cc:400 - could not parse header from server: {host: docker-rp-2, port: 33145}
DEBUG 2024-05-30 21:02:59,542 [shard 1:main] rpc - transport.cc:400 - could not parse header from server: {host: docker-rp-3, port: 33145}

With a bit of

DEBUG 2024-05-30 21:02:59,542 [shard 0:main] rpc - transport.cc:471 - RPC Client to {host: docker-rp-2, port: 33145} probes: { requests_sent: 36, requests_pending: 0, requests_completed: 36, request_errors: 0, request_timeouts: 0, in_bytes: 10335, out_bytes: 13761, connects: 1, connections: 0, connection_errors: 0, read_dispatch_errors: 0, corrupted_headers: 1, server_correlation_errors: 0, client_correlation_errors: 0, requests_blocked_memory: 0 }

^^ note the corrupted_headers field. Surprisingly well behaved (too much so probably), but nothing works afaict.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is testing cogency anyway...definition I'm picking up is more like "are all the certs listed in the CRL within its scope". So e.g.

  • The scope of my CRL is "certs that were revoked because the CA was compromised"
  • But there's a cert in the CRL that has "unspecified" reason.

Therefore, my CRL is not "cogent"

@oleiman
Copy link
Member Author

oleiman commented May 30, 2024

Are the todos for this PR or for a future issue?

Yeah, I'm gonna punt. I don't think we gain much in terms of overall verification by generating real CRLs in cmake. Probably worth doing, but looking at redpanda_sign_cert I'm guessing it won't be a quick thing :(

@oleiman oleiman changed the title Add config for CRL and plumb it into seastar CORE-2440 Add config for CRL and plumb it into seastar May 30, 2024
@oleiman
Copy link
Member Author

oleiman commented May 30, 2024

Are the todos for this PR or for a future issue?

Yeah, I'm gonna punt. I don't think we gain much in terms of overall verification by generating real CRLs in cmake. Probably worth doing, but looking at redpanda_sign_cert I'm guessing it won't be a quick thing :(

https://redpandadata.atlassian.net/browse/CORE-3144 cc @michael-redpanda

@oleiman
Copy link
Member Author

oleiman commented May 31, 2024

CDT Failures original build:

@oleiman
Copy link
Member Author

oleiman commented May 31, 2024

/cdt
tests/rptest/tests/e2e_shadow_indexing_test.py::EndToEndThrottlingTest.test_throttling
dt-repeat=5

@oleiman
Copy link
Member Author

oleiman commented May 31, 2024

/cdt
tests/rptest/tests/random_node_operations_test.py::RandomNodeOperationsTest.test_node_operations

Copy link
Contributor

@aanthony-rp aanthony-rp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I see that the bulk of changes are in tls_config.h with a good deal of tests. Looks clean, I don't really know the context of the rest of the system and what you might be missing, but your test coverage looks good.

@oleiman
Copy link
Member Author

oleiman commented May 31, 2024

OK I see that the bulk of changes are in tls_config.h with a good deal of tests. Looks clean, I don't really know the context of the rest of the system and what you might be missing, but your test coverage looks good.

Yep, that's the general idea. In my view, most of the risk here is to the integration tests themselves, and I think we've covered our bases there.

@oleiman oleiman merged commit 8e7406a into redpanda-data:dev May 31, 2024
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants