feat: [Zero-Downtime] Runtime-Watcher TLS configuration management #1507

Open
3 of 5 tasks
Tomasz-Smelcerz-SAP opened this issue Apr 29, 2024 · 2 comments · May be fixed by #1871
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments


Tomasz-Smelcerz-SAP commented Apr 29, 2024

Description

Zero-Downtime: Implement the Runtime-Watcher TLS configuration renewal logic.

The logic is based on the POC results, but it is simplified: the additional secret object is removed, and the Istio Gateway secret plays the key role in the migration process instead.
This is based on the following observation:
We must adjust the SKR watcher client TLS configuration if and ONLY IF the Istio Gateway TLS configuration changes.
This has an impact on the design of the zero-downtime certificate rotation solution.
The system is designed with two independent components that run asynchronously to each other:

  • The first one observes the rotation of the "Root" certificates in KCP and manages the Istio Gateway secret accordingly. It is not related to any particular Kyma or SKR.
  • The second one manages the SKR watcher client TLS configuration. It generates/updates the relevant secrets in KCP and SKR. It is coupled to the reconciliation of the Kyma CR.

Note: This issue describes the second component

Responsibility 1: Bootstrap

  1. No Watcher TLS secret exists in the KCP
  2. Wait until the Istio Gateway secret is available in the KCP
  3. Create the Watcher TLS certificate in the KCP (using a Certificate CR; the Cert Manager then creates the secret)
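
The bootstrap steps can be read as a small decision function. The following is only a minimal sketch under that reading: `BootstrapAction`, `DecideBootstrap` and the constant names are hypothetical and not taken from the existing code base, and the two boolean inputs stand in for real lookups of the secrets in the KCP.

```go
package main

// BootstrapAction is a hypothetical name for the outcome of the bootstrap check.
type BootstrapAction int

const (
	// NoAction: a Watcher TLS secret already exists, so bootstrap does not apply.
	NoAction BootstrapAction = iota
	// WaitForGateway: the Istio Gateway secret is not available yet; retry later.
	WaitForGateway
	// CreateCertificate: create the Certificate CR and let the Cert Manager issue the secret.
	CreateCertificate
)

// DecideBootstrap maps the presence of the two KCP secrets to the next step.
// Both parameters are assumptions standing in for real cluster lookups.
func DecideBootstrap(watcherSecretExists, gatewaySecretExists bool) BootstrapAction {
	if watcherSecretExists {
		return NoAction
	}
	if !gatewaySecretExists {
		return WaitForGateway
	}
	return CreateCertificate
}
```

In a reconciliation loop, `WaitForGateway` would typically translate into a requeue rather than a blocking wait.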

Responsibility 2: Migration

  1. When both of the following happen:
    • Root certificate is more recent than the Watcher TLS secret in the KCP
    • Istio-Gateway secret is more recent than the Watcher TLS secret in the KCP
  2. Re-generate the Watcher TLS certificate in the KCP (already implemented, but triggered differently)

Responsibility 3: Synchronization

  1. When any of the following conditions occurs:
    • Watcher TLS configuration is missing in the SKR
    • Watcher TLS secret in the KCP is more recent than the corresponding secret in the SKR
    • Istio-Gateway secret is more recent than the corresponding secret in the SKR
  2. Then generate the Watcher TLS configuration secret in the SKR, taking the tls.crt and tls.key from the corresponding secret in the KCP, but the ca.crt data from the Istio-Gateway secret

Note: Instead of checking the conditions in step 1, we can also simply sync (patch) the data with every reconciliation

Implementation Notes:

  • This logic is tightly coupled with any given SKR, so it looks like it can be implemented as part of the current Kyma reconciliation loop.
  • The code for Watcher certificates and secrets generation/renewal and for synchronization of these to the SKR is already implemented. It must be adjusted to account for the new requirements.

Reasons

We need a robust, zero-downtime solution for the Watcher TLS certificate rotation.

Acceptance Criteria

  • Create a follow-up issue to provide a metric in the SKR that reports the client certificate used for watcher. Then in Plutono we can see how the migration process works.
  • Implement the solution along with necessary unit and integration tests
  • Update the documentation
  • Manually test the rotation logic
  • Update the existing e2e test and add a new one if necessary.

Feature Testing

Testing approach

unit tests, integration tests, e2e test(s)
Existing tests:

Attachments

watcher-certificate-migration3

Related Issues

#1430

@Tomasz-Smelcerz-SAP Tomasz-Smelcerz-SAP added the kind/feature Categorizes issue or PR as related to a new feature. label Apr 29, 2024
@LeelaChacha LeelaChacha self-assigned this Sep 4, 2024
LeelaChacha added a commit to LeelaChacha/kyma-lifecycle-manager that referenced this issue Sep 29, 2024
LeelaChacha commented

Blocked by: #1890

LeelaChacha commented

@Tomasz-Smelcerz-SAP is this issue still relevant? I believe that it has been covered in #1890.
