🌱 Add variable discovery to topology mutation proposal #7932

Merged
Expand Up @@ -10,17 +10,20 @@ Please note Runtime SDK is an advanced feature. If implemented incorrectly, a fa

## Introduction

The Topology Mutation Hooks are going to be called during each Cluster topology reconciliation. More specifically
we are going to call two different hooks for each reconciliation:
Three different hooks are called as part of Topology Mutation - two in the Cluster topology reconciler and one in the ClusterClass reconciler.

**Cluster topology reconciliation**
* **GeneratePatches**: GeneratePatches is responsible for generating patches for the entire Cluster topology.
* **ValidateTopology**: ValidateTopology is called after all patches have been applied and thus allows validating
the resulting objects.

**ClusterClass reconciliation**
* **DiscoverVariables**: DiscoverVariables is responsible for providing variable definitions for a specific external patch.

![Cluster topology reconciliation](../../../images/runtime-sdk-topology-mutation.png)

Please see the corresponding [CAEP](https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/proposals/20220330-topology-mutation-hook.md)
for additional background information.

## Inline vs. external patches

Expand All @@ -35,11 +38,121 @@ External patches have the following advantages:
* External patches can use external data (e.g. from cloud APIs) during patch generation.
* External patches can be easily reused across ClusterClasses.

## External variable definitions
The DiscoverVariables hook can be used to supply variable definitions for use in external patches. These variable definitions are added to
the status of any applicable ClusterClasses. Clusters using the ClusterClass can then set values for those variables.

### External variable discovery in the ClusterClass
External variable definitions are discovered by calling the DiscoverVariables runtime hook. This hook is called from the ClusterClass reconciler.
Once discovered, the variable definitions are validated and stored in the ClusterClass status.

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
# metadata
spec:
  # Inline variable definitions
  variables:
  # This variable is unique and can be accessed globally.
  - name: no-proxy
    required: true
    schema:
      openAPIV3Schema:
        type: string
        default: "internal.com"
        example: "internal.com"
        description: "comma-separated list of machine or domain names excluded from using the proxy."
  # This variable is also defined by an external DiscoverVariables hook.
  - name: http-proxy
    schema:
      openAPIV3Schema:
        type: string
        default: "proxy.example.com"
        example: "proxy.example.com"
        description: "proxy for http calls."
  # External patch definitions.
  patches:
  - name: lbImageRepository
    external:
      generateExtension: generate-patches.k8s-upgrade-with-runtimesdk
      validateExtension: validate-topology.k8s-upgrade-with-runtimesdk
      # Call variable discovery for this patch.
      discoverVariablesExtension: discover-variables.k8s-upgrade-with-runtimesdk
status:
  # observedGeneration is used to check that the current version of the ClusterClass is the same as when the status was last written.
  # If metadata.generation differs from observedGeneration, Clusters using the ClusterClass should not reconcile.
  observedGeneration: xx
  # variables contains a list of all variable definitions, both inline and from external patches, that belong to the ClusterClass.
  variables:
  - name: no-proxy
    definitions:
    - namespace: inline
      required: true
      schema:
        openAPIV3Schema:
          type: string
          default: "internal.com"
          example: "internal.com"
          description: "comma-separated list of machine or domain names excluded from using the proxy."
  - name: http-proxy
    definitions:
    - namespace: inline
      schema:
        openAPIV3Schema:
          type: string
          default: "proxy.example.com"
          example: "proxy.example.com"
          description: "proxy for http calls."
    - namespace: lbImageRepository
      schema:
        openAPIV3Schema:
          type: string
          default: "different.example.com"
          example: "different.example.com"
          description: "proxy for http calls."
```
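
The grouping shown in `.status.variables` can be sketched as follows. This is an illustrative Python sketch, not the ClusterClass reconciler's code (which is written in Go); the function name `build_status_variables` and the plain-dict data shapes are assumptions made for the example.

```python
INLINE_NAMESPACE = "inline"  # reserved namespace for definitions from .spec.variables


def build_status_variables(inline_variables, discovered_by_patch):
    """Group variable definitions by name and record where each came from.

    inline_variables: list of dicts shaped like ClusterClass .spec.variables entries.
    discovered_by_patch: dict mapping an external patch name to the list of
    definitions returned by that patch's DiscoverVariables hook (hypothetical shape).
    """
    grouped = {}  # variable name -> list of namespaced definitions

    def add(namespace, variable):
        definition = {"namespace": namespace}
        definition.update({k: v for k, v in variable.items() if k != "name"})
        grouped.setdefault(variable["name"], []).append(definition)

    for variable in inline_variables:
        add(INLINE_NAMESPACE, variable)
    for patch_name, variables in discovered_by_patch.items():
        if patch_name == INLINE_NAMESPACE:
            raise ValueError("'inline' is reserved and cannot be used as a patch name")
        for variable in variables:
            add(patch_name, variable)

    return [{"name": name, "definitions": defs} for name, defs in grouped.items()]


if __name__ == "__main__":
    schema = {"openAPIV3Schema": {"type": "string"}}
    status = build_status_variables(
        [{"name": "http-proxy", "schema": schema}],
        {"lbImageRepository": [{"name": "http-proxy", "schema": schema}]},
    )
    print(status)
```

Definitions keep the order in which they were added, so inline definitions are listed before those discovered from patches.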

### Variable namespacing
Variable definitions can be inline in the ClusterClass or from any number of external DiscoverVariables hooks. The source
of a variable definition is recorded in the `namespace` field in ClusterClass `.status.variables`.
Variables that are defined by an external DiscoverVariables hook will have the name of the patch they are associated with as their namespace.
Variables that are defined in the ClusterClass `.spec.variables` will have the namespace `inline`.
Note: `inline` is a reserved namespace. To avoid conflicts, it cannot be used as the name of an external patch.

If all variables that share a name have equivalent schemas, the variables are considered `global`. `global` variables can
be set without providing a namespace - [see below](#setting-values-for-variables-in-the-cluster). The CAPI components
consider variable definitions to be equivalent when they share a name and their schemas are exactly equal.
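
As a rough illustration of the equivalence rule, the check below treats a variable as `global` only when every definition carries an identical schema. It is a Python sketch over plain dicts; the helper name `is_global` is hypothetical and the real comparison lives in the CAPI controllers.

```python
def is_global(definitions):
    """Return True when every definition of a variable has exactly the same schema.

    definitions: list of dicts shaped like the entries under
    .status.variables[].definitions (each with "namespace" and "schema").
    """
    schemas = [d.get("schema") for d in definitions]
    # Deep equality on nested dicts: any difference (type, default, ...) breaks it.
    return all(schema == schemas[0] for schema in schemas)


if __name__ == "__main__":
    inline = {"namespace": "inline",
              "schema": {"openAPIV3Schema": {"type": "string", "default": "proxy.example.com"}}}
    patch = {"namespace": "lbImageRepository",
             "schema": {"openAPIV3Schema": {"type": "string", "default": "different.example.com"}}}
    print(is_global([inline]))         # single definition
    print(is_global([inline, patch]))  # defaults differ
```

With the ClusterClass status shown earlier, `no-proxy` would be global, while `http-proxy` would not be because the two defaults differ.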

### Setting values for variables in the Cluster
Setting values for variables that have external definitions requires attention to variable namespacing, as exposed in the ClusterClass status.
Variable values are set in Cluster `.spec.topology.variables`.

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
# metadata
spec:
  topology:
    variables:
    # namespace is not needed as this variable is global.
    - name: no-proxy
      value: "internal.domain.com"
    # namespaced variables require values for each individual schema.
    - name: http-proxy
      namespace: inline
      value: http://proxy.example2.com:1234
    - name: http-proxy
      namespace: lbImageRepository
      value:
        host: proxy.example2.com
        port: 1234
```
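
The namespacing rules for values can be sketched as a small check over the Cluster's variables and the ClusterClass status. This is only an illustration in Python: the function name `check_variable_values`, the error wording, and the exact rules applied are assumptions, not the validation implemented by CAPI.

```python
def check_variable_values(cluster_variables, status_variables):
    """Return a list of problems found in Cluster .spec.topology.variables.

    cluster_variables: list of dicts with "name", "value" and optional "namespace".
    status_variables: list of dicts shaped like ClusterClass .status.variables.
    """
    definitions_by_name = {v["name"]: v["definitions"] for v in status_variables}
    problems = []
    for variable in cluster_variables:
        name = variable["name"]
        definitions = definitions_by_name.get(name)
        if definitions is None:
            problems.append(f"variable {name!r} is not defined in the ClusterClass")
            continue
        namespace = variable.get("namespace")
        if namespace is None:
            # Without a namespace the variable must be global (all schemas equal).
            schemas = [d.get("schema") for d in definitions]
            if any(schema != schemas[0] for schema in schemas):
                problems.append(f"variable {name!r} is not global and needs a namespace")
        elif namespace not in {d["namespace"] for d in definitions}:
            problems.append(f"variable {name!r} has no definition in namespace {namespace!r}")
    return problems


if __name__ == "__main__":
    status = [{"name": "http-proxy", "definitions": [
        {"namespace": "inline", "schema": {"openAPIV3Schema": {"default": "a"}}},
        {"namespace": "lbImageRepository", "schema": {"openAPIV3Schema": {"default": "b"}}},
    ]}]
    print(check_variable_values([{"name": "http-proxy", "value": "x"}], status))
```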

## Using one or multiple external patch extensions

Some considerations:
* In general a single external patch extension is simpler than many, as only one extension
then has to be built, deployed and managed.
* A single extension also requires fewer HTTP round-trips between the CAPI controller and the extension(s).
* With a single extension it is still possible to implement multiple logical features using different variables.
* When implementing multiple logical features in one extension it's recommended that they can be conditionally
Expand All @@ -50,8 +163,9 @@ Some considerations:
## Guidelines

For general Runtime Extension developer guidelines please refer to the guidelines in [Implementing Runtime Extensions](implement-extensions.md#guidelines).
This section outlines considerations specific to Topology Mutation hooks:
This section outlines considerations specific to Topology Mutation hooks.

### Patch extension guidelines
* **Input validation**: An External Patch Extension must always validate its input, i.e. it must validate that
all variables exist, have the right type and it must validate the kind and apiVersion of the templates which
should be patched.
Expand All @@ -68,9 +182,19 @@ This section outlines considerations specific to Topology Mutation hooks:
* **Avoid Dependencies**: An External Patch Extension must be independent of other External Patch Extensions. However
if dependencies cannot be avoided, it is possible to control the order in which patches are executed via the ClusterClass.
* **Error messages**: For a given request (a set of templates and variables) an External Patch Extension must
always return the same error message. Otherwise the system might became unstable due to controllers being overloaded
always return the same error message. Otherwise the system might become unstable due to controllers being overloaded
by continuous changes to Kubernetes resources as these messages are reported as conditions. See [error messages](implement-extensions.md#error-messages).
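
One way to keep error messages stable for a given request is to build them from a sorted, de-duplicated list of problems, so that iteration order over maps or sets never leaks into the message. The sketch below is a Python illustration under that assumption; `build_error_message` is a hypothetical helper, not part of any CAPI package.

```python
def build_error_message(problems):
    """Join validation problems into a single message that is stable across calls.

    Sorting and de-duplicating removes any dependence on the order in which
    problems were discovered. Timestamps, request IDs and other values that
    change between calls should be kept out of the individual problem strings.
    """
    return "; ".join(sorted(set(problems)))


if __name__ == "__main__":
    first = build_error_message(["variable b has wrong type", "variable a is missing"])
    second = build_error_message(["variable a is missing", "variable b has wrong type"])
    print(first == second)
```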

### Variable discovery guidelines
* **Distinctive variable names**: Names should be carefully chosen, and if possible generic names should be avoided.
Using a generic name could lead to conflicts if the variables defined for this patch are used in combination with other
patches providing variables with the same name.
* **Avoid breaking changes to variable definitions**: Changing a variable definition can lead to problems on existing
clusters because reconciliation will stop if variable values do not match the updated definition. When more than one variable
with the same name is defined, changes to variable definitions can require explicit values for each patch.
Updates to the variable definition should be carefully evaluated, and very well documented in extension release notes,
so ClusterClass authors can evaluate impacts of changes before performing an upgrade.
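
Before releasing a new version of an extension, authors may want to compare the old and new variable definitions for changes that are likely to break existing Clusters. The following Python sketch flags a few obvious cases. The function name `find_breaking_changes` and the chosen heuristics are assumptions for illustration, and the list of checks is not exhaustive.

```python
def find_breaking_changes(old_variables, new_variables):
    """Compare two lists of variable definitions and report risky changes.

    Both arguments are lists of dicts shaped like the `variables` entries of a
    DiscoverVariablesResponse. Only removals, top-level type changes and newly
    required variables are detected.
    """
    new_by_name = {v["name"]: v for v in new_variables}
    findings = []
    for old in old_variables:
        name = old["name"]
        new = new_by_name.get(name)
        if new is None:
            findings.append(f"{name}: variable removed")
            continue
        old_type = old["schema"]["openAPIV3Schema"].get("type")
        new_type = new["schema"]["openAPIV3Schema"].get("type")
        if old_type != new_type:
            findings.append(f"{name}: type changed from {old_type} to {new_type}")
        if new.get("required", False) and not old.get("required", False):
            findings.append(f"{name}: variable became required")
    return findings


if __name__ == "__main__":
    old = [{"name": "etcdImageTag", "schema": {"openAPIV3Schema": {"type": "string"}}}]
    new = [{"name": "etcdImageTag", "required": True,
            "schema": {"openAPIV3Schema": {"type": "object"}}}]
    for finding in find_breaking_changes(old, new):
        print(finding)
```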

## Definitions

### GeneratePatches
Expand Down Expand Up @@ -192,6 +316,81 @@ function openSwaggerUI() {
}
</script>

### DiscoverVariables

A DiscoverVariables call returns definitions for one or more variables.

#### Example Request:

* The request is a simple call to the Runtime hook.

```yaml
apiVersion: hooks.runtime.cluster.x-k8s.io/v1alpha1
kind: DiscoverVariablesRequest
settings: <Runtime Extension settings>
```

#### Example Response:

```yaml
apiVersion: hooks.runtime.cluster.x-k8s.io/v1alpha1
kind: DiscoverVariablesResponse
status: Success # or Failure
message: ""
variables:
  - name: etcdImageTag
    required: true
    schema:
      openAPIV3Schema:
        type: string
        default: "3.5.3-0"
        example: "3.5.3-0"
        description: "etcdImageTag sets the tag for the etcd image."
  - name: preLoadImages
    required: false
    schema:
      openAPIV3Schema:
        default: []
        type: array
        items:
          type: string
        description: "preLoadImages sets the images for the docker machines to preload."
  - name: podSecurityStandard
    required: false
    schema:
      openAPIV3Schema:
        type: object
        properties:
          enabled:
            type: boolean
            default: true
            description: "enabled enables the patches to enable Pod Security Standard via AdmissionConfiguration."
          enforce:
            type: string
            default: "baseline"
            description: "enforce sets the level for the enforce PodSecurityConfiguration mode. One of privileged, baseline, restricted."
          audit:
            type: string
            default: "restricted"
            description: "audit sets the level for the audit PodSecurityConfiguration mode. One of privileged, baseline, restricted."
          warn:
            type: string
            default: "restricted"
            description: "warn sets the level for the warn PodSecurityConfiguration mode. One of privileged, baseline, restricted."
...
```

For additional details, you can see the full schema in <button onclick="openSwaggerUI()">Swagger UI</button>.
TODO: Add openAPI definition to the SwaggerUI
<script>
// openSwaggerUI calculates the absolute URL of the RuntimeSDK YAML file and opens Swagger UI.
function openSwaggerUI() {
var schemaURL = new URL("runtime-sdk-openapi.yaml", document.baseURI).href
window.open("https://editor.swagger.io/?url=" + schemaURL)
}
</script>
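
To make the request and response shapes concrete, here is a minimal Python sketch of a handler that answers a DiscoverVariables call. It only mirrors the payload fields shown in the examples above. The function name `discover_variables` is hypothetical, and real extensions are typically written in Go against the Runtime SDK rather than with plain dicts.

```python
API_VERSION = "hooks.runtime.cluster.x-k8s.io/v1alpha1"


def discover_variables(request):
    """Return a DiscoverVariablesResponse-shaped dict for the given request dict."""
    response = {
        "apiVersion": API_VERSION,
        "kind": "DiscoverVariablesResponse",
        "status": "Success",
        "message": "",
        "variables": [],
    }
    if request.get("kind") != "DiscoverVariablesRequest":
        # The message depends only on the request, so it is the same on every retry.
        response["status"] = "Failure"
        response["message"] = f"unexpected request kind {request.get('kind')!r}"
        return response
    response["variables"] = [
        {
            "name": "etcdImageTag",
            "required": True,
            "schema": {
                "openAPIV3Schema": {
                    "type": "string",
                    "default": "3.5.3-0",
                    "example": "3.5.3-0",
                    "description": "etcdImageTag sets the tag for the etcd image.",
                }
            },
        }
    ]
    return response


if __name__ == "__main__":
    print(discover_variables({"apiVersion": API_VERSION, "kind": "DiscoverVariablesRequest"}))
```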


## Dealing with Cluster API upgrades with apiVersion bumps

There are some special considerations regarding Cluster API upgrades when the upgrade includes a bump
Expand Down
28 changes: 22 additions & 6 deletions docs/proposals/20220330-topology-mutation-hook.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,11 +3,11 @@ title: Topology Mutation Hook
authors:
- "@sbueringer"
- "@fabriziopandini"
- "@killianmuldoon"
reviewers:
- "@CecileRobertMichon"
- "@enxebre"
- "@vincepri"
- "@killianmuldoon"
- "@ykakarap"
creation-date: 2022-03-30
last-updated: 2022-03-30
Expand Down Expand Up @@ -58,8 +58,11 @@ Refer to the [Cluster API Book Glossary](https://cluster-api.sigs.k8s.io/referen

- **Inline patches**: are defined inline in a ClusterClass and implemented by the core CAPI controller.
- **External patches**: are patches generated by an external component.
- **Topology Mutation Hook**: is the hook defined in this proposal that allows users to plug in an external component that generates patches.
- **Topology Mutation Hook**: is a hook defined in this proposal that allows users to plug in an external component that generates patches.
- **External patch extension**: is an external component that generates patches.
- **Inline variables**: are variables defined inline in a ClusterClass.
- **External variables**: are variables defined by an external component.
- **Variable Discovery Hook**: is a hook defined in this proposal that allows an external component to supply variable definitions.

## Summary

Expand Down Expand Up @@ -96,7 +99,6 @@ The main idea behind Topology Mutation Hook is to move the complexity that is cu

### Future work

* Explore a solution how External Patch Extensions can bring their own variable definitions to shift the responsibility of variable definition and management from ClusterClass authors to External Patch Extension authors. For now it’s the responsibility of the ClusterClass author.
* Explore a solution to detect and prevent an External Patch Extension from triggering infinite reconciles


Expand All @@ -115,11 +117,12 @@ As an External Patch Extension developer:
* I want to unit test the code/logic which generates external patches.
* I want to be able to generate external patches in either JSON Patch or JSON Merge Patch format.
* I want to generate external patches based on external data, for example by querying a cloud API.
* I want to supply the variable definitions, including schema and defaulting rules, for variables used in external patches.
* I want to validate the templates after all patches have been applied, so I can be sure that other External Patch Extensions didn't overwrite my changes.

### Cluster Operator guide

As a Cluster operator, to use ClusterClasses with an External Patch Extensions you have to deploy and register it. You can find the full documentation on how to deploy a Runtime Extension in the [Runtime SDK proposal](https://github.com/kubernetes-sigs/cluster-api/blob/75b39db545ae439f4f6203b5e07496d3b0a6aa75/docs/proposals/20220221-runtime-SDK.md#deploy-runtime-extensions).
As a Cluster operator, to use ClusterClasses with an External Patch Extension you have to deploy and register it. You can find the full documentation on how to deploy a Runtime Extension in the [Runtime SDK proposal](https://github.com/kubernetes-sigs/cluster-api/blob/75b39db545ae439f4f6203b5e07496d3b0a6aa75/docs/proposals/20220221-runtime-SDK.md#deploy-runtime-extensions).

An External Patch Extension can be registered by applying:
```yaml
Expand All @@ -138,7 +141,7 @@ Once the extension is registered the discovery hook is called and the Extension

### ClusterClass author guide

A ClusterClass author can use an External Patch Extension by referencing it in a ClusterClass and adding the corresponding variable definitions.
A ClusterClass author can use an External Patch Extension by referencing it in a ClusterClass.

A ClusterClass can have external patches, inline patches or both. The patches will then be applied in the order in which they are defined. The extension fields of the external patch must match the unique name of RuntimeExtensions assigned during discovery.

Expand All @@ -153,14 +156,18 @@ spec:
- name: external-patch-1
external:
generateExtension: "http-proxy.my-awesome-patch"
discoverVariablesExtension: "variables.my-awesome-patch"
validateExtension: "http-proxy-validate.my-awesome-patch"
# inline patch
- name: region
definitions:
...
```

If the External Patch Extension requires variable definitions, they have to be added to ClusterClass.spec.variables. It is up to the External Patch Extension developer to document them including their OpenAPI schema. As future work we will explore a solution to discover variable definitions from External Patch Extensions automatically.
If the External Patch Extension requires variable definitions, they must be defined and supplied using a Variable Discovery Hook. It is up to the External Patch Extension developer to define the variables, including their OpenAPI schema.

Note: In a previous version of this proposal variables defined inline in the ClusterClass `.spec` could be used in external patches.
With the introduction of Variable Discovery, variables used in an external patch must come from an associated DiscoverVariables hook.

### Developer guide

Expand Down Expand Up @@ -219,6 +226,15 @@ Mitigations:
* External Patch Extension developers should ensure fast responses under all circumstances.
* Cluster operators can set a timeout on the RuntimeExtensionConfiguration to ensure Cluster topology reconciliation for all Clusters is not slowed down by one slow External Patch Extension. This only helps if the slow External Patch Extension is not used for all Clusters.

#### Clashing external variable definitions
Variable definitions supplied externally by an External Patch Extension through a Variable Discovery Hook can change when the definition in the External Patch Extension changes. This can lead to a clash where variables that previously had the same name and definition no longer have the same definition.

Mitigations:
* Variable Discovery Hooks allow addressing variables using namespacing, where the variable value setting in the Cluster
includes the name of the Patch as a namespace.
* ClusterClass authors should proactively test any changes to ClusterClasses and associated Runtime Extensions to avoid clashing variable definitions.
* External Patch Extension authors should extensively document their patches, variables and their usage.

## Alternatives

### Extending inline patches vs. introducing external patches
Expand Down