Skip to content

Commit

Permalink
update with comments
Browse files Browse the repository at this point in the history
Signed-off-by: Mario Fernandez <mariofer@redhat.com>
  • Loading branch information
marioferh committed Aug 2, 2024
1 parent 81e8830 commit d5b2dac
Showing 1 changed file with 79 additions and 42 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ last-updated: 2024-04-26
status: provisional
---

# CRD Based CMO
# CRD ClusterMonitoring

## Release Signoff Checklist

Expand All @@ -45,78 +45,114 @@ status: provisional
* End users get a much faster feedback loop. No more applying the config and scanning logs if things don't look right. The API server will give immediate feedback
* Organizational users (such as ACM) can manage a single resource and observe its status


### Goals

- Replace configmaps with CRD
- Smooth transition for users

## Proposal

## Design Details

To initiate the process, let's establish a feature gate that will serve as the entry point for implementing a CRD configuration approach. This strategy enables us to make incremental advancements without the immediate burden of achieving complete feature equivalence with the config map. We can commence with a the basics and progressively incorporate additional functionalities as they develop.

One proposal for a minimal DoD was:
- Feature gate
- CRD Initial dev https://github.com/openshift/cluster-monitoring-operator/pull/2347/
Add controller-gen logic to makefile
Add API to pkg/apis/cmo/v1
Add Generated CRD: config/crd/bases/example.com_clustermonitoringoperators.yaml
Add example CustomResource: config/examples/clustermonitoringoperator.yaml
- Client codegen: https://github.com/openshift/cluster-monitoring-operator/pull/2369
- Reconcile logic: https://github.com/openshift/cluster-monitoring-operator/pull/2350
- Add decoupling Confimgap / CustomResource:
Controller logic is strongly dependant of *manifests.Config struct.



### Overview

- Replace confimgaps with CRD:
cluster-monitoring-configmap
user-workload-monitoring-config

pkg/manifests/config.go

```
type Config struct {
Images *Images `json:"-"`
RemoteWrite bool `json:"-"`
CollectionProfilesFeatureGateEnabled bool `json:"-"`
ClusterMonitoringConfiguration *ClusterMonitoringConfiguration `json:"-"`
UserWorkloadConfiguration *UserWorkloadConfiguration `json:"-"`
}
```

### Migration path

Feature gate.
Switch mecanishm?
What to do if there are both CRD and confimgap?
Precedence CRD over configmap?
Should we compare CRD and configmap to check differences?
- Feature gate.
- Move each component to openshift/api/config
- Create two CRD's:
* ClusterMonitoringConfiguration
* UserWorkloadConfiguration
The reason for having two different CRDs is to address the needs of two distinct roles who manage them. In managed OpenShift clusters (OSD/ROSA), the platform SREs are responsible for the cluster (platform) monitoring stack, while the platform users (e.g., customers) handle the user-defined monitoring stack.º
- Review/improve/modify every configmap component to create a compliant openshif/api type. Fix erros, clean old things, improve names and config.
- Switch mecanishm
- Both ConfigMap and CRD will coexist for a while until the transition is complete.
- CRD and ConfigMap will be merged into a single structure, this structure will config CMO
- Extensive testing will be required to validate the new behavior.
- If there are different fields in CRD and ConfigMap, the field in the ConfigMap will take precedence.
- Some fields in CRD and ConfigMap diverge due to API requirements. Helper functions will be needed to resolve this.
- The end user will be able to choose to use ConfigMap or CRD at any time, keeping in mind that the fields in the CRD will gradually expand.


### Issues

- Decoupling Confimgap / CustomResource:
Controller logic is strongly dependant of *manifests.Config struct.
Should we translate CR into confimap?

- Correct name for apiVersion? In monitoring.coreos.com/v1 are all prometheus operator components, should we create a new one?
- Merge CRD and configmap could be error prone and will need extra test
- Change of types and names and how handle it when there are CRD and configmap

- Refactor and clean up operator.go client.go manifests.config
### Transition to the user

- How the user could adopt CRD instead of configmap.

### Transition to the user
## Design Details

- How the user could adopt CR instead of configmap.
To initiate the process, let's establish a feature gate that will serve as the entry point for implementing a CRD configuration approach. This strategy enables us to make incremental advancements without the immediate burden of achieving complete feature equivalence with the config map. We can commence with a the basics and progressively incorporate additional functionalities as they develop.

One proposal for a minimal DoD was:
- Feature gate in openshift/api
- Api types moved to openshift/api
- CRD Initial dev https://github.com/openshift/api/pull/1929
Add controller-gen logic to makefile
Add API to api/config/v1
Add Generated CRD: config/v1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitoring.crd.yaml
Add example CustomResource: config/examples/clustermonitoringoperator.yaml
- Client codegen
- Reconcile logic: https://github.com/openshift/cluster-monitoring-operator/pull/2350
- Add decoupling Confimgap / CustomResource:
Controller logic is strongly dependant of *manifests.Config struct.


### Example configuration


#### CRD
#### current configmap

apiVersion: cmo.example.com/v1
```
apiVersion: v1
kind: ConfigMap
metadata:
name: cluster-monitoring-config
namespace: openshift-monitoring
data:
config.yaml: |
telemeterClient:
enabled: false
prometheusK8s:
volumeClaimTemplate:
metadata:
name: prometheus-data
annotations:
openshift.io/cluster-monitoring-drop-pvc: "yes"
spec:
resources:
requests:
storage: 20Gi
```

### CRD

```
apiVersion: clustermonitoring.config.openshift.io
kind: ClusterMonitoring
metadata:
name: cluster
namespace: openshift-monitoring
spec:
telemeterClient:
enabled: true
nodeSelector:
kubernetes.io/os: linux
tolerations:
- operator: Exists
prometheusK8s:
volumeClaimTemplate:
metadata:
Expand All @@ -126,7 +162,8 @@ spec:
spec:
resources:
requests:
storage: 20Gi
storage: 20G`
```


### Test Plan
Expand Down

0 comments on commit d5b2dac

Please sign in to comment.