feat!: monitoring multiple clusters (#266)
* feat!: monitoring multiple clusters

Signed-off-by: Dario Tranchitella <dario@tranchitella.eu>

* docs: monitoring multiple clusters

Signed-off-by: Dario Tranchitella <dario@tranchitella.eu>

---------

Signed-off-by: Dario Tranchitella <dario@tranchitella.eu>
Co-authored-by: Alex Jones <alexsimonjones@gmail.com>
Co-authored-by: Aris Boutselis <aris.boutselis@senseon.io>
3 people committed Jan 24, 2024
1 parent 0bb5d76 commit 95a67a0
Showing 7 changed files with 154 additions and 33 deletions.
65 changes: 65 additions & 0 deletions README.md
@@ -83,6 +83,71 @@ you will be able to see the Results objects of the analysis after some minutes (
"details": "The error message means that the service in Kubernetes doesn't have any associated endpoints, which should have been labeled with \"control-plane=controller-manager\". \n\nTo solve this issue, you need to add the \"control-plane=controller-manager\" label to the endpoint that matches the service. Once the endpoint is labeled correctly, Kubernetes can associate it with the service, and the error should be resolved.",
```
## Monitor multiple clusters
The `k8sgpt.ai` Operator allows monitoring multiple clusters by providing a `kubeconfig` value.
This feature is useful if you want to embrace Platform Engineering, such as running a fleet of Kubernetes clusters for multiple stakeholders.
Designed especially for Cluster API-based infrastructures, the `k8sgpt.ai` Operator is installed in the Cluster API management cluster:
this cluster is responsible for creating the required seed clusters through the infrastructure provider.
Once a Cluster API-based cluster has been provisioned, a `kubeconfig` Secret named according to the convention `${CLUSTERNAME}-kubeconfig` will be available in the same namespace.
The conventional Secret data key is `value`: it can be used to instruct the `k8sgpt.ai` Operator to monitor a remote cluster without installing any resource in the seed cluster.
```
$: kubectl get clusters
NAME              PHASE         AGE   VERSION
capi-quickstart   Provisioned   8s    v1.28.0
$: kubectl get secrets
NAME                         TYPE     DATA   AGE
capi-quickstart-kubeconfig   Opaque   1      8s
```
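The naming convention described above can be expressed in a few lines of Go. This is a minimal sketch rather than code from the operator: `secretRef` and `kubeconfigRefFor` are hypothetical names that only mirror the `name`/`key` pair expected under `/spec/kubeconfig`.

```go
package main

import "fmt"

// secretRef is a hypothetical stand-in for the name/key pair
// used under /spec/kubeconfig in the K8sGPT resource.
type secretRef struct {
	Name string
	Key  string
}

// kubeconfigRefFor applies the Cluster API convention described above:
// the Secret is named "<cluster>-kubeconfig" and the kubeconfig is
// stored under the "value" data key.
func kubeconfigRefFor(clusterName string) secretRef {
	return secretRef{
		Name: clusterName + "-kubeconfig",
		Key:  "value",
	}
}

func main() {
	ref := kubeconfigRefFor("capi-quickstart")
	fmt.Printf("name=%s key=%s\n", ref.Name, ref.Key)
}
```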
> **A security concern**
>
> If your setup requires the least privilege approach,
> a different `kubeconfig` must be provided since the Cluster API generated one is bound to the `admin` user, which has `cluster-admin` permissions.
Once you have a valid `kubeconfig`, a `k8sgpt` instance can be created as follows.
```yaml
apiVersion: core.k8sgpt.ai/v1alpha1
kind: K8sGPT
metadata:
  name: capi-quickstart
  namespace: default
spec:
  ai:
    anonymized: true
    backend: openai
    language: english
    model: gpt-3.5-turbo
    secret:
      key: api_key
      name: my_openai_secret
  kubeconfig:
    key: value
    name: capi-quickstart-kubeconfig
```
Once applied, the `k8sgpt.ai` Operator will create the `k8sgpt.ai` Deployment using the seed cluster `kubeconfig` defined in the field `/spec/kubeconfig`.
The resulting `Result` objects will be available in the same Namespace where the `k8sgpt.ai` instance has been deployed,
accordingly labelled with the following keys:
- `k8sgpts.k8sgpt.ai/name`: the `k8sgpt.ai` instance Name
- `k8sgpts.k8sgpt.ai/namespace`: the `k8sgpt.ai` instance Namespace
- `k8sgpts.k8sgpt.ai/backend`: the AI backend (if specified)
Thanks to these labels, the results can be filtered by monitored cluster,
without polluting the underlying cluster with the `k8sgpt.ai` CRDs or consuming seed compute resources,
while keeping the AI backend driver credentials confidential.
> If the `/spec/kubeconfig` field is missing, the `k8sgpt.ai` Operator will track the cluster on which it has been deployed:
> this is possible by mounting the provided `ServiceAccount`.
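As an illustration of filtering by the labels listed above, the following sketch builds an equality-based label selector string for one `k8sgpt.ai` instance. It uses only the Go standard library; `resultSelector` is a hypothetical helper, not part of the operator.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// resultSelector returns a label selector string matching the Result
// objects produced by the given k8sgpt.ai instance. The label keys are
// the ones documented above; the helper itself is hypothetical.
func resultSelector(name, namespace string) string {
	labels := map[string]string{
		"k8sgpts.k8sgpt.ai/name":      name,
		"k8sgpts.k8sgpt.ai/namespace": namespace,
	}

	// Sort the keys so the output is deterministic.
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, k+"="+labels[k])
	}
	return strings.Join(pairs, ",")
}

func main() {
	fmt.Println(resultSelector("capi-quickstart", "default"))
}
```

The resulting string could be passed to a label-aware query, for example `kubectl get results -l <selector>`, assuming the `Result` resources are exposed under the plural `results`.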
## Remote Cache
<details>
3 changes: 3 additions & 0 deletions api/v1alpha1/k8sgpt_types.go
@@ -124,6 +124,9 @@ type K8sGPTSpec struct {
RemoteCache *RemoteCacheRef `json:"remoteCache,omitempty"`
Integrations *Integrations `json:"integrations,omitempty"`
NodeSelector map[string]string `json:"nodeSelector,omitempty"`
// Define the kubeconfig the Deployment must use.
// If empty, the Deployment will use the ServiceAccount provided by Kubernetes itself.
Kubeconfig *SecretRef `json:"kubeconfig,omitempty"`
}

const (
5 changes: 5 additions & 0 deletions api/v1alpha1/zz_generated.deepcopy.go

Some generated files are not rendered by default.

10 changes: 10 additions & 0 deletions config/crd/bases/core.k8sgpt.ai_k8sgpts.yaml
@@ -115,6 +115,16 @@ spec:
                        type: boolean
                    type: object
                type: object
              kubeconfig:
                description: Define the kubeconfig the Deployment must use. If empty,
                  the Deployment will use the ServiceAccount provided by Kubernetes
                  itself.
                properties:
                  key:
                    type: string
                  name:
                    type: string
                type: object
              noCache:
                type: boolean
              nodeSelector:
18 changes: 11 additions & 7 deletions controllers/k8sgpt_controller.go
@@ -22,11 +22,6 @@ import (

corev1alpha1 "github.com/k8sgpt-ai/k8sgpt-operator/api/v1alpha1"

- kclient "github.com/k8sgpt-ai/k8sgpt-operator/pkg/client"
- "github.com/k8sgpt-ai/k8sgpt-operator/pkg/integrations"
- "github.com/k8sgpt-ai/k8sgpt-operator/pkg/resources"
- "github.com/k8sgpt-ai/k8sgpt-operator/pkg/sinks"
- "github.com/k8sgpt-ai/k8sgpt-operator/pkg/utils"
"github.com/prometheus/client_golang/prometheus"
v1 "k8s.io/api/apps/v1"
kcorev1 "k8s.io/api/core/v1"
@@ -37,6 +32,12 @@ import (
"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
"sigs.k8s.io/controller-runtime/pkg/log"
"sigs.k8s.io/controller-runtime/pkg/metrics"

+ kclient "github.com/k8sgpt-ai/k8sgpt-operator/pkg/client"
+ "github.com/k8sgpt-ai/k8sgpt-operator/pkg/integrations"
+ "github.com/k8sgpt-ai/k8sgpt-operator/pkg/resources"
+ "github.com/k8sgpt-ai/k8sgpt-operator/pkg/sinks"
+ "github.com/k8sgpt-ai/k8sgpt-operator/pkg/utils"
)

const (
@@ -151,7 +152,7 @@ func (r *K8sGPTReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctr
// Check and see if the instance is new or has a K8sGPT deployment in flight
deployment := v1.Deployment{}
err = r.Get(ctx, client.ObjectKey{Namespace: k8sgptConfig.Namespace,
- Name: "k8sgpt-deployment"}, &deployment)
+ Name: k8sgptConfig.Name}, &deployment)
if client.IgnoreNotFound(err) != nil {
k8sgptReconcileErrorCount.Inc()
return r.finishReconcile(err, false)
@@ -260,7 +261,10 @@ func (r *K8sGPTReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctr
// no longer are relevant, we can do this by using the resultSpec composed name against
// the custom resource name
resultList := &corev1alpha1.ResultList{}
- err = r.List(ctx, resultList)
+ err = r.List(ctx, resultList, client.MatchingLabels(map[string]string{
+     "k8sgpts.k8sgpt.ai/name": k8sgptConfig.Name,
+     "k8sgpts.k8sgpt.ai/namespace": k8sgptConfig.Namespace,
+ }))
if err != nil {
k8sgptReconcileErrorCount.Inc()
return r.finishReconcile(err, false)
2 changes: 1 addition & 1 deletion pkg/client/client.go
@@ -54,7 +54,7 @@ func GenerateAddress(ctx context.Context, cli client.Client, k8sgptConfig *v1alp
// Get service IP and port for k8sgpt-deployment
svc := &corev1.Service{}
err := cli.Get(ctx, client.ObjectKey{Namespace: k8sgptConfig.Namespace,
- Name: "k8sgpt"}, svc)
+ Name: k8sgptConfig.Name}, svc)
if err != nil {
return "", nil
}
84 changes: 59 additions & 25 deletions pkg/resources/k8sgpt.go
@@ -17,6 +17,7 @@ package resources
import (
"context"
err "errors"
"fmt"

"github.com/k8sgpt-ai/k8sgpt-operator/api/v1alpha1"
"github.com/k8sgpt-ai/k8sgpt-operator/pkg/utils"
@@ -29,6 +30,7 @@ import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/types"
"k8s.io/client-go/util/retry"
"k8s.io/utils/ptr"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)
@@ -39,15 +41,14 @@ type SyncOrDestroy int
const (
SyncOp SyncOrDestroy = iota
DestroyOp
- DeploymentName = "k8sgpt-deployment"
)

// GetService Create service for K8sGPT
func GetService(config v1alpha1.K8sGPT) (*corev1.Service, error) {
// Create service
service := corev1.Service{
ObjectMeta: metav1.ObjectMeta{
- Name: "k8sgpt",
+ Name: config.Name,
Namespace: config.Namespace,
OwnerReferences: []metav1.OwnerReference{
{
@@ -62,7 +63,7 @@ func GetService(config v1alpha1.K8sGPT) (*corev1.Service, error) {
},
Spec: corev1.ServiceSpec{
Selector: map[string]string{
- "app": DeploymentName,
+ "app": config.Name,
},
Ports: []corev1.ServicePort{
{
@@ -178,14 +179,14 @@ func GetClusterRole(config v1alpha1.K8sGPT) (*r1.ClusterRole, error) {
}

// GetDeployment Create deployment with the latest K8sGPT image
- func GetDeployment(config v1alpha1.K8sGPT) (*appsv1.Deployment, error) {
+ func GetDeployment(config v1alpha1.K8sGPT, outOfClusterMode bool) (*appsv1.Deployment, error) {

// Create deployment
image := config.Spec.Repository + ":" + config.Spec.Version
replicas := int32(1)
deployment := appsv1.Deployment{
ObjectMeta: metav1.ObjectMeta{
- Name: DeploymentName,
+ Name: config.Name,
Namespace: config.Namespace,
OwnerReferences: []metav1.OwnerReference{
{
@@ -202,13 +203,13 @@ func GetDeployment(config v1alpha1.K8sGPT) (*appsv1.Deployment, error) {
Replicas: &replicas,
Selector: &metav1.LabelSelector{
MatchLabels: map[string]string{
- "app": DeploymentName,
+ "app": config.Name,
},
},
Template: corev1.PodTemplateSpec{
ObjectMeta: metav1.ObjectMeta{
Labels: map[string]string{
- "app": DeploymentName,
+ "app": config.Name,
},
},
Spec: corev1.PodSpec{
@@ -273,6 +274,35 @@ func GetDeployment(config v1alpha1.K8sGPT) (*appsv1.Deployment, error) {
},
},
}
if outOfClusterMode {
// No need of ServiceAccount since the Deployment will use
// a kubeconfig pointing to an external cluster.
deployment.Spec.Template.Spec.ServiceAccountName = ""
deployment.Spec.Template.Spec.AutomountServiceAccountToken = ptr.To(false)

kubeconfigPath := fmt.Sprintf("/tmp/%s", config.Name)

deployment.Spec.Template.Spec.Containers[0].Args = append(deployment.Spec.Template.Spec.Containers[0].Args, fmt.Sprintf("--kubeconfig=%s/kubeconfig", kubeconfigPath))
deployment.Spec.Template.Spec.Containers[0].VolumeMounts = append(deployment.Spec.Template.Spec.Containers[0].VolumeMounts, corev1.VolumeMount{
Name: "kubeconfig",
ReadOnly: true,
MountPath: kubeconfigPath,
})
deployment.Spec.Template.Spec.Volumes = append(deployment.Spec.Template.Spec.Volumes, corev1.Volume{
Name: "kubeconfig",
VolumeSource: v1.VolumeSource{
Secret: &corev1.SecretVolumeSource{
SecretName: config.Spec.Kubeconfig.Name,
Items: []corev1.KeyToPath{
{
Key: config.Spec.Kubeconfig.Key,
Path: "kubeconfig",
},
},
},
},
})
}
if config.Spec.AI.Secret != nil {
password := corev1.EnvVar{
Name: "K8SGPT_PASSWORD",
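The out-of-cluster branch above derives both the Secret mount path and the `--kubeconfig` argument from the instance name. The following standalone sketch reproduces only that string derivation; `kubeconfigMount` is a hypothetical helper introduced for illustration, and the Kubernetes types are deliberately left out.

```go
package main

import "fmt"

// kubeconfigMount mirrors the string handling of the hunk above:
// the Secret is mounted under /tmp/<instance name> and the container
// is pointed at the "kubeconfig" file inside that directory.
// The function name is hypothetical and not part of the operator.
func kubeconfigMount(instanceName string) (mountPath, arg string) {
	mountPath = fmt.Sprintf("/tmp/%s", instanceName)
	arg = fmt.Sprintf("--kubeconfig=%s/kubeconfig", mountPath)
	return mountPath, arg
}

func main() {
	path, arg := kubeconfigMount("capi-quickstart")
	fmt.Println(path)
	fmt.Println(arg)
}
```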
@@ -347,35 +377,39 @@ func Sync(ctx context.Context, c client.Client,

var objs []client.Object

- svc, er := GetService(config)
- if er != nil {
-     return er
- }
+ outOfClusterMode := config.Spec.Kubeconfig != nil

- objs = append(objs, svc)
+ if !outOfClusterMode {
+     svcAcc, er := GetServiceAccount(config)
+     if er != nil {
+         return er
+     }

- svcAcc, er := GetServiceAccount(config)
- if er != nil {
-     return er
- }
+     objs = append(objs, svcAcc)

- objs = append(objs, svcAcc)
+     clusterRole, er := GetClusterRole(config)
+     if er != nil {
+         return er
+     }

- clusterRole, er := GetClusterRole(config)
- if er != nil {
-     return er
- }
+     objs = append(objs, clusterRole)

+     clusterRoleBinding, er := GetClusterRoleBinding(config)
+     if er != nil {
+         return er
+     }

- objs = append(objs, clusterRole)
+     objs = append(objs, clusterRoleBinding)
+ }

- clusterRoleBinding, er := GetClusterRoleBinding(config)
+ svc, er := GetService(config)
if er != nil {
return er
}

- objs = append(objs, clusterRoleBinding)
+ objs = append(objs, svc)

- deployment, er := GetDeployment(config)
+ deployment, er := GetDeployment(config, outOfClusterMode)
if er != nil {
return er
}