create Pod Disruption Budget for DCA and CCR deployments #1454

Merged 14 commits on Nov 8, 2024
4 changes: 4 additions & 0 deletions api/datadoghq/v2alpha1/datadogagent_types.go
Expand Up @@ -1362,6 +1362,10 @@ type DatadogAgentComponentOverride struct {
// +optional
Replicas *int32 `json:"replicas,omitempty"`

// Set CreatePodDisruptionBudget to true to create a PodDisruptionBudget for this component.
// +optional
CreatePodDisruptionBudget *bool `json:"createPodDisruptionBudget,omitempty"`
Review comment (Contributor): I wonder if we will ever get a request to make the PDB configurable. I guess it's unlikely, since we haven't made it configurable in Helm.


// Set CreateRbac to false to prevent automatic creation of Role/ClusterRole for this component
// +optional
CreateRbac *bool `json:"createRbac,omitempty"`
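A note on the optional `*bool` field above: a pointer distinguishes "omitted" (nil) from an explicit `false`. The check in override/dependencies.go in this diff tests only for non-nil, so under that check an explicit `createPodDisruptionBudget: false` would behave like `true`; whether that was intended is not stated in the PR. The following is a minimal sketch of a tri-state check, where `componentOverride` and `shouldCreatePDB` are hypothetical names used only for illustration and are not part of the operator.

```go
package main

import "fmt"

// componentOverride is a hypothetical stand-in for DatadogAgentComponentOverride,
// reduced to the single field discussed here.
type componentOverride struct {
	CreatePodDisruptionBudget *bool
}

// shouldCreatePDB reports whether a PodDisruptionBudget was explicitly requested.
// A nil override, an omitted field, and an explicit false all return false.
func shouldCreatePDB(o *componentOverride) bool {
	return o != nil && o.CreatePodDisruptionBudget != nil && *o.CreatePodDisruptionBudget
}

func main() {
	yes, no := true, false
	fmt.Println(shouldCreatePDB(&componentOverride{}))                                // false
	fmt.Println(shouldCreatePDB(&componentOverride{CreatePodDisruptionBudget: &no}))  // false
	fmt.Println(shouldCreatePDB(&componentOverride{CreatePodDisruptionBudget: &yes})) // true
}
```

Dereferencing only after the nil check keeps the omitted case safe and makes an explicit `false` opt out.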
5 changes: 5 additions & 0 deletions api/datadoghq/v2alpha1/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions config/crd/bases/v1/datadoghq.com_datadogagents.yaml
Expand Up @@ -3567,6 +3567,9 @@ spec:
`security-agent`, `system-probe`, `trace-agent`, and `all`.
Configuration under `all` applies to all configured containers.
type: object
createPodDisruptionBudget:
description: Set CreatePodDisruptionBudget to true to create a PodDisruptionBudget for this component.
type: boolean
createRbac:
description: Set CreateRbac to false to prevent automatic creation of Role/ClusterRole for this component
type: boolean
1 change: 1 addition & 0 deletions docs/configuration.v2alpha1.md
Expand Up @@ -331,6 +331,7 @@ In the table, `spec.override.nodeAgent.image.name` and `spec.override.nodeAgent.
| [key].containers.[key].securityContext.windowsOptions.hostProcess | HostProcess determines if a container should be run as a 'Host Process' container. All of a Pod's containers must have the same effective HostProcess value (it is not allowed to have a mix of HostProcess containers and non-HostProcess containers). In addition, if HostProcess is true then HostNetwork must also be set to true. |
| [key].containers.[key].securityContext.windowsOptions.runAsUserName | The UserName in Windows to run the entrypoint of the container process. Defaults to the user specified in image metadata if unspecified. May also be set in PodSecurityContext. If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. |
| [key].containers.[key].volumeMounts `[]object` | Specify additional volume mounts in the container. |
| [key].createPodDisruptionBudget | Set CreatePodDisruptionBudget to true to create a PodDisruptionBudget for this component. |
| [key].createRbac | Set CreateRbac to false to prevent automatic creation of Role/ClusterRole for this component |
| [key].customConfigurations `map[string]object` | CustomConfiguration allows to specify custom configuration files for `datadog.yaml`, `datadog-cluster.yaml`, `security-agent.yaml`, and `system-probe.yaml`. The content is merged with configuration generated by the Datadog Operator, with priority given to custom configuration. WARNING: It is possible to override values set in the `DatadogAgent`. |
| [key].customConfigurations.[key].configData | ConfigData corresponds to the configuration file content. |
Expand Up @@ -15,11 +15,16 @@ import (
"github.com/DataDog/datadog-operator/pkg/controller/utils/comparison"

corev1 "k8s.io/api/core/v1"
policyv1 "k8s.io/api/policy/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/util/intstr"
"k8s.io/apimachinery/pkg/version"
)

const (
pdbMinAvailableInstances = 1
)

// GetClusterAgentService returns the Cluster-Agent service
func GetClusterAgentService(dda metav1.Object) *corev1.Service {
labels := object.GetDefaultLabels(dda, v2alpha1.DefaultClusterAgentResourceSuffix, GetClusterAgentVersion(dda))
Expand Down Expand Up @@ -53,6 +58,27 @@ func GetClusterAgentService(dda metav1.Object) *corev1.Service {
return service
}

func GetClusterAgentPodDisruptionBudget(dda metav1.Object) *policyv1.PodDisruptionBudget {
// labels and annotations
minAvailableStr := intstr.FromInt(pdbMinAvailableInstances)
matchLabels := map[string]string{
apicommon.AgentDeploymentNameLabelKey: dda.GetName(),
apicommon.AgentDeploymentComponentLabelKey: v2alpha1.DefaultClusterAgentResourceSuffix}
pdb := &policyv1.PodDisruptionBudget{
ObjectMeta: metav1.ObjectMeta{
Name: "datadog-cluster-agent-pdb",
Namespace: dda.GetNamespace(),
},
Spec: policyv1.PodDisruptionBudgetSpec{
MinAvailable: &minAvailableStr,
Selector: &metav1.LabelSelector{
MatchLabels: matchLabels,
},
},
}
return pdb
}

// GetMetricsServerServiceName returns the external metrics provider service name
func GetMetricsServerServiceName(dda metav1.Object) string {
return fmt.Sprintf("%s-%s", dda.GetName(), v2alpha1.DefaultMetricsServerResourceSuffix)
Expand Up @@ -11,7 +11,9 @@

appsv1 "k8s.io/api/apps/v1"
corev1 "k8s.io/api/core/v1"
policyv1 "k8s.io/api/policy/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/util/intstr"

apicommon "github.com/DataDog/datadog-operator/api/datadoghq/common"
"github.com/DataDog/datadog-operator/api/datadoghq/v2alpha1"
Expand All @@ -21,6 +23,10 @@
"github.com/DataDog/datadog-operator/pkg/defaulting"
)

const (
pdMaxUnavailableInstances = 1
)

// GetClusterChecksRunnerName return the Cluster-Checks-Runner name based on the DatadogAgent name
func GetClusterChecksRunnerName(dda metav1.Object) string {
return fmt.Sprintf("%s-%s", dda.GetName(), v2alpha1.DefaultClusterChecksRunnerResourceSuffix)
Expand Down Expand Up @@ -82,6 +88,27 @@
return template
}

func GetClusterChecksRunnerPodDisruptionBudget(dda metav1.Object) *policyv1.PodDisruptionBudget {
maxUnavailableStr := intstr.FromInt(pdMaxUnavailableInstances)
matchLabels := map[string]string{
apicommon.AgentDeploymentNameLabelKey: dda.GetName(),
apicommon.AgentDeploymentComponentLabelKey: v2alpha1.DefaultClusterChecksRunnerResourceSuffix}
pdb := &policyv1.PodDisruptionBudget{
ObjectMeta: metav1.ObjectMeta{
Name: "datadog-cluster-checks-runner-pdb",
Namespace: dda.GetNamespace(),
},
Spec: policyv1.PodDisruptionBudgetSpec{
MaxUnavailable: &maxUnavailableStr,
Selector: &metav1.LabelSelector{
MatchLabels: matchLabels,
},
},
}
fmt.Println("in GetClusterAgentPodDisruptionBudget, ", dda.GetNamespace())

Check failure on line 108 in internal/controller/datadogagent/component/clusterchecksrunner/default.go (GitHub Actions / build): use of `fmt.Println` forbidden by pattern `^(fmt\.Print(|f|ln)|print|println)$` (forbidigo)
return pdb
}
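The two budgets in this diff differ in kind: the Cluster Agent PDB sets `minAvailable: 1`, while the Cluster Checks Runner PDB sets `maxUnavailable: 1`. The sketch below is a simplified model of how many voluntary evictions each form permits, assuming integer values only (no percentages) and ignoring everything else the Kubernetes disruption controller tracks. The function names are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// clampZero returns n, or 0 when n is negative.
func clampZero(n int) int {
	if n < 0 {
		return 0
	}
	return n
}

// allowedWithMinAvailable models a PDB with an integer minAvailable:
// evictions are allowed only while healthy pods exceed that floor.
func allowedWithMinAvailable(healthy, minAvailable int) int {
	return clampZero(healthy - minAvailable)
}

// allowedWithMaxUnavailable models a PDB with an integer maxUnavailable:
// the floor is derived from the expected replica count.
func allowedWithMaxUnavailable(expected, healthy, maxUnavailable int) int {
	return clampZero(healthy - (expected - maxUnavailable))
}

func main() {
	// minAvailable=1, as in the Cluster Agent PDB.
	fmt.Println(allowedWithMinAvailable(1, 1)) // 0: a single replica cannot be evicted
	fmt.Println(allowedWithMinAvailable(2, 1)) // 1

	// maxUnavailable=1, as in the Cluster Checks Runner PDB.
	fmt.Println(allowedWithMaxUnavailable(1, 1, 1)) // 1: a single replica can still be evicted
	fmt.Println(allowedWithMaxUnavailable(3, 2, 1)) // 0: one pod is already unavailable
}
```

Under this model, a `minAvailable: 1` budget on a single-replica deployment allows no voluntary evictions, which can stall a node drain, whereas `maxUnavailable: 1` always permits one eviction while all replicas are healthy.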

// getDefaultServiceAccountName return the default Cluster-Agent ServiceAccountName
func getDefaultServiceAccountName(dda metav1.Object) string {
return fmt.Sprintf("%s-%s", dda.GetName(), v2alpha1.DefaultClusterChecksRunnerResourceSuffix)
14 changes: 11 additions & 3 deletions internal/controller/datadogagent/controller_reconcile_v2.go
Expand Up @@ -32,7 +32,7 @@

func (r *Reconciler) internalReconcileV2(ctx context.Context, request reconcile.Request) (reconcile.Result, error) {
reqLogger := r.log.WithValues("datadogagent", request.NamespacedName)
reqLogger.Info("Reconciling DatadogAgent")
reqLogger.Info("Reconciling DatadogAgent1")

// Fetch the DatadogAgent instance
instance := &datadoghqv2alpha1.DatadogAgent{}
Expand All @@ -46,10 +46,12 @@
return result, nil
}
// Error reading the object - requeue the request.
reqLogger.Error(err, "unable to fetch DatadogAgent")
return result, err
}

if instance.Spec.Global == nil || instance.Spec.Global.Credentials == nil {
reqLogger.Info("credentials not configured in the DatadogAgent, can't reconcile")
return result, fmt.Errorf("credentials not configured in the DatadogAgent, can't reconcile")
}

Expand All @@ -74,6 +76,7 @@
}*/

if result, err = r.handleFinalizer(reqLogger, instance, r.finalizeDadV2); utils.ShouldReturn(result, err) {
reqLogger.V(1).Info("finalizer error", "error", err)
return result, err
}

Expand All @@ -88,16 +91,19 @@
// Set default values for GlobalConfig and Features
instanceCopy := instance.DeepCopy()
datadoghqv2alpha1.DefaultDatadogAgent(instanceCopy)

reqLogger.Info("reconcile instance")
return r.reconcileInstanceV2(ctx, reqLogger, instanceCopy)
}

func (r *Reconciler) reconcileInstanceV2(ctx context.Context, logger logr.Logger, instance *datadoghqv2alpha1.DatadogAgent) (reconcile.Result, error) {
var result reconcile.Result
newStatus := instance.Status.DeepCopy()
now := metav1.NewTime(time.Now())

// logger = logger.WithValues("datadogagent", instance.Name, "namespace", instance.Namespace)
fmt.Println("ReconcileInstanceV2")

Check failure on line 103 in internal/controller/datadogagent/controller_reconcile_v2.go (GitHub Actions / build): use of `fmt.Println` forbidden by pattern `^(fmt\.Print(|f|ln)|print|println)$` (forbidigo)
logger.Info("ReconcileInstanceV2")
features, requiredComponents := feature.BuildFeatures(instance, reconcilerOptionsToFeatureOptions(&r.options, logger))
logger.Info("ReconcileInstanceV2", "features", features)
// update list of enabled features for metrics forwarder
r.updateMetricsForwardersFeatures(instance, features)

Expand All @@ -117,7 +123,9 @@
var errs []error

// Set up dependencies required by enabled features
for _, feat := range features {

Check failure on line 126 in internal/controller/datadogagent/controller_reconcile_v2.go (GitHub Actions / build): use of `fmt.Println` forbidden by pattern `^(fmt\.Print(|f|ln)|print|println)$` (forbidigo)
fmt.Println("Dependency ManageDependencies", "featureID", feat.ID())

Check failure on line 127 in internal/controller/datadogagent/controller_reconcile_v2.go (GitHub Actions / build): use of `fmt.Println` forbidden by pattern `^(fmt\.Print(|f|ln)|print|println)$` (forbidigo)
logger.Info("Dependency ManageDependencies", "featureID", feat.ID())
logger.V(1).Info("Dependency ManageDependencies", "featureID", feat.ID())
if featErr := feat.ManageDependencies(resourceManagers, requiredComponents); featErr != nil {
errs = append(errs, featErr)
Expand Up @@ -67,6 +67,7 @@
logger logr.Logger
disableNonResourceRules bool
otelAgentEnabled bool
PodDisruptionBudget bool

customConfigAnnotationKey string
customConfigAnnotationValue string
Expand Down Expand Up @@ -123,7 +124,6 @@
if dda.Spec.Global.DisableNonResourceRules != nil && *dda.Spec.Global.DisableNonResourceRules {
f.disableNonResourceRules = true
}

if dda.Spec.Global.Credentials != nil {
creds := dda.Spec.Global.Credentials

Expand Down Expand Up @@ -208,7 +208,6 @@
},
}
}

}

// ManageDependencies allows a feature to manage its dependencies.
Expand Down Expand Up @@ -250,6 +249,8 @@
}

if components.ClusterAgent.IsEnabled() {
f.logger.Info("Cluster Agent is enabled")
fmt.Println("Cluster Agent is enabled")

Check failure on line 253 in internal/controller/datadogagent/feature/enabledefault/feature.go (GitHub Actions / build): use of `fmt.Println` forbidden by pattern `^(fmt\.Print(|f|ln)|print|println)$` (forbidigo)
if err := f.clusterAgentDependencies(managers, components.ClusterAgent); err != nil {
errs = append(errs, err)
}
24 changes: 24 additions & 0 deletions internal/controller/datadogagent/override/dependencies.go
Expand Up @@ -14,6 +14,8 @@
"k8s.io/apimachinery/pkg/util/errors"

"github.com/DataDog/datadog-operator/api/datadoghq/v2alpha1"
componentdca "github.com/DataDog/datadog-operator/internal/controller/datadogagent/component/clusteragent"
componentccr "github.com/DataDog/datadog-operator/internal/controller/datadogagent/component/clusterchecksrunner"
"github.com/DataDog/datadog-operator/internal/controller/datadogagent/feature"
"github.com/DataDog/datadog-operator/internal/controller/datadogagent/object"
"github.com/DataDog/datadog-operator/internal/controller/datadogagent/object/configmap"
Expand Down Expand Up @@ -42,6 +44,28 @@
// Handle custom check files
checksdCMName := fmt.Sprintf(extraChecksdConfigMapName, strings.ToLower((string(component))))
errs = append(errs, overrideExtraConfigs(logger, manager, override.ExtraChecksd, namespace, checksdCMName, false)...)

if override.CreatePodDisruptionBudget != nil {
fmt.Println("override.CreatePodDisruptionBudget is not nil for", component)

Check failure on line 49 in internal/controller/datadogagent/override/dependencies.go (GitHub Actions / build): use of `fmt.Println` forbidden by pattern `^(fmt\.Print(|f|ln)|print|println)$` (forbidigo)
if component == v2alpha1.ClusterAgentComponentName {
pdb := componentdca.GetClusterAgentPodDisruptionBudget(dda)
if err := manager.Store().AddOrUpdate(kubernetes.PodDisruptionBudgetsKind, pdb); err != nil {
fmt.Println("error with created pod disruption budget")

Check failure on line 53 in internal/controller/datadogagent/override/dependencies.go (GitHub Actions / build): use of `fmt.Println` forbidden by pattern `^(fmt\.Print(|f|ln)|print|println)$` (forbidigo)
errs = append(errs, err)
} else {
fmt.Println("created pod disruption budget")

Check failure on line 56 in internal/controller/datadogagent/override/dependencies.go (GitHub Actions / build): use of `fmt.Println` forbidden by pattern `^(fmt\.Print(|f|ln)|print|println)$` (forbidigo)
}

} else if component == v2alpha1.ClusterChecksRunnerComponentName {
pdb := componentccr.GetClusterChecksRunnerPodDisruptionBudget(dda)
if err := manager.Store().AddOrUpdate(kubernetes.PodDisruptionBudgetsKind, pdb); err != nil {
fmt.Println("error with created pod disruption budget")

Check failure on line 62 in internal/controller/datadogagent/override/dependencies.go (GitHub Actions / build): use of `fmt.Println` forbidden by pattern `^(fmt\.Print(|f|ln)|print|println)$` (forbidigo)
errs = append(errs, err)
} else {
fmt.Println("created pod disruption budget")

Check failure on line 65 in internal/controller/datadogagent/override/dependencies.go (GitHub Actions / build): use of `fmt.Println` forbidden by pattern `^(fmt\.Print(|f|ln)|print|println)$` (forbidigo)
}
}
}
}

return errs
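The two branches in the hunk above differ only in which builder they call. One possible way to express that, shown here as a sketch and not as what the PR merged, is a lookup table keyed by component name that returns a wrapped error in place of the `fmt.Println` calls flagged by forbidigo. Everything in this block is a hypothetical stand-in: `pdb` replaces `*policyv1.PodDisruptionBudget`, `store` replaces the dependency store, and the component-name strings are assumed values for `v2alpha1.ClusterAgentComponentName` and `v2alpha1.ClusterChecksRunnerComponentName`.

```go
package main

import (
	"errors"
	"fmt"
)

// pdb is a hypothetical stand-in for *policyv1.PodDisruptionBudget.
type pdb struct {
	Name      string
	Namespace string
}

// store is a hypothetical stand-in for the dependency store used in the diff.
type store interface {
	AddOrUpdate(obj *pdb) error
}

// memStore keeps objects in memory; failOn simulates a store error for one name.
type memStore struct {
	objects map[string]*pdb
	failOn  string
}

func (m *memStore) AddOrUpdate(obj *pdb) error {
	if obj.Name == m.failOn {
		return errors.New("store rejected object")
	}
	m.objects[obj.Name] = obj
	return nil
}

// pdbBuilders maps a component name (assumed string values) to its PDB builder.
var pdbBuilders = map[string]func(namespace string) *pdb{
	"clusterAgent": func(ns string) *pdb {
		return &pdb{Name: "datadog-cluster-agent-pdb", Namespace: ns}
	},
	"clusterChecksRunner": func(ns string) *pdb {
		return &pdb{Name: "datadog-cluster-checks-runner-pdb", Namespace: ns}
	},
}

// addPDB stores the PDB for a component; components without a builder are skipped.
func addPDB(s store, component, namespace string) error {
	build, ok := pdbBuilders[component]
	if !ok {
		return nil
	}
	if err := s.AddOrUpdate(build(namespace)); err != nil {
		return fmt.Errorf("adding PodDisruptionBudget for %s: %w", component, err)
	}
	return nil
}

func main() {
	s := &memStore{objects: map[string]*pdb{}}
	fmt.Println(addPDB(s, "clusterAgent", "datadog"), len(s.objects))
}
```

Returning the error lets the caller append it to `errs` as the diff already does, so no print statement is needed on either path.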