[Orch] Add CRD RC handler #1340

Merged
merged 41 commits into from
Oct 22, 2024

JLineaweaver
Contributor

@JLineaweaver JLineaweaver commented Aug 2, 2024

What does this PR do?

Adds remote config support for custom resources in the Operator. The new code merges the user-defined configuration with the configuration stored in remote config. Operator users with remote config enabled should no longer need a custom Operator configuration to send custom resources.
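
As a rough illustration of the merge described above, here is a minimal sketch that unions a user-defined list of custom resources with a list received from remote config. The function name, parameter names, and the "deduplicate and keep first-seen order" policy are assumptions made for this example; the PR's actual merge logic may differ.

```go
package main

import "fmt"

// mergeCustomResources returns the union of the user-defined and
// remote-config resource lists. It keeps first-seen order and drops
// duplicates. This policy is an assumption made for illustration.
func mergeCustomResources(userDefined, fromRemoteConfig []string) []string {
	seen := make(map[string]struct{}, len(userDefined)+len(fromRemoteConfig))
	merged := make([]string, 0, len(userDefined)+len(fromRemoteConfig))
	for _, list := range [][]string{userDefined, fromRemoteConfig} {
		for _, cr := range list {
			if _, dup := seen[cr]; dup {
				continue // already present, skip the duplicate
			}
			seen[cr] = struct{}{}
			merged = append(merged, cr)
		}
	}
	return merged
}

func main() {
	user := []string{"datadoghq.com/v2alpha1/datadogagents"}
	rc := []string{
		"datadoghq.com/v2alpha1/datadogagents",
		"mutations.gatekeeper.sh/v1/assign",
	}
	fmt.Println(mergeCustomResources(user, rc))
}
```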

Motivation

What inspired you to submit this pull request?

Additional Notes

Anything else we should know when reviewing?

Minimum Agent Versions

Are there minimum versions of the Datadog Agent and/or Cluster Agent required?

  • Cluster Agent: v7.42

Describe your test plan

  • RC is only available on datad0g.com. Update the Operator to use a staging API key and app key, and set DD_SITE to datad0g.com.
  • Make sure remote config is enabled on the Operator deployment: the container args must include -remoteConfigEnabled=true.
  • Make sure the Operator has list and watch permissions on * / *.
  • Update the DatadogAgent to include:
    orchestratorExplorer:
      enabled: true
    remoteConfiguration:
      enabled: true
  • Turn custom resource indexing on and off in the UI.
  • You should then see this section in the status of the DatadogAgent:
  remoteConfigConfiguration:
    features:
      orchestratorExplorer:
        customResources:
        - datadoghq.com/v2alpha1/datadogagents
        - mutations.gatekeeper.sh/v1/assign
        - kueue.x-k8s.io/v1beta1/admissionchecks
        - test.com/v1/josh
        ........
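
The status entries above appear to follow a group/version/resource shape. Assuming that three-segment format holds for every entry (the PR does not state this), a small hypothetical helper for checking the status output while testing could look like this. Neither the type nor the function exists in the Operator.

```go
package main

import (
	"fmt"
	"strings"
)

// customResource is a hypothetical parsed form of one status entry.
type customResource struct {
	Group, Version, Resource string
}

// parseCustomResource splits "group/version/resource" into its parts.
// It assumes exactly three non-empty segments, which matches the
// entries shown in the test plan but is not confirmed by the PR.
func parseCustomResource(s string) (customResource, error) {
	parts := strings.Split(s, "/")
	if len(parts) != 3 {
		return customResource{}, fmt.Errorf("expected group/version/resource, got %q", s)
	}
	for _, p := range parts {
		if p == "" {
			return customResource{}, fmt.Errorf("empty segment in %q", s)
		}
	}
	return customResource{Group: parts[0], Version: parts[1], Resource: parts[2]}, nil
}

func main() {
	cr, err := parseCustomResource("kueue.x-k8s.io/v1beta1/admissionchecks")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", cr)
}
```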

Checklist

  • PR has at least one valid label: bug, enhancement, refactoring, documentation, tooling, and/or dependencies
  • PR has a milestone or the qa/skip-qa label

@github-actions github-actions bot left a comment

This pull request does not contain a valid label. Please add one of the following labels: bug, enhancement, refactoring, documentation, tooling, dependencies

@JLineaweaver JLineaweaver added the enhancement New feature or request label Aug 2, 2024
@codecov-commenter

codecov-commenter commented Aug 2, 2024

Codecov Report

Attention: Patch coverage is 9.81595% with 147 lines in your changes missing coverage. Please review.

Project coverage is 48.65%. Comparing base (d5201ff) to head (8a16d90).
Report is 1 commits behind head on main.

Files with missing lines                                 Patch %   Lines
pkg/remoteconfig/orchestrator_k8s_crd.go                 0.00%     92 Missing ⚠️
pkg/remoteconfig/updater.go                              0.00%     48 Missing ⚠️
...tadogagent/feature/orchestratorexplorer/feature.go    69.56%    7 Missing ⚠️

@@            Coverage Diff             @@
##             main    #1340      +/-   ##
==========================================
- Coverage   48.88%   48.65%   -0.23%     
==========================================
  Files         223      224       +1     
  Lines       19728    19850     +122     
==========================================
+ Hits         9644     9658      +14     
- Misses       9577     9685     +108     
  Partials      507      507              
Flag        Coverage Δ
unittests   48.65% <9.81%> (-0.23%) ⬇️

Flags with carried forward coverage won't be shown.

Files with missing lines                                 Coverage Δ
...tadogagent/feature/orchestratorexplorer/feature.go    77.48% <69.56%> (-1.75%) ⬇️
pkg/remoteconfig/updater.go                              0.00% <0.00%> (ø)
pkg/remoteconfig/orchestrator_k8s_crd.go                 0.00% <0.00%> (ø)


@JLineaweaver JLineaweaver added this to the v1.9.0 milestone Aug 9, 2024
CoreAgent *CoreAgentFeaturesConfig `json:"config,omitempty"`
SystemProbe *SystemProbeFeaturesConfig `json:"system_probe,omitempty"`
SecurityAgent *SecurityAgentFeaturesConfig `json:"security_agent,omitempty"`
CRDs *CustomResourceDefinitionURLs `json:"crds,omitempty"`

Take this with a grain of salt since I'm not too familiar with the operator, but structurally it looks like configs are broken out by agent binary. Should the CRD configs be wrapped inside a ClusterAgentFeatureConfig type?

Contributor Author

Yeah that's a great call. I think this is a good idea for future work

@levan-m levan-m modified the milestones: v1.9.0, v1.10.0 Aug 26, 2024
@@ -137,7 +146,7 @@ func (r *RemoteConfigUpdater) Start(apiKey string, site string, clusterName stri
"",
r.serviceConf.baseRawURL,
r.serviceConf.hostname,
[]string{fmt.Sprintf("cluster_name:%s", r.serviceConf.clusterName)},
r.getTags,
Contributor Author

This is a required breaking change
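
The diff above replaces a static tag slice with r.getTags. As a hedged sketch of why a getter can be preferable, the example below uses hypothetical types (tagSource, client) rather than the real remote config client API. A function is evaluated on each call, so tags can reflect values that change after the client is constructed.

```go
package main

import (
	"fmt"
	"sync"
)

// tagSource is a hypothetical holder for values that tags derive from.
type tagSource struct {
	mu          sync.RWMutex
	clusterName string
}

func (t *tagSource) setClusterName(name string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.clusterName = name
}

// getTags builds the tag list at call time instead of once at start-up.
func (t *tagSource) getTags() []string {
	t.mu.RLock()
	defer t.mu.RUnlock()
	return []string{fmt.Sprintf("cluster_name:%s", t.clusterName)}
}

// client is a stand-in for any consumer that accepts a tag getter.
type client struct {
	tags func() []string
}

func (c *client) currentTags() []string { return c.tags() }

func main() {
	src := &tagSource{}
	src.setClusterName("staging")
	c := &client{tags: src.getTags}
	fmt.Println(c.currentTags())

	// A later change is visible without rebuilding the client.
	src.setClusterName("prod")
	fmt.Println(c.currentTags())
}
```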

@JLineaweaver JLineaweaver force-pushed the jlineaweaver/cap-1652 branch 2 times, most recently from 72892db to 994024f Compare September 20, 2024 13:52
@levan-m levan-m modified the milestones: v1.11.0, v1.10.0 Oct 16, 2024
@kangyili
Contributor

/merge

@dd-devflow

dd-devflow bot commented Oct 18, 2024

🚂 MergeQueue: waiting for PR to be ready

This merge request is not mergeable yet, because of pending checks/missing approvals. It will be added to the queue as soon as checks pass and/or get approvals.
Note: if you pushed new commits since the last approval, you may need additional approval.
You can remove it from the waiting list with /remove command.

Use /merge -c to cancel this operation!

@kangyili
Contributor

/remove

@dd-devflow

dd-devflow bot commented Oct 18, 2024

🚂 Devflow: /remove

@dd-devflow

dd-devflow bot commented Oct 18, 2024

⚠️ MergeQueue: This merge request was unqueued

This merge request was unqueued

If you need support, contact us on Slack #devflow!

@@ -65,6 +65,8 @@ github.com/DataDog/datadog-agent/pkg/proto v0.55.0-rc.10 h1:ERkVmUoDPttyVKSCJM1f
github.com/DataDog/datadog-agent/pkg/proto v0.55.0-rc.10/go.mod h1:gHkSUTn6H6UEZQHY3XWBIGNjfI3Tdi0IxlrxIFBWDwU=
github.com/DataDog/datadog-agent/pkg/remoteconfig/state v0.55.0-rc.10 h1:nwJ2JWfjCmf6tpJD1RYHh4JV5HO2Njg6smxOGi8MyOE=
github.com/DataDog/datadog-agent/pkg/remoteconfig/state v0.55.0-rc.10/go.mod h1:3yFk56PJ57yS1GqI9HAsS4PSlAeGCC9RQA7jxKzYj6g=
github.com/DataDog/datadog-agent/pkg/remoteconfig/state v0.59.0-rc.1 h1:CIXKFvUsp5zgE+egvx+QrRTw8r54FMPZVu+z25UNqWs=
Contributor

are there any changes between rc1 and rc4?

Contributor

@kangyili kangyili Oct 21, 2024

}

// Merge configs
var finalConfig DatadogAgentRemoteConfig
Contributor

@celenechang celenechang Oct 18, 2024

i wonder if it would be clearer to return ClusterAgentFeaturesConfig instead of the entire DatadogAgentRemoteConfig. i got nervous for a moment that we were going to overwrite other parts of the status

Contributor

I think it follows the default pattern, which always returns the entire configuration.

func (r *RemoteConfigUpdater) parseReceivedUpdates(updates map[string]state.RawConfig, applyStatus func(string, state.ApplyStatus)) (DatadogAgentRemoteConfig, error) {

Ultimately, we need to adhere to the function getAndUpdateDatadogAgentWithRetry, which takes the complete configuration and an update function as its input.

func (r *RemoteConfigUpdater) getAndUpdateDatadogAgentWithRetry(ctx context.Context, cfg DatadogAgentRemoteConfig, f func(v2alpha1.DatadogAgent, DatadogAgentRemoteConfig) error) error {

@@ -63,6 +65,7 @@ type DatadogAgentRemoteConfig struct {
ID string `json:"id,omitempty"`
Name string `json:"name,omitempty"`
CoreAgent *CoreAgentFeaturesConfig `json:"config,omitempty"`
ClusterAgent *ClusterAgentFeaturesConfig `json:"cluster_agent,omitempty"`
Contributor

This structure is unmarshalled from the RawConfig here:

var configData DatadogAgentRemoteConfig
if err := json.Unmarshal(c.Config, &configData); err != nil {

This config won't contain ClusterAgentFeaturesConfig, so I'm curious why it was added here.

Contributor

This struct has several updaters, and we have the CRD updater running after the agent updater, where we populate the ClusterAgent field.

rcClient.Subscribe(string(state.ProductOrchestratorK8sCRDs), r.crdConfigUpdateCallback)

if finalConfig.ClusterAgent == nil {
	finalConfig.ClusterAgent = &ClusterAgentFeaturesConfig{}
}
if finalConfig.ClusterAgent.CRDs == nil {
	finalConfig.ClusterAgent.CRDs = &CustomResourceDefinitionURLs{}
}
finalConfig.ClusterAgent.CRDs.Crds = &crds
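
The nil guards in the snippet above can be exercised in isolation. The sketch below redeclares trimmed-down versions of the structs so it compiles on its own. The Crds field type (*[]string), the JSON tags on the nested types, and the setCRDs helper name are assumptions rather than the Operator's actual definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Trimmed-down stand-ins for the types discussed in this thread.
type CustomResourceDefinitionURLs struct {
	Crds *[]string `json:"crds,omitempty"` // field type is assumed
}

type ClusterAgentFeaturesConfig struct {
	CRDs *CustomResourceDefinitionURLs `json:"crds,omitempty"`
}

type DatadogAgentRemoteConfig struct {
	ClusterAgent *ClusterAgentFeaturesConfig `json:"cluster_agent,omitempty"`
}

// setCRDs allocates each missing level of the nested pointers before
// writing, so it is safe on a zero-value config.
func setCRDs(cfg *DatadogAgentRemoteConfig, crds []string) {
	if cfg.ClusterAgent == nil {
		cfg.ClusterAgent = &ClusterAgentFeaturesConfig{}
	}
	if cfg.ClusterAgent.CRDs == nil {
		cfg.ClusterAgent.CRDs = &CustomResourceDefinitionURLs{}
	}
	cfg.ClusterAgent.CRDs.Crds = &crds
}

func main() {
	var cfg DatadogAgentRemoteConfig // every nested pointer starts nil
	setCRDs(&cfg, []string{"test.com/v1/josh"})

	out, err := json.Marshal(cfg)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```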

Contributor

Sorry if my question wasn't clear. I'm confused why the ClusterAgentFeaturesConfig is added here.

This structure

type DatadogAgentRemoteConfig struct {
	ID            string                       `json:"id,omitempty"`
	Name          string                       `json:"name,omitempty"`
	CoreAgent     *CoreAgentFeaturesConfig     `json:"config,omitempty"`
	ClusterAgent  *ClusterAgentFeaturesConfig  `json:"cluster_agent,omitempty"`
	SystemProbe   *SystemProbeFeaturesConfig   `json:"system_probe,omitempty"`
	SecurityAgent *SecurityAgentFeaturesConfig `json:"security_agent,omitempty"`
}

is for the RC product ProductAgentConfig, which supports CoreAgent, SystemProbe and SecurityAgent features. It's subscribed here:

rcClient.Subscribe(string(state.ProductAgentConfig), r.agentConfigUpdateCallback)

While

rcClient.Subscribe(string(state.ProductOrchestratorK8sCRDs), r.crdConfigUpdateCallback)

subscribes to a different product, ProductOrchestratorK8sCRDs. So I'm curious why it's put under the DatadogAgentRemoteConfig structure instead of having its own structure?
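
To picture the arrangement being questioned, the sketch below wires two product callbacks into one shared structure, each writing only its own section under a lock. The dispatcher, the placeholder product strings ("agent-config", "k8s-crds"), and the sharedStatus type are hypothetical and do not mirror the real rcClient API.

```go
package main

import (
	"fmt"
	"sync"
)

// sharedStatus is a stand-in for a config struct with one section per product.
type sharedStatus struct {
	mu            sync.Mutex
	agentFeatures []string
	crds          []string
}

// dispatcher is a hypothetical, minimal product-to-callback registry.
type dispatcher struct {
	handlers map[string]func([]string)
}

func (d *dispatcher) subscribe(product string, handler func([]string)) {
	if d.handlers == nil {
		d.handlers = make(map[string]func([]string))
	}
	d.handlers[product] = handler
}

// publish delivers a payload to the product's handler, if one exists.
func (d *dispatcher) publish(product string, payload []string) bool {
	handler, ok := d.handlers[product]
	if !ok {
		return false
	}
	handler(payload)
	return true
}

// newWiredDispatcher registers one callback per product. Each callback
// touches only its own section of the shared status.
func newWiredDispatcher(st *sharedStatus) *dispatcher {
	d := &dispatcher{}
	d.subscribe("agent-config", func(p []string) {
		st.mu.Lock()
		defer st.mu.Unlock()
		st.agentFeatures = p
	})
	d.subscribe("k8s-crds", func(p []string) {
		st.mu.Lock()
		defer st.mu.Unlock()
		st.crds = p
	})
	return d
}

func main() {
	st := &sharedStatus{}
	d := newWiredDispatcher(st)
	d.publish("agent-config", []string{"feature-a"})
	d.publish("k8s-crds", []string{"test.com/v1/josh"})
	fmt.Println(st.agentFeatures, st.crds)
}
```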

if finalConfig.ClusterAgent.CRDs == nil {
	finalConfig.ClusterAgent.CRDs = &CustomResourceDefinitionURLs{}
}
finalConfig.ClusterAgent.CRDs.Crds = &crds
Contributor

The current implementation for security products unmarshals configuration_order and uses it to merge multiple configs:

if c.Metadata.ID == "configuration_order" {
	if err := json.Unmarshal(c.Config, &order); err != nil {

Do we need similar logic for CRDs too?

Contributor

In the agentConfigUpdateCallback, we have several components that need to be updated and merged in a specific order.

rcClient.Subscribe(string(state.ProductAgentConfig), r.agentConfigUpdateCallback)
rcClient.Subscribe(string(state.ProductOrchestratorK8sCRDs), r.crdConfigUpdateCallback)

For CRDs, however, we only need to update their own section, so maintaining an order isn't necessary in this case.
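
The contrast drawn in this thread can be sketched as follows: an order-dependent merge where later layers override earlier ones, next to a plain overwrite that needs no ordering. The layer type, its FeatureA/FeatureB fields, and the "later wins" rule are assumptions for illustration. The Operator's real merge for agent configs may behave differently.

```go
package main

import "fmt"

// layer is a hypothetical partial config. A nil field means "not set here".
type layer struct {
	FeatureA *bool
	FeatureB *bool
}

// mergeInOrder applies layers in the given order. Fields set by a later
// layer override earlier ones, so the result depends on the order.
func mergeInOrder(order []string, byID map[string]layer) layer {
	var merged layer
	for _, id := range order {
		l, ok := byID[id]
		if !ok {
			continue // the order names a config that was not received
		}
		if l.FeatureA != nil {
			merged.FeatureA = l.FeatureA
		}
		if l.FeatureB != nil {
			merged.FeatureB = l.FeatureB
		}
	}
	return merged
}

// overwriteCRDs ignores the previous list and returns a copy of the
// incoming one, so no ordering information is required.
func overwriteCRDs(_ []string, incoming []string) []string {
	return append([]string(nil), incoming...)
}

func boolPtr(b bool) *bool { return &b }

func main() {
	layers := map[string]layer{
		"base":     {FeatureA: boolPtr(false), FeatureB: boolPtr(true)},
		"override": {FeatureA: boolPtr(true)},
	}
	m := mergeInOrder([]string{"base", "override"}, layers)
	fmt.Println(*m.FeatureA, *m.FeatureB)

	fmt.Println(overwriteCRDs([]string{"old.io/v1/things"}, []string{"test.com/v1/josh"}))
}
```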

@levan-m levan-m merged commit ae5b9e1 into main Oct 22, 2024
19 checks passed
@levan-m levan-m deleted the jlineaweaver/cap-1652 branch October 22, 2024 15:23
levan-m added a commit that referenced this pull request Oct 22, 2024
* [Orch] Add CRD RC handler

* Remove print statement

* Refactor to separate file

* Move last function

* Use CRD specific status

* Add cluster agent config to RC

* Correctly set product

* Fix product and add logging

* Add fixes for crd nil pointers

* Revert accidental commit

* Update dependencies and add tag getter function

* Go.mod change

* Reset go.mod

* Update remoteconfig/state

* Fix updater package and work sum

* Clean up logs and force restart of DCA on CR changes

* Add a lock around get and update of DDA

* Improve comments and change test to use orchexp for annotation

* Change to not do annotations every single time

* Fill in orchestrator explorer for tests

* Go mod update

* Modify retry logic so it doesn't it for the entire update

* Fix go.mod

* Check in config crd stuff

* Remove hard coded product and update go.mod

* Revert go.mod back and fix errors

* Go.sum update

* overwrite cr by incoming data instead of appending to the old data (#1473)

* feedback

* feedback

* rename to OrchestratorK8sCRDRemoteConfig

---------

Co-authored-by: Kangyi LI <kangyi.li@datadoghq.com>
Co-authored-by: levan-m <116471169+levan-m@users.noreply.github.com>
@levan-m levan-m mentioned this pull request Oct 22, 2024
2 tasks
levan-m added a commit that referenced this pull request Oct 22, 2024
* [Orch] Add CRD RC handler

* Remove print statement

* Refactor to separate file

* Move last function

* Use CRD specific status

* Add cluster agent config to RC

* Correctly set product

* Fix product and add logging

* Add fixes for crd nil pointers

* Revert accidental commit

* Update dependencies and add tag getter function

* Go.mod change

* Reset go.mod

* Update remoteconfig/state

* Fix updater package and work sum

* Clean up logs and force restart of DCA on CR changes

* Add a lock around get and update of DDA

* Improve comments and change test to use orchexp for annotation

* Change to not do annotations every single time

* Fill in orchestrator explorer for tests

* Go mod update

* Modify retry logic so it doesn't it for the entire update

* Fix go.mod

* Check in config crd stuff

* Remove hard coded product and update go.mod

* Revert go.mod back and fix errors

* Go.sum update

* overwrite cr by incoming data instead of appending to the old data (#1473)

* feedback

* feedback

* rename to OrchestratorK8sCRDRemoteConfig

---------

Co-authored-by: Joshua Lineaweaver <JLineaweaver@gmail.com>
Co-authored-by: Kangyi LI <kangyi.li@datadoghq.com>
Co-authored-by: Fanny Jiang <fanny.jiang@datadoghq.com>