Datadog Integration #3407
Thanks Nate! This looks pretty good and is right on par with what I'd expect for your first PR related to the helm chart changes. Many of the changes I am requesting are similar to what was requested of me when I did my first batch of helm changes.
Most of it comes down to the names of things, adding toggle checks, and adding bats tests to cover the checks.
I'd try to keep the names of things as close as possible to what is listed on agent/config/config-files, try to avoid using the same term twice (e.g. metrics), and avoid 'agent' where possible. Naming is difficult because it is hard to know whether we should follow the Consul config names or the patterns that are already in the helm chart. 🤷
d8aa8a0 to 1632e1d
5988346 to 7fa97cc
b9cfe7c to 13c6cc2
…n with dd specific items as well as additional missing VM based options, unit tests, dd unix socket integration, dd agent acl token generation, deployment override failsafes
…n with dd specific items as well as additional missing VM based options, unit tests, dd unix socket integration, dd agent acl token generation | final initial-push
…e_debug) and telemetry.config update | enable_debug to server.config
…per template function as precheck
13c6cc2 to 85abcc8
LGTM!
* datadog-integration: updated consul-server agent telemetry-config.json with dd specific items as well as additional missing VM based options, unit tests, dd unix socket integration, dd agent acl token generation, deployment override failsafes
* datadog-integration: updated consul-server agent telemetry-config.json with dd specific items as well as additional missing VM based options, unit tests, dd unix socket integration, dd agent acl token generation | final initial-push
* changelog entry update
* datadog-integration: updated consul-server agent server.config (enable_debug) and telemetry.config update | enable_debug to server.config
* curt pr review changes (minus extraConfig templating verification changes)
* global.metrics.AgentMetrics -> global.metrics.enableAgentMetrics
* dogstatsd and otlp mutually exclusive verification checks
* breaking changes now incorporated into consul.validateExtraConfig helper template function as precheck
* extraConfig hash updates post merge conflict update
* fix helpers.tpl consul.extraConfig from merge --> /consul/tmp/extra-config/extra-from-values.json | add labels to rolebinding for datadog secrets
* update changelog .txt to match new PR number
* updated server-statefulset.yaml to correct ad.datadoghq.com/consul.logs annotation to valid single quote string
* fix helpers.tpl consul.extraConfig from merge --> /consul/tmp/extra-config/extra-from-values.json | add labels to rolebinding for datadog secrets
* fix helpers.tpl consul.extraConfig from merge --> /consul/tmp/extra-config/extra-from-values.json | add labels to rolebinding for datadog secrets
* update UDP dogstatsdPort behavior to exclude including a port value if using a kube service address (as determined by user overrides)
* update _helpers.tpl consul.ValidateDatadogConfiguration func to account for using 'https' as protocol => should fail
* update server-statefulset.yaml to exclude prometheus.io annotations if enabling datadog openmetrics method for consul server metrics scrape. conflict present with http vs https that breaks openmetrics scrape on consul
* update server-statefulset.yaml to exclude prometheus.io annotations if enabling datadog openmetrics method for consul server metrics scrape. conflict present with http vs https that breaks openmetrics scrape on consul
* correct otlp protocol helpers.tpl check to lower-case the protocol to match the open-telemetry-deployment.yaml behavior
* fix server-acl-init command_test.go for datadog token policy - datacenter should have been dc1
* add in server-statefulset bats test for extraConfig validation testing
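The validation-related commits in this list (dogstatsd and otlp being mutually exclusive, 'https' as a protocol failing, and the protocol being lower-cased before the check) can be illustrated with a small Go sketch. This is not the chart's implementation, which lives in Helm template helpers such as consul.ValidateDatadogConfiguration; the `TelemetryOptions` type, its field names, the `ValidateTelemetry` function, and the error messages are all hypothetical and exist only to show the shape of the checks.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// TelemetryOptions is a hypothetical stand-in for the chart values that
// the validation commits refer to; the real values live in values.yaml.
type TelemetryOptions struct {
	DogstatsdEnabled bool
	OTLPEnabled      bool
	OTLPProtocol     string
}

// ValidateTelemetry mirrors the checks described in the commit list:
// dogstatsd and otlp may not both be enabled, and an 'https' protocol
// is rejected after lower-casing the configured value.
func ValidateTelemetry(o TelemetryOptions) error {
	if o.DogstatsdEnabled && o.OTLPEnabled {
		return errors.New("dogstatsd and otlp are mutually exclusive; enable only one")
	}
	// Lower-case first so "HTTPS" and "https" are treated the same.
	if o.OTLPEnabled && strings.ToLower(o.OTLPProtocol) == "https" {
		return errors.New("otlp protocol 'https' is not supported")
	}
	return nil
}

func main() {
	cases := []TelemetryOptions{
		{DogstatsdEnabled: true, OTLPEnabled: true, OTLPProtocol: "http"},
		{OTLPEnabled: true, OTLPProtocol: "HTTPS"},
		{OTLPEnabled: true, OTLPProtocol: "http"},
	}
	for _, c := range cases {
		fmt.Printf("%+v -> %v\n", c, ValidateTelemetry(c))
	}
}
```

In the chart itself a failed check of this kind would abort template rendering, which is what the bats tests mentioned in the review would assert on.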
* Datadog Integration (#3407)
* datadog-integration: updated consul-server agent telemetry-config.json with dd specific items as well as additional missing VM based options, unit tests, dd unix socket integration, dd agent acl token generation, deployment override failsafes
* datadog-integration: updated consul-server agent telemetry-config.json with dd specific items as well as additional missing VM based options, unit tests, dd unix socket integration, dd agent acl token generation | final initial-push
* changelog entry update
* datadog-integration: updated consul-server agent server.config (enable_debug) and telemetry.config update | enable_debug to server.config
* curt pr review changes (minus extraConfig templating verification changes)
* global.metrics.AgentMetrics -> global.metrics.enableAgentMetrics
* dogstatsd and otlp mutually exclusive verification checks
* breaking changes now incorporated into consul.validateExtraConfig helper template function as precheck
* extraConfig hash updates post merge conflict update
* fix helpers.tpl consul.extraConfig from merge --> /consul/tmp/extra-config/extra-from-values.json | add labels to rolebinding for datadog secrets
* update changelog .txt to match new PR number
* updated server-statefulset.yaml to correct ad.datadoghq.com/consul.logs annotation to valid single quote string
* fix helpers.tpl consul.extraConfig from merge --> /consul/tmp/extra-config/extra-from-values.json | add labels to rolebinding for datadog secrets
* fix helpers.tpl consul.extraConfig from merge --> /consul/tmp/extra-config/extra-from-values.json | add labels to rolebinding for datadog secrets
* update UDP dogstatsdPort behavior to exclude including a port value if using a kube service address (as determined by user overrides)
* update _helpers.tpl consul.ValidateDatadogConfiguration func to account for using 'https' as protocol => should fail
* update server-statefulset.yaml to exclude prometheus.io annotations if enabling datadog openmetrics method for consul server metrics scrape. conflict present with http vs https that breaks openmetrics scrape on consul
* update server-statefulset.yaml to exclude prometheus.io annotations if enabling datadog openmetrics method for consul server metrics scrape. conflict present with http vs https that breaks openmetrics scrape on consul
* correct otlp protocol helpers.tpl check to lower-case the protocol to match the open-telemetry-deployment.yaml behavior
* fix server-acl-init command_test.go for datadog token policy - datacenter should have been dc1
* add in server-statefulset bats test for extraConfig validation testing
* manual cherry-pick failed checks fix
* revert leave_on_terminate and autopilot updates from commit #3000 and re-apply datadog-integration branch changes
…ycleShutdown… into release/1.4.x (#4007) * Fix meshgw tests (#3532) * Fix meshgw tests * change protocol on mesh gw tests to tcp from mesh * add nightly for rc branch (#3533) * [NET-7243] Stub APIGateway Controller for v2 (#3507) * stub api-gateway-controller * Add setup to v2 controller * Net 7376 Status struct on api gateway with required info from kubesig (#3530) * add status structs * update status * updated script to point at RC version correctly (#3541) * updated script to point at RC version correctly * Mw/prepare main for 1.5 dev (#3535) * bump versions to next version * updated script to handle new Consul-k8s images * [COMPLIANCE] Add Copyright and License Headers (#3499) Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com> * Net 7279 consul k8s write failing acceptance test for tcp route (#3540) * add status structs * update status * fixtures for v2 * checkpoint * add hook to only run test when flag is enabled * clean up reversions, delte extra files * remove http listeners * delete extra file * revert accidental IDE changes * clean up lint issues * Add json tags to api-gateway types (#3550) * reconcile consul-k8s with changes made in Consul (#3543) * [NET-7656] Add GatewayClassConfig watch for MeshGateway controller (#3537) * Add GatewayClass[Config] watches for MeshGateway controller * Update merge logic for deployment + service * Add test coverage for MergeDeployment * Add test coverage for MergeService * Copy over owner references to new Service + Deployment * Ensure signals are passed to commands (#3548) * Ensure signals are passed to commands Change `/bin/sh -ec "<command>"` to `/bin/sh -ec "exec <command>"`. Adding `exec` ensures that `<command>` is not executed as a child process but replaces the `/bin/sh` process. This ensure that `<command>` receives any signals. Specifically this is an issue when attempting to trap SIGTERMs as part of graceful pod shutdown. 
Without this change, we weren't receiving any signals because they aren't passed down by `/bin/sh -c`. * Fix broken bats tests and add changelog Signed-off-by: Ashwin Venkatesh <ashwin.what@gmail.com> --------- Signed-off-by: Ashwin Venkatesh <ashwin.what@gmail.com> Co-authored-by: Ashwin Venkatesh <ashwin.what@gmail.com> * [NET-7158] CRUD hooks for api gateway v2 (#3519) * Add hooks for CRUD side effects for apigateway controller * Added tests for controller * [NET-6465] Respect connectInject.initContainer.resources for v1 API gateways (#3531) * Respect connectInject.initContainer.resources for v1 API gateways * Add changelog entry * Add test coverage for init container resources on API gateway Pods * Add NET_BIND_SERVICE to the security context in the deployment of Mesh Gateway (NET-6463) (#3549) * Add NET_BIND_SERVICE to the security context in the deployment of Mesh Gateway * [NET-7657,NET-6934] Define v2 GatewayClass + GatewayClassConfig locally (#3559) * Define GatewayClass's spec model locally instead of consuming proto from Consul * Update gateway resources job to use new types, constants * Make description optional, regenerate CRD definitions * Remove GatewayClass columns related to syncing into Consul * [NET-7156] Gateways Controllers Reusability (#3574) * make controller setup for gateway controllers generic and reusable, add indices onto gateway resources in k8s for more efficient lookups * cleanup from PR review * Update control-plane/controllers/resources/gateway_controller_setup.go Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> * Update control-plane/controllers/resources/gateway_indices.go Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> * Update control-plane/controllers/resources/gateway_controller_setup.go Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> * Update control-plane/controllers/resources/gateway_controller_setup.go Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> * clean up from PR 
review --------- Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> * [NET-6465] Consider init container resources when determining if existing + desired deployments are equal (#3575) * Consider init container resources when determining if existing + desired deployments are equal * Add test coverage for compareDeployments * Update control-plane/api-gateway/gatekeeper/deployment_test.go * [NET-7657] Consume version of proto-public with GatewayClass[Config] removed (#3581) [NET-7657] Consume version of proto-public with GatewayClass + GatewayClassConfig removed * Update multicluster v2beta1 to v2 (#3560) Co-authored-by: skpratt <sarah.pratt@hashicorp.com> * [NET-7156] Generalize MeshGatewayBuilder to just GatewayBuilder (#3538) * update gateway builder to be generic * Add api gateway to gateway builder * Updated service test for gateway listeners/ports * update test names * update listener functions * remove check for listener name * fix tests * release: Update 10-util.sh to adjust formatting (#3588) Update 10-util.sh * use go 1.21.7 (#3591) * add make target script (#3596) add new make target for go mod tidy check * v2tenancy: namespace mirroring acceptance tests (#3590) * add linting back (#3603) added linting back * [COMPLIANCE] Add Copyright and License Headers (#3610) Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com> * Datadog Integration (#3407) * datadog-integration: updated consul-server agent telemetry-config.json with dd specific items as well as additional missing VM based options, unit tests, dd unix socket integration, dd agent acl token generation, deployment override failsafes * datadog-integration: updated consul-server agent telemetry-config.json with dd specific items as well as additional missing VM based options, unit tests, dd unix socket integration, dd agent acl token generation | final initial-push * changelog entry update * datadog-integration: updated consul-server agent 
server.config (enable_debug) and telemetry.config update | enable_debug to server.config * curt pr review changes (minus extraConfig templating verification changes) * global.metrics.AgentMetrics -> global.metrics.enableAgentMetrics * dogstatsd and otlp mutually exclusive verification checks * breaking changes now incorporated into consul.validateExtraConfig helper template function as precheck * extraConfig hash updates post merge conflict update * fix helpers.tpl consul.extraConfig from merge --> /consul/tmp/extra-config/extra-from-values.json | add labels to rolebinding for datadog secrets * update changelog .txt to match new PR number * updated server-statefulset.yaml to correct ad.datadoghq.com/consul.logs annotation to valid single quote string * fix helpers.tpl consul.extraConfig from merge --> /consul/tmp/extra-config/extra-from-values.json | add labels to rolebinding for datadog secrets * fix helpers.tpl consul.extraConfig from merge --> /consul/tmp/extra-config/extra-from-values.json | add labels to rolebinding for datadog secrets * update UDP dogstatsdPort behavior to exclude including a port value if using a kube service address (as determined by user overrides) * update _helpers.tpl consul.ValidateDatadogConfiguration func to account for using 'https' as protocol => should fail * update server-statefulset.yaml to exclude prometheus.io annotations if enabling datadog openmetrics method for consul server metrics scrape. conflict present with http vs https that breaks openemtrics scrape on consul * update server-statefulset.yaml to exclude prometheus.io annotations if enabling datadog openmetrics method for consul server metrics scrape. 
conflict present with http vs https that breaks openemtrics scrape on consul * correct otlp protocol helpers.tpl check to lower-case the protocol to match the open-telemetry-deployment.yaml behavior * fix server-acl-init command_test.go for datadog token policy - datacenter should have been dc1 * add in server-statefulset bats test for extraConfig validation testing * Net 7238 - consul k8s modify gateway resources job to create apigw gatewayclass and gatewayclassconfig (#3564) * configmap update * udpate chart to respect api-gateway-config * fix typo * added unit tests, added some stuff missed in initial pass * added thorough unit tests for gateway-resources-configmap.yaml * remove unneeded extra line * additional debugging * test * test * remove extra escapes * final test * test again * one more test * this should work * fix spacing issue * Fix logic on apigateway that ignores current annotations on services (#3597) * [NET-7449] Generalize CRUD hooks for Gateways (#3576) Generalize the crud hooks for gateways * [NET-5932] chore: remove comment from closed ticket (#3636) chore: remove comment from closed ticket * [NET-2420] security: Upgrade helm containerd and several other dependencies (#3625) * security: upgrade helm/v3 to 3.13.3 Addresses multiple CVEs: - CVE-2023-25165 - CVE-2022-23524 - CVE-2022-23526 - CVE-2022-23525 * chore: upgrade k8s dependencies to match controller-runtime * security: upgrade containerd to latest Addresses GHSA-7ww5-4wqc-m92c (GO-2023-2412) * security: upgrade docker/docker to latest Addresses GHSA-jq35-85cj-fj4p * security: upgrade docker/distribution to latest Addresses CVE-2023-2253 * security: upgrade filepath-securejoin to latest patch Addresses GHSA-6xv5-86q9-7xr8 (GO-2023-2048) * chore: upgrade oras-go to fix docker incompatibility * Add changelog * build: Create arm64 packages as well (#3428) During the CRT on-boarding, packaging for other Linux architectures (arm64) was not enabled. 
This change adds packaging support for those architectures. I've specifically opted not to include 32-bit. See #1132. Related to hashicorp/releng-support#178. Other related updates: - To make future support a bit easier, I've enabled the build workflow from releng prefixed branches. - Using qemu emulation for testing package installs on other architectures, thus allowing us to validate the binaries work as intended - Minor alteration to the package install tests to use yum instead of rpm Co-authored-by: David Yu <dyu@hashicorp.com> * [NET-2420] security: re-enable security scan release block (#3628) * security: upgrade helm/v3 to 3.13.3 Addresses multiple CVEs: - CVE-2023-25165 - CVE-2022-23524 - CVE-2022-23526 - CVE-2022-23525 * chore: upgrade k8s dependencies to match controller-runtime * security: upgrade containerd to latest Addresses GHSA-7ww5-4wqc-m92c (GO-2023-2412) * security: upgrade docker/docker to latest Addresses GHSA-jq35-85cj-fj4p * security: upgrade docker/distribution to latest Addresses CVE-2023-2253 * security: upgrade filepath-securejoin to latest patch Addresses GHSA-6xv5-86q9-7xr8 (GO-2023-2048) * chore: upgrade oras-go to fix docker incompatibility * Add changelog * security: re-enable security scan release block This was previously disabled due to an unresolved false-positive CVE. Re-enabling both secrets and OSV + Go Modules scanning, which per our current scan results should not be a blocker to future releases. Also add security scans on PR and merge to protected branches to allow proactive triage going forward. See hashicorp/consul#19978 for similar change in that repo, adapted here. * [NET-8174] security: add scan triage for CVE-2024-25620 (helm/v3) (#3657) security: add scan triage for CVE-2024-25620 (helm/v3) Triage this scan result as `consul-k8s` should not be directly impacted and it is medium severity. Follow-up ticket filed for remediation. Also improve formatting of scan config since this change will be backported. 
* Update main changelog for 1.1.10, 1.2.6 and 1.3.3 (#3662) * Update main changelog for 1.1.10, 1.2.6 and 1.3.3 * include previous missed releases * [COMPLIANCE] Add Copyright and License Headers (#3654) Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com> * [NET-7450] setup crud hooks for APIGateway v2 (#3580) * setup crud hooks for APIGateway v2 * update CRDS and reorganize code in api gateway type * pass in gateway kind for annotations * Fix tests * Fix tests * register all types needed for test * values.yaml - tlsServerName docs (#3656) * Update values.yaml Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> --------- Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> * [NET-6741] make: Add target for updating dependencies across all modules (#3669) make: Add target for updating dependencies across all modules To enable more consistent and error-proof dependency management, add a Make target that will set a dependency version across all submodules that require it. Also runs `go mod tidy`. This first ensures the dependency addition is reverted if the module in question does not require it; it also ensures that any additional cleanup needed in `go.mod`/`go.sum` is applied. 
* build.yml: Add ECR images back (#3668) * Update build.yml * Create 3668.txt * build.yml: typo on tags (#3681) * bump kind to v0.22.0 and update k8s support (#3675) * bump kind to v0.22.0 and update k8s support * Create 3675.txt * Update README.md * [NET-8174] security: add scan triage for CVE-2024-26147 (helm/v3) (#3688) security: add scan triage for CVE-2024-26147 (helm/v3) * chore: upgrade Consul dependencies to latest (#3695) * chore: upgrade Consul dependencies to latest * chore: upgrade control-plane submodule dependencies to latest * fix: update GatewayClass finalizer reference * release: add \n to end of NOTE for releases (#3700) * Update 10-util.sh * Update control-plane/build-support/functions/10-util.sh Co-authored-by: Michael Zalimeni <michael.zalimeni@hashicorp.com> --------- Co-authored-by: Michael Zalimeni <michael.zalimeni@hashicorp.com> * chore: upgrade `consul/api` to latest (#3702) chore: upgrade consul/api to latest v1.28.0 was retracted due to double-publish. * [NET-8174] security: add triage alias for GO-2024-2554 (#3705) security: add triage alias for GO-2024-2554 This vulnerability was already triaged via its GHSA alias, but the scanner is flagging it under this name, so adding an explicit entry. 
* docs: update `CHANGELOG` for K8s 1.4.0 release (#3710) docs: update CHANGELOG for K8s 1.4.0 release * docs: update 1.4.0 Helm docs per Docs team feedback (#3714) * [NET-8367] security: upgrade google.golang.org/protobuf to 1.33.0 (#3719) * update protobuf lib * add changelog * NET-6878: Fix Flake API Gateway Acceptance (#3717) * test upgraded library * remove toolchain reference * add toolchain * NET-8391: fix cleanup script (#3725) * NET-8391: fix cleanup script * cleanup testing comments * NET-8391: fix cleanup script - remove network interface(s) (#3730) * cleanup network interfaces * clean up test * updates k8s version (#3731) * fix(control-plane): acl tokens deleted while pods in graceful shutdown (#3736) * NET-6878: Remove finalizers from CRDs during test resource cleanup (#3739) * remove finalizers from crds * add comments * Upgrade to go 1.21.8 (#3741) * Upgrade to use Go `1.21.8`. This resolves CVEs [CVE-2024-24783](https://nvd.nist.gov/vuln/detail/CVE-2024-24783) (`crypto/x509`). [CVE-2023-45290](https://nvd.nist.gov/vuln/detail/CVE-2023-45290) (`net/http`). [CVE-2023-45289](https://nvd.nist.gov/vuln/detail/CVE-2023-45289) (`net/http`, `net/http/cookiejar`). [CVE-2024-24785](https://nvd.nist.gov/vuln/detail/CVE-2024-24785) (`html/template`). [CVE-2024-24784](https://nvd.nist.gov/vuln/detail/CVE-2024-24784) (`net/mail`). Update the Consul Build Go base image to `alpine3.19`. 
This resolves CVEs [CVE-2023-52425](https://nvd.nist.gov/vuln/detail/CVE-2023-52425) [CVE-2023-52426](https://nvd.nist.gov/vuln/detail/CVE-2023-52426) * Add changelog * Fix typo in values file for sync catalog test (#3760) * upgraded helm v3 to address GHSA-jw44-4f3j-q396 (#3768) * disable scan for "GHSA-jw44-4f3j-q396" until patch fix in helm v3 * addressed comments * Net 6821 - Regenerate Terminating Gateway CRD with new field (#3737) * initial updates * regen crds * Add fixes for flaky-cni and failing cloud-nightly tests (#3764) Add fixes for flaky-cni * Catalog: Use EndpointSlice and propagate Kubernetes Topology information to synced consul service (#3693) * Use EndpointSlice and propagate zone metadata to consul service * Fix tests * Add test for zone metadata * Cleanup and changelog entry * Fix clusterrole permissions and type on Informer * Include region info for NodePort services * Include topology region for all service types * Update release note * Fix tests * fix sync-catalog-clusterrole and tests * fix stash conflict * adding endpoints permission back to sync catalog since it still uses it. * Fix endpointslice map * Fix topology region * Remove region lookups, remove endpoints permissions, use pointers for endpointslice map * Drop region test --------- Co-authored-by: John Murret <john.murret@hashicorp.com> * Increase timeout for running commands in acceptance test (#3784) increase timeout for running commands * Bugfix: Don't recreate servicemap for catalog sync (#3785) * test: fix TestConnectInject_ProxyLifecycleShutdown (#3774) * Removes Legacy API Gateway Stanza that was deprecated in Consul 1.16 (#3718) * Removes Legacy API Gateway Stanza that was deprecated in Consul 1.16 * remove unit test for previously removed `consul-cni` validation (#3794) In #1527, we added support for OpenShift and Multus, which meant that the `consul-cni` plugin was no longer necessarily the final CNI plugin run. 
While working on a patch to allow compatibility with Nomad transparent proxy, I discovered we'd never removed a now-failing unit test of the plugin for the validation step. It looks like the remaining unit tests still cover the remaining validation, so we can safely remove this test. Ref: #1527 Ref: hashicorp/nomad#10628 * [NET-8412] Fix order of APIGW ACL policy/role creation (#3779) * Reorder gateway policy and role creation to avoid error messages in consul when policy/role already exists * refactor for readability * fix spacing * Added changelog * improve reliability of acceptance tests (#3800) * improve reliability of acceptance tests * remove update to timeout * add output to error * [net-8411] bug: fix premature token and service instance deletion due to pod fetch errors (#3758) * API gateway metrics (#3811) * First metrics pass * Fix up build * move to non-deprecated chart options * Fix up charts and defaults * Add changelog * Fix bad merge * Fix test * fix linter error * Fix extra yaml block from bad merge * Switch == true check to use ParseBool * Add support for Nomad transparent proxy (#3795) Nomad will implement support for Connect transparent proxy. Unlike in K8s, the CNI plugin can't contact the Nomad API to read allocation metadata (pod labels) to get the iptables configuration, and doesn't use the rest of the Consul-K8s control plane to inject that metadata. Instead, Nomad will pass the iptables configuration JSON-serialized in the CNI arguments. This changeset implements the behavior switch by detecting the `CONSUL_IPTABLES_CONFIG` argument in the CNI arguments. This hypothetically allows for non-Nomad workflows to use the same code path, if desired. Ref: hashicorp/nomad#10628 * fix version output for `consul-cni` (#3829) The `consul-cni` plugin emits "version unknown" because the CNI library's `PluginMain` uses a global variable that isn't being set as part of our build process. 
Import the `control-plane/version` package so that we have an identical version in builds across both binaries. * [NET-8601] Upgrade `vault/api` and `docker/docker` to resolve open CVEs (#3837) * security: upgrade vault/api to remove go-jose.v2 * security: upgrade docker/docker to v25.0.5 * add changelog * Remove anyuid SCC requirement for OpenShift (#3813) Remove SCC requirement for anyuid for OpenShift * Cleanup formatting to follow consul-k8s standard (#3852) * Datadog Unix Socket Path Custom Path fix (#3635) * Update dogstatsd hostPath rendering for Unix domain sockets -- override customizable and volumeMount/volume should align * changelog update * changelog: reviewer update to include datadog specific context * readd dev image tags for fips ubi (#3881) * readd dev image tags for fips ubi * fix up bad copy paste * [net-7710] don't overwrite prometheus path annotation if it's already been specified (#3846) don't overwrite prometheus path annotation if it's already been specified * feat: Add startup-grace-period-seconds and graceful-startup-path (#3878) * feat: Add startup-grace-period-seconds and graceful-startup-path * Add changelog --------- Co-authored-by: Michael Zalimeni <michael.zalimeni@hashicorp.com> * NET-8594: Disable TestSyncCatalog (#3815) * [NET-8946 NET-8947 NET-8948] security: bump go, x/net and envoy versions (#3893) security: bump go and x/net * NET-8594: Disable TestSyncCatalogIngress (#3904) * Helm: support sync-lb-services-endpoints for sync catalog (#3905) * Helm: support sync-lb-services-endpoints for sync catalog * add test * fix template tag order --------- Co-authored-by: jukie <10012479+Jukie@users.noreply.github.com> * Datadog Integration Acceptance Tests / Bug fixes (#3685) * datadog: acceptance tests - initial commit (not fully working yet) * server-statefulset: update logic for prometheus annotations (only enabled if using dogstatsd, otherwise disabled) * datadog: acceptance test working with dd-client api and operator deployment 
frameword * datadog-acceptance: main branch rebase merge conflict cherry-pick * datadog: acceptance testing update to metric name matching using regex * datadog: acceptance testing helper update for backoff retry * datadog: acceptance testing working timeseries query verification udp + uds * datadog: update helpers for /v1/query * server-statefulset.yaml: update to correct release name prepend to consul-server URL * datadog: acceptance testing consul integration checks working * server-statefulset: yaml and bats updates for datadog openmetrics and consul integration check URLs to use consul.fullname-server * PR3685: changelog update * datadog: openmetrics acceptance test update * datadog: added OTEL_EXPORTER_OTLP_ENDPOINT to consul telemetry collector deployment for dd-agent ingestion (passes tag info to DD) * otlp: datadog otlp acceptance test updates for telemetry-collector (grpc => http prefix) | staged otlp acceptance test * datadog-acceptance: fake-intake fixture addition * datadog-acceptance: update _helpers.tpl for consul version sanitization (truncate to <64) * datadog-acceptance: update base fixture for fake-intake * datadog-acceptance: add DogstatsD stats enablement (required for curling agent local endpoint) * datadog-acceptance: add DogstatsD stats enablement (required for curling agent local endpoint) * datadog-acceptance: first-round fake-intake testing - works but is innaccurate * datadog-acceptance: datadog framework - remove dd client agent requirement (fake-intake) * datadog-acceptance: update flags to not require API and APP key (fake-intake) * datadog-acceptance: go mod updates for uuid downgrade * acceptance-test: remove otlp acceptance test -- no fake-intake or agent endpoint to verify * datadog-acceptance: acceptance test lint fixes * acceptance-test: update control-plane/cni/main.go l:272 comment with period for lint testing. 
* acceptance-test: retry lint fixes * acceptance-test: correct telemetry collector URL from grpc:// to http:// * [NET-8412] Fix APIGW policy creation ordering for upgrade path (#3918) * fix policy creation for upgrading * Added changelog * Add post-release changelogs (#3867) Add changelogs * GH-3406 - Only error for config entries from different datacenters when the config entries are different (#3873) * GH-3406 - Only error for config entries from different datacenters when the config entries are different * add changelog * fixing tests and logic * refactoring code to make tests pass and also use a switch statement for readability and also get rid of intermediate state flag of requireMigration in a long iterative section of code. * add missing license file (#3921) * add missing license file * missed copying the license file to workdir * make up missing value and remove redundant directory creation * [COMPLIANCE] Add Copyright and License Headers (#3936) Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com> * Net 9069/xw add license file to all bin (#3942) * debug: missing LICENSE * use abs path * [NET-6466] Remove secrets from termgw role (#3928) * remove unnecessary permissions for terminating gateways * add changelog * Net 9069/fix local brokerage (#3948) * make copy of license file into control plane * remove redundant copy in gh workflow * use env instead of arg * [NET-8091] Use file-system-certificate in Consul instead of inline-certificate (#3767) * Use file-system-certificate in Consul instead of inline-certificate * Actually update correctly from merges * Adds changelog * Updates go.mod in acceptance tests with latest consul api, updates the acceptance gateway lifecycle test * Small updates * Update comment --------- Co-authored-by: Melisa Griffin <melisa.griffin@hashicorp.com> * chore: remove workstream from JIRA sync (#3960) * NET-9154: Update Kubernetes version (#3958) Update Kubernetes version * chore: 
fix JIRA workflow (#3965) * [NET-9097, NET-8174] Upgrade controller-runtime (#3935) * Consume controller-runtime v0.16.3 This is the version required by gateway-api v1.0.0, which will be consumed in a future PR * Reconcile breaking changes in controller-runtime * Fix linter errors * gofmt * Update controller tests to handle new fake client requirements * Update test assertion to handle changes in controller-runtime * Restore incorrectly-removed flags * Use a proper delete on the fake client since DeletionTimestamp is immutable * Update enterprise tests to specify status subresources * Update controller-runtime dependency for acceptance tests * Explicitly inject decoder into webhooks * Appease the linter * Use SetupWithManager pattern from controllers for webhook setup * Consume consistent version of k8s.io/client-go everywhere * Upgrade related dependencies for CLI, including helm/v3 * Consume latest release of helm/v3 * changelog * Inline function calls for testing * Consume controller-runtime v0.16.5 --------- Co-authored-by: Ronald Ekambi <ronekambi@gmail.com> * Fix a panic in connect-inject when the provided upstreams list is malformed (#3956) * Check if an upstream is malformed, if so ignore it. 
* support multiple upstreams separator (<space>, <comma>) add tests * add /n as a separator * add changelog * added log when upstream is skipped * [NET-9152] CRD for service registeration (#3943) * service is registering * add all the fields * health checks working * handle finalizers to clean up * Add status to registration CRD * Added initial unit test for reconcile * success paths for registration and deregistration * added failure tests, moved finalizer removal logic so it occurs after service is successfully deregistered * first test for to catalog registration type * maximal registration to catalog test * test all the things * deregistration tests * update some comments and fields, re-run generators * Added changelog * linting all the things * fixing test setup for new controller runtime * Handle errors for parsing duration * Add ReadOnlyRootFilesystem to Security Context (#2909) * Add readOnlyRootFilesystem to security context (#2771) * readOnlyRootFilesystem * Add mount for /tmp * Add /tmp mountpoint * Update ingress-gateways-deployment.yaml * Update terminating-gateways-deployment.yaml * Update helm unit tests * Create 2781.txt * rename changelog file * rename changelog file * Mount /tmp to volume for snapshots * rename changelog * changelog --------- Co-authored-by: mr-miles <miles.waller@gmail.com> Co-authored-by: Paul Glass <pglass@hashicorp.com> Co-authored-by: Sarah Alsmiller <sarah.alsmiller@hashicorp.com> * activate tproxy mode even when a cluster IP is not assigned to pod (#3974) * activate tproxy mode even when a cluster IP is not assigned to pod. 
* add changelog * fix failing tests * security: Upgrade Go to 1.21.10 (#3980) * NET-9178-Consul-api-gateway-not-starting-after-restart (#3978) * don't error if role already exists on restart * changelog * lint * [NET-9153] Handle Terminating Gateway ACL Setup (#3975) * first pass at creating write policy for service and updating term gw acl role * handle deregistering, update tests for registering with acls * existing deregister tests passing * failures with term gw role not existing * clean up * reorg code * Move to own package * watch for terminating gateways * move files back, handle multiple terminating gateways * handle errors and ensure finalizer is set * Add tests for finalizers * remove unused file * fix import naming * linting * fix comment, extract constant * [NET-9201] Validating webhook for registrations (#3990) * Add validating webhook for registrations * cleaned up registration webhook setup * fix setup for webhook, updated docs * fix typo, remove debugging log, rename variables for readability * Updating GitHub action versions to the latest TSCCR approved version (#3979) * test: fix PeeringGateway acceptance (#3992) * Adds ability to set the imagePullPolicy for all Consul images (consul… (#3991) * Adds ability to set the imagePullPolicy for all Consul images (consul, consul-dataplane, consul-k8s, consul-telemetry-collector) * [NET-9155] Cache resources for Registrations (#3993) * Add set for adding and removing services * remove service add * first pass at populating cache * cache is working, need to fix how statuses are handled * move to new directory, fix up the status conditions (still todos on this), handle results * updated tests * unexport methods that don't need to be exported * handle consul deregistrations * clean up before code review * show ACLUpdate as false if consul deregistered service * fix issue with updating acl status on consul deregistration * fix linting errors * FLAKEY_TEST: Add retry to outbound request for 
ProxyLifecycleShutdownTest * increase retry count for TestAPIGateway_GatewayClassConfig test * backport of commit b7ecab4 * backport of commit 2fcccd2 --------- Signed-off-by: Ashwin Venkatesh <ashwin.what@gmail.com> Co-authored-by: John Maguire <john.maguire@hashicorp.com> Co-authored-by: Michael Wilkerson <62034708+wilkermichael@users.noreply.github.com> Co-authored-by: sarahalsmiller <100602640+sarahalsmiller@users.noreply.github.com> Co-authored-by: hashicorp-copywrite[bot] <110428419+hashicorp-copywrite[bot]@users.noreply.github.com> Co-authored-by: Anita Akaeze <anita.akaeze@hashicorp.com> Co-authored-by: Nathan Coleman <nathan.coleman@hashicorp.com> Co-authored-by: Luke Kysow <1034429+lkysow@users.noreply.github.com> Co-authored-by: Ashwin Venkatesh <ashwin.what@gmail.com> Co-authored-by: Melisa Griffin <missylbytes@users.noreply.github.com> Co-authored-by: Chris S. Kim <ckim@hashicorp.com> Co-authored-by: skpratt <sarah.pratt@hashicorp.com> Co-authored-by: David Yu <dyu@hashicorp.com> Co-authored-by: Semir Patel <semir.patel@hashicorp.com> Co-authored-by: natemollica-dev <57850649+natemollica-nm@users.noreply.github.com> Co-authored-by: Michael Zalimeni <michael.zalimeni@hashicorp.com> Co-authored-by: Daniel Kimsey <90741+dekimsey@users.noreply.github.com> Co-authored-by: Curt Bushko <cbushko@gmail.com> Co-authored-by: Jeff Boruszak <104028618+boruszak@users.noreply.github.com> Co-authored-by: NicoletaPopoviciu <87660255+NicoletaPopoviciu@users.noreply.github.com> Co-authored-by: Dan Stough <dan.stough@hashicorp.com> Co-authored-by: Ashwin Venkatesh <ashwin@hashicorp.com> Co-authored-by: Isaac Wilson <10012479+jukie@users.noreply.github.com> Co-authored-by: John Murret <john.murret@hashicorp.com> Co-authored-by: Tim Gross <tgross@hashicorp.com> Co-authored-by: Nitya Dhanushkodi <nitya@hashicorp.com> Co-authored-by: Andrew Stucki <andrew.stucki@hashicorp.com> Co-authored-by: Alvin Huang <17609145+alvin-huang@users.noreply.github.com> Co-authored-by: Andrea 
Scarpino <andrea@scarpino.dev> Co-authored-by: Deniz Onur Duzgun <59659739+dduzgun-security@users.noreply.github.com> Co-authored-by: wangxinyi7 <xinyi.wang@hashicorp.com> Co-authored-by: Melisa Griffin <melisa.griffin@hashicorp.com> Co-authored-by: Ronald Ekambi <ronekambi@gmail.com> Co-authored-by: Dhia Ayachi <dhia@hashicorp.com> Co-authored-by: mr-miles <miles.waller@gmail.com> Co-authored-by: Paul Glass <pglass@hashicorp.com> Co-authored-by: Sarah Alsmiller <sarah.alsmiller@hashicorp.com>
Changes proposed in this PR
enable_debug
telemetry.disable_hostname
telemetry.enable_host_metrics
telemetry.prefix_filter
telemetry.dogstatsd_addr
telemetry.dogstatsd_tags
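As a rough illustration of how the options listed above fit together, the following sketch assembles a Consul server configuration fragment as JSON. The key names come from the list above; every value (the DogStatsD address, the tag, and the metric prefixes) and the helper name `build_server_config` are assumptions chosen for the example, not values taken from this PR.

```python
import json


def build_server_config(dogstatsd_addr, tags, allowed_prefixes, blocked_prefixes):
    """Return a Consul agent config dict using the telemetry options named above.

    All argument values are illustrative assumptions, not defaults from the chart.
    """
    # prefix_filter entries start with "+" to allow a metric prefix or "-" to block it.
    prefix_filter = [f"+{p}" for p in allowed_prefixes] + [f"-{p}" for p in blocked_prefixes]
    return {
        "enable_debug": True,
        "telemetry": {
            "disable_hostname": True,
            "enable_host_metrics": True,
            "prefix_filter": prefix_filter,
            "dogstatsd_addr": dogstatsd_addr,
            "dogstatsd_tags": list(tags),
        },
    }


config = build_server_config(
    dogstatsd_addr="127.0.0.1:8125",  # assumed local DogStatsD listener
    tags=["env:dev"],                 # assumed tag
    allowed_prefixes=["consul.raft"],
    blocked_prefixes=["consul.http"],
)
print(json.dumps(config, indent=2))
```

The printed JSON is only a sketch of the resulting agent configuration; how the chart actually renders these values is defined by the templates in this PR.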
/v1/agent/metrics?format=prometheus endpoint
/v1/agent/metrics?format=prometheus
/v1/agent/self
/v1/status/leader
/v1/status/peers
/v1/catalog/services
/v1/health/service
/v1/health/state/any
/v1/coordinate/datacenters
/v1/coordinate/nodes
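The endpoints listed above can be checked by hand before pointing a Datadog Agent at them. The sketch below builds one HTTP request per endpoint and, optionally, sends it. The base address `http://127.0.0.1:8500`, the helper names `build_requests` and `probe`, and the service name `consul` appended to `/v1/health/service` (which takes a service name in its path) are assumptions made for the example.

```python
import urllib.request

# Endpoints taken from the list above; "/consul" is an assumed service name
# appended because /v1/health/service expects one in the path.
ENDPOINTS = [
    "/v1/agent/metrics?format=prometheus",
    "/v1/agent/self",
    "/v1/status/leader",
    "/v1/status/peers",
    "/v1/catalog/services",
    "/v1/health/service/consul",
    "/v1/health/state/any",
    "/v1/coordinate/datacenters",
    "/v1/coordinate/nodes",
]


def build_requests(base_url, token=None):
    """Build one GET request per endpoint, adding an ACL token header if given."""
    headers = {"X-Consul-Token": token} if token else {}
    base = base_url.rstrip("/")
    return [urllib.request.Request(base + path, headers=headers) for path in ENDPOINTS]


def probe(base_url, token=None, timeout=5.0):
    """Send each request and map its URL to an HTTP status or an error string."""
    results = {}
    for req in build_requests(base_url, token):
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                results[req.full_url] = resp.status
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            results[req.full_url] = str(exc)
    return results
```

Calling `probe("http://127.0.0.1:8500", token="<acl-token>")` against a port-forwarded server would show which endpoints the token can read. With ACLs enabled, an endpoint that reports a 403 error points at a permission missing from the token's policy.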
server-acl-init token creation for the OpenMetrics and Datadog Consul integration check methods, generating a default token with minimal ACL permissions for the Datadog Agent where needed.
How I've tested this PR
CONTRIBUTING.md steps.
consul-dev (main) and consul-k8s-control-plane-dev (datadog-integration branch) images on k3d test cluster for each scenario. Test repository here.
CONTRIBUTING.md steps.
bats ./charts/consul/test/unit --jobs 8 - ran successfully for all tests.
How I expect reviewers to test this PR
Checklist