Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create istio exclusion for CSI Driver in case of codeModules or public registry #3343

Closed
wants to merge 22 commits into from

Conversation

waodim
Copy link
Contributor

@waodim waodim commented Jun 20, 2024

Description

With this the CSI Driver is excluded from istio blocking traffic if:

  • codeModulesImage is set
  • public registry is enabled

VirtualService and ServiceEntry is created that points to the host where the image is stored. If this is not the case the injection into a pod would not work as everything got stuck because the CSI Driver could not be able to retrieve the image.

I placed the reconciliation of the istioReconciler right before the injection reconciliation, please give me feedback if you are fine with this.
I placed the reconciliation after the codeModules versionReconciler is finished.
Currently I am only checking if the ImageID is set in the status of the CodeModule, this should cover both cases (codeModulesImage set or public registry enabled).

Respective ticket: https://dt-rnd.atlassian.net/browse/K8S-9622

How can this be tested?

Deploy the operator with istio, do not forget to label the to be injected namespace accordingly (istio-injection:enabled)
Apply a dynakube with istio enabled and

  • with a codeModulesImage set
  • with public registry feature flag enabled

Also check for VirtualServices and ServiceEntries and look if they are created correctly.

@waodim waodim added the core Changes to core functionality of the Operator label Jun 20, 2024
@waodim waodim requested a review from a team as a code owner June 20, 2024 10:49
@waodim waodim marked this pull request as draft June 20, 2024 10:55
pkg/controllers/dynakube/controller.go Outdated Show resolved Hide resolved
pkg/controllers/dynakube/istio/reconciler.go Outdated Show resolved Hide resolved
pkg/controllers/dynakube/istio/reconciler.go Outdated Show resolved Hide resolved
@waodim waodim marked this pull request as ready for review June 20, 2024 12:26
waodim and others added 2 commits June 24, 2024 14:50
Co-authored-by: Gabriel Krenn <gabriel.krenn@gmail.com>
@codecov-commenter
Copy link

codecov-commenter commented Jun 24, 2024

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 49.41176% with 43 lines in your changes missing coverage. Please review.

Project coverage is 57.32%. Comparing base (32dbd89) to head (6bfb710).
Report is 4 commits behind head on main.

Files Patch % Lines
pkg/controllers/dynakube/istio/reconciler.go 53.24% 24 Missing and 12 partials ⚠️
...mocks/pkg/controllers/dynakube/istio/reconciler.go 0.00% 7 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3343      +/-   ##
==========================================
- Coverage   57.33%   57.32%   -0.01%     
==========================================
  Files         345      345              
  Lines       19740    19865     +125     
==========================================
+ Hits        11317    11387      +70     
- Misses       7188     7230      +42     
- Partials     1235     1248      +13     
Flag Coverage Δ
unittests 57.32% <49.41%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

pkg/controllers/dynakube/istio/reconciler_test.go Outdated Show resolved Hide resolved
pkg/controllers/dynakube/istio/reconciler_test.go Outdated Show resolved Hide resolved
pkg/controllers/dynakube/istio/reconciler.go Outdated Show resolved Hide resolved
waodim and others added 2 commits July 1, 2024 08:59
Co-authored-by: Marcell Sevcsik <31651557+0sewa0@users.noreply.github.com>
@@ -450,6 +508,55 @@ func TestReconcileActiveGateCommunicationHosts(t *testing.T) {
})
}

func TestParseCodeModulesImageURL(t *testing.T) {
Copy link
Contributor

@0sewa0 0sewa0 Jul 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a special case that was missed:

  • when the image used is in dockerhub: codeModulesImage: dynatrace/dynatrace-codemodules:1.283.139.20240209-194956

the resulting ServiceEntry:

apiVersion: networking.istio.io/v1
kind: ServiceEntry
metadata:
  annotations: ...
  labels: ...
  name: test-fqdn-csi-driver
  namespace: dynatrace
  ownerReferences: ...
spec:
  hosts:
  - dynatrace
  ports:
  - name: https-443
    number: 443
    protocol: HTTPS
  resolution: DNS

which doesn't help the CSI-driver:

{"level":"info","ts":"2024-07-01T07:55:00.825Z","logger":"oneagent-image","msg":"failed to extract agent binaries from image via proxy","image":"dynatrace/dynatrace-codemodules:1.283.139.20240209-194956","imageCacheDir":"/data/cache","err":"getting image \"dynatrace/dynatrace-codemodules:1.283.139.20240209-194956\": Get \"https://index.docker.io/v2/\": read tcp 10.108.3.147:42910->3.219.239.5:443: read: connection reset by peer","errVerbose":"Get \"https://index.docker.io/v2/\": read tcp 10.108.3.147:42910->3.219.239.5:443: read: connection reset by peer\ngetting image \"dynatrace/dynatrace-codemodules:1.283.139.20240209-194956\""}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how to figure out if the URL is incomplete because its referencing a dockerhub image?

I have no clue really 😬

How the image library does it: https://github.com/google/go-containerregistry/blob/main/pkg/name/repository.go#L80-L87

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very nice catch, did not consider it. I will have a look 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it now that way, this ONLY covers docker so I am not sure if we should go for it, pls let me know.
ea80b58

Copy link
Contributor

@0sewa0 0sewa0 Jul 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still doesn't work: 😢 (probably need index.docker.io)

{"error":"getting image \"dynatrace/dynatrace-codemodules:1.293.133.20240618-095559\": Get \"https://index.docker.io/v2/\": read tcp 10.12.1.11:42066-\u003e34.226.69.105:443: read: connection reset by peer","level":"info","logger":"oneagent-image","msg":"pullImageInfo","stacktrace":"Get \"https://index.docker.io/v2/\": read tcp 10.12.1.11:42066-\u003e34.226.69.105:443: read: connection reset by peer
provisioner getting image \"dynatrace/dynatrace-codemodules:1.293.133.20240618-095559\"","ts":"2024-07-03T12:20:54.687Z"}
provisioner {"level":"info","ts":"2024-07-03T12:20:54.687Z","logger":"oneagent-image","msg":"failed to extract agent binaries from image via proxy","image":"dynatrace/dynatrace-codemodules:1.293.133.20240618-095559","imageCacheDir":"/data/cache","err":"getting image \"dynatrace/dynatrace-codemodules:1.293.133.20240618-095559\": Get \"https://index.docker.io/v2/\": read tcp 10.12.1.11:42066->34.226.69.105:443: read: connection reset by peer","errVerbose":"Get \"https://index.docker.io/v2/\": read tcp 10.12.1.11:42066->34.226.69.105:443: read: connection reset by peer\ngetting image \"dynatrace/dynatrace-codemodules:1.293.133.20240618-095559\""}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

provisioner {"level":"info","ts":"2024-07-03T12:46:47.397Z","logger":"oneagent-image","msg":"installing agent from image"}                                                                                         │
│ provisioner {"level":"info","ts":"2024-07-03T12:46:47.397Z","logger":"oneagent-image","msg":"installing agent","target dir":"/data/codemodules/ZHluYXRyYWNlL2R5bmF0cmFjZS1jb2RlbW9kdWxlczoxLjI5My4xMzMuMjAyNDA2MTg │
│ provisioner {"level":"info","ts":"2024-07-03T12:46:47.885Z","logger":"oneagent-image","msg":"pullOciImage","ref_identifier":"1.293.133.20240618-095559","ref.Name":"index.docker.io/dynatrace/dynatrace-codemodule │
│ provisioner {"level":"info","ts":"2024-07-03T12:46:57.345Z","logger":"oneagent-image","msg":"unpackOciImage","sourcePath":"/data/cache/ZHluYXRyYWNlL2R5bmF0cmFjZS1jb2RlbW9kdWxlczoxLjI5My4xMzMuMjAyNDA2MTgtMDk1NTU │
│ provisioner {"level":"info","ts":"2024-07-03T12:46:57.345Z","logger":"oneagent-zip","msg":"extracting tar gzip","source":"/data/cache/ZHluYXRyYWNlL2R5bmF0cmFjZS1jb2RlbW9kdWxlczoxLjI5My4xMzMuMjAyNDA2MTgtMDk1NTU5 │
│ provisioner {"level":"info","ts":"2024-07-03T12:47:07.814Z","logger":"oneagent-zip","msg":"moving unpacked archive to target","targetDir":"/data/codemodules/ZHluYXRyYWNlL2R5bmF0cmFjZS1jb2RlbW9kdWxlczoxLjI5My4xM │
│ provisioner {"level":"info","ts":"2024-07-03T12:47:07.815Z","logger":"oneagent-image","msg":"unpackOciImage","targetDir":"/data/codemodules/ZHluYXRyYWNlL2R5bmF0cmFjZS1jb2RlbW9kdWxlczoxLjI5My4xMzMuMjAyNDA2MTgtMD │
│ provisioner {"level":"info","ts":"2024-07-03T12:47:07.897Z","logger":"oneagent-symlink","msg":"found version","version":"1.293.133.20240618-095559"}                                       

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Is your csi-driver injected by istio?
  2. Is your istio configured to be in restricted mode?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After configuring my istio as you suggested:

istioctl install --set meshConfig.outboundTrafficPolicy.mode=REGISTRY_ONLY

I can confirm that the csi driver is injected with the istio containers.

dynatrace-oneagent-csi-driver-5htgs 5/5 Running 0 5m55s

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are going to love this (IMO, I don't even want to do anything like this, because this shows how unpredictable this whole scenario is)

It still does not work:

provisioner {"error":"getting image \"dynatrace/dynatrace-codemodules:1.293.133.20240618-095559\": Get \"https://auth.docker.io/token?scope=repository%3Adynatrace%2Fdynatrace-codemodules%3Apull\u0026service=registry.docker.io\": read tcp 10.12.3.29:43420-\u003e54.196.99.49:443: read: connection reset by peer","level":"info","logger":"oneagent-image","msg":"pullImageInfo","stacktrace":"Get \"https://auth.docker.io/token?scope=repository%3Adynatrace%2Fdynatrace-codemodules%3Apull\u0026service=registry.docker.io\": read tcp 10.12.3.29:43420-\u003e54.196.99.49:443: read: connection reset by peer"}

so just add auth.docker.io, right?
nope, then you will get:

provisioner {"level":"info","ts":"2024-07-04T12:23:46.322Z","logger":"oneagent-image","msg":"saving v1.Image img as an OCI Image Layout at path","/data/cache":"Get \"https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/64/644c3f0771f461096ccb2c5ce4132447032b833b466cfd845c1a89fd4e5eb762/data?verify=1720098825-yv65IR1rXZ98%2FyHhfEAM%2FMx6cJY%3D\": read tcp 10.12.3.29:55420->104.16.99.215:443: read: connection reset by peer"}

so you have to add production.cloudflare.docker.com as well

Sidenote:

  • using a wildcard like *.docker.io or *.docker.com is not that simple (you have to mess with the DNS within your application if I understand it correctly)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in d23af95 although i do not like this solution at all tbh. Seems to constructed for this case...

@waodim waodim requested a review from 0sewa0 July 4, 2024 07:22
@waodim waodim closed this Jul 15, 2024
@waodim
Copy link
Contributor Author

waodim commented Jul 15, 2024

closed this PR for now, as the way it is done here is way too hacky. We will revisit this and rework.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Changes to core functionality of the Operator
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants