Create istio exclusion for CSI Driver in case of codeModules or public registry #3343
…natrace-operator into feature/csi-istio-exclusion
Co-authored-by: Gabriel Krenn <gabriel.krenn@gmail.com>
Codecov Report
Attention: Patch coverage is

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3343      +/-   ##
==========================================
- Coverage   57.33%   57.32%   -0.01%
==========================================
  Files         345      345
  Lines       19740    19865     +125
==========================================
+ Hits        11317    11387      +70
- Misses       7188     7230      +42
- Partials     1235     1248      +13
Co-authored-by: Marcell Sevcsik <31651557+0sewa0@users.noreply.github.com>
@@ -450,6 +508,55 @@ func TestReconcileActiveGateCommunicationHosts(t *testing.T) {
	})
}

func TestParseCodeModulesImageURL(t *testing.T) {
There is a special case that was missed:
- when the image used is in dockerhub:
  codeModulesImage: dynatrace/dynatrace-codemodules:1.283.139.20240209-194956
the resulting ServiceEntry:
apiVersion: networking.istio.io/v1
kind: ServiceEntry
metadata:
  annotations: ...
  labels: ...
  name: test-fqdn-csi-driver
  namespace: dynatrace
  ownerReferences: ...
spec:
  hosts:
  - dynatrace
  ports:
  - name: https-443
    number: 443
    protocol: HTTPS
  resolution: DNS
which doesn't help the CSI-driver:
{"level":"info","ts":"2024-07-01T07:55:00.825Z","logger":"oneagent-image","msg":"failed to extract agent binaries from image via proxy","image":"dynatrace/dynatrace-codemodules:1.283.139.20240209-194956","imageCacheDir":"/data/cache","err":"getting image \"dynatrace/dynatrace-codemodules:1.283.139.20240209-194956\": Get \"https://index.docker.io/v2/\": read tcp 10.108.3.147:42910->3.219.239.5:443: read: connection reset by peer","errVerbose":"Get \"https://index.docker.io/v2/\": read tcp 10.108.3.147:42910->3.219.239.5:443: read: connection reset by peer\ngetting image \"dynatrace/dynatrace-codemodules:1.283.139.20240209-194956\""}
How do we figure out whether the URL is incomplete because it's referencing a dockerhub image?
I have no clue really 😬
How the image library does it: https://github.com/google/go-containerregistry/blob/main/pkg/name/repository.go#L80-L87
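For illustration, here is a minimal stdlib-only sketch of that heuristic: the first path segment counts as a registry only if it contains a `.` or a `:`, otherwise the reference is treated as a Docker Hub image and the host falls back to `index.docker.io` (the host that shows up in the CSI driver error above). The helper name `registryHost` is hypothetical and not part of the operator or of go-containerregistry. Dropping the port is an assumption made here because the ServiceEntry lists the port separately.

```go
package main

import (
	"net"
	"strings"
)

// defaultRegistry is the host used when a reference names no registry.
// It matches the host seen in the CSI driver error log.
const defaultRegistry = "index.docker.io"

// registryHost is a hypothetical helper that returns the registry host of an
// image reference such as "dynatrace/dynatrace-codemodules:1.2.3".
func registryHost(image string) string {
	parts := strings.SplitN(image, "/", 2)

	// The first segment is a registry only if it looks like a host,
	// i.e. it contains a '.' or a ':'. Otherwise it is part of the
	// repository name and the image lives on Docker Hub.
	if len(parts) == 2 && strings.ContainsAny(parts[0], ".:") {
		host := parts[0]
		// Drop an explicit port, if present (assumption: the caller
		// configures the port separately).
		if h, _, err := net.SplitHostPort(host); err == nil {
			return h
		}
		return host
	}

	return defaultRegistry
}
```

With this, `registryHost("dynatrace/dynatrace-codemodules:1.283.139.20240209-194956")` yields `index.docker.io` instead of `dynatrace`. Note that it only resolves the registry endpoint, not any auth or CDN hosts the registry may redirect to.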
very nice catch, did not consider it. I will have a look 👍
I did it now that way, this ONLY covers docker so I am not sure if we should go for it, pls let me know.
ea80b58
still doesn't work: 😢 (probably need index.docker.io)
{"error":"getting image \"dynatrace/dynatrace-codemodules:1.293.133.20240618-095559\": Get \"https://index.docker.io/v2/\": read tcp 10.12.1.11:42066-\u003e34.226.69.105:443: read: connection reset by peer","level":"info","logger":"oneagent-image","msg":"pullImageInfo","stacktrace":"Get \"https://index.docker.io/v2/\": read tcp 10.12.1.11:42066-\u003e34.226.69.105:443: read: connection reset by peer
provisioner getting image \"dynatrace/dynatrace-codemodules:1.293.133.20240618-095559\"","ts":"2024-07-03T12:20:54.687Z"}
provisioner {"level":"info","ts":"2024-07-03T12:20:54.687Z","logger":"oneagent-image","msg":"failed to extract agent binaries from image via proxy","image":"dynatrace/dynatrace-codemodules:1.293.133.20240618-095559","imageCacheDir":"/data/cache","err":"getting image \"dynatrace/dynatrace-codemodules:1.293.133.20240618-095559\": Get \"https://index.docker.io/v2/\": read tcp 10.12.1.11:42066->34.226.69.105:443: read: connection reset by peer","errVerbose":"Get \"https://index.docker.io/v2/\": read tcp 10.12.1.11:42066->34.226.69.105:443: read: connection reset by peer\ngetting image \"dynatrace/dynatrace-codemodules:1.293.133.20240618-095559\""}
provisioner {"level":"info","ts":"2024-07-03T12:46:47.397Z","logger":"oneagent-image","msg":"installing agent from image"}
provisioner {"level":"info","ts":"2024-07-03T12:46:47.397Z","logger":"oneagent-image","msg":"installing agent","target dir":"/data/codemodules/ZHluYXRyYWNlL2R5bmF0cmFjZS1jb2RlbW9kdWxlczoxLjI5My4xMzMuMjAyNDA2MTg
provisioner {"level":"info","ts":"2024-07-03T12:46:47.885Z","logger":"oneagent-image","msg":"pullOciImage","ref_identifier":"1.293.133.20240618-095559","ref.Name":"index.docker.io/dynatrace/dynatrace-codemodule
provisioner {"level":"info","ts":"2024-07-03T12:46:57.345Z","logger":"oneagent-image","msg":"unpackOciImage","sourcePath":"/data/cache/ZHluYXRyYWNlL2R5bmF0cmFjZS1jb2RlbW9kdWxlczoxLjI5My4xMzMuMjAyNDA2MTgtMDk1NTU
provisioner {"level":"info","ts":"2024-07-03T12:46:57.345Z","logger":"oneagent-zip","msg":"extracting tar gzip","source":"/data/cache/ZHluYXRyYWNlL2R5bmF0cmFjZS1jb2RlbW9kdWxlczoxLjI5My4xMzMuMjAyNDA2MTgtMDk1NTU5
provisioner {"level":"info","ts":"2024-07-03T12:47:07.814Z","logger":"oneagent-zip","msg":"moving unpacked archive to target","targetDir":"/data/codemodules/ZHluYXRyYWNlL2R5bmF0cmFjZS1jb2RlbW9kdWxlczoxLjI5My4xM
provisioner {"level":"info","ts":"2024-07-03T12:47:07.815Z","logger":"oneagent-image","msg":"unpackOciImage","targetDir":"/data/codemodules/ZHluYXRyYWNlL2R5bmF0cmFjZS1jb2RlbW9kdWxlczoxLjI5My4xMzMuMjAyNDA2MTgtMD
provisioner {"level":"info","ts":"2024-07-03T12:47:07.897Z","logger":"oneagent-symlink","msg":"found version","version":"1.293.133.20240618-095559"}
- Is your csi-driver injected by istio?
- Is your istio configured to be in restricted mode?
After configuring my istio as you suggested:
istioctl install --set meshConfig.outboundTrafficPolicy.mode=REGISTRY_ONLY
I can confirm that the csi driver is injected with the istio containers.
dynatrace-oneagent-csi-driver-5htgs 5/5 Running 0 5m55s
you are going to love this (IMO, I don't even want to do anything like this, because this shows how unpredictable this whole scenario is)
It still does not work:
provisioner {"error":"getting image \"dynatrace/dynatrace-codemodules:1.293.133.20240618-095559\": Get \"https://auth.docker.io/token?scope=repository%3Adynatrace%2Fdynatrace-codemodules%3Apull\u0026service=registry.docker.io\": read tcp 10.12.3.29:43420-\u003e54.196.99.49:443: read: connection reset by peer","level":"info","logger":"oneagent-image","msg":"pullImageInfo","stacktrace":"Get \"https://auth.docker.io/token?scope=repository%3Adynatrace%2Fdynatrace-codemodules%3Apull\u0026service=registry.docker.io\": read tcp 10.12.3.29:43420-\u003e54.196.99.49:443: read: connection reset by peer"}
so just add auth.docker.io, right?
nope, then you will get:
provisioner {"level":"info","ts":"2024-07-04T12:23:46.322Z","logger":"oneagent-image","msg":"saving v1.Image img as an OCI Image Layout at path","/data/cache":"Get \"https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/64/644c3f0771f461096ccb2c5ce4132447032b833b466cfd845c1a89fd4e5eb762/data?verify=1720098825-yv65IR1rXZ98%2FyHhfEAM%2FMx6cJY%3D\": read tcp 10.12.3.29:55420->104.16.99.215:443: read: connection reset by peer"}
so you have to add production.cloudflare.docker.com as well
Sidenote:
- using a wildcard like *.docker.io or *.docker.com is not that simple (you have to mess with the DNS within your application if I understand it correctly)
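To make the consequence concrete, here is a rough sketch of what "cover Docker Hub" ends up meaning: one registry host fans out into the three hosts that appeared in the CSI driver logs in this thread. The function name `egressHosts` is hypothetical, and the host list is only what was observed here. Docker can change its auth or CDN endpoints at any time, which is exactly the unpredictability mentioned above.

```go
package main

// dockerHubHosts lists the hosts the CSI driver contacted while pulling a
// Docker Hub image, as observed in the logs in this thread. This is an
// observation, not a guaranteed or complete list.
var dockerHubHosts = []string{
	"index.docker.io",                  // registry API endpoint
	"auth.docker.io",                   // token endpoint
	"production.cloudflare.docker.com", // blob download (CDN)
}

// egressHosts is a hypothetical helper that returns the hosts a ServiceEntry
// would have to allow for the given registry host.
func egressHosts(registry string) []string {
	if registry == "index.docker.io" {
		// Return a copy so callers cannot modify the package-level list.
		hosts := make([]string, len(dockerHubHosts))
		copy(hosts, dockerHubHosts)
		return hosts
	}
	// Assumption: any other registry serves auth and blobs from the same host,
	// which does not hold for every registry.
	return []string{registry}
}
```

Any registry that redirects to a separate auth or blob host would need the same kind of special-casing, so the list approach does not generalize.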
done in d23af95, although I do not like this solution at all tbh. Seems too constructed for this case...
Closed this PR for now, as the way it is done here is way too hacky. We will revisit this and rework.
Description
With this the CSI Driver is excluded from istio blocking traffic if:
- codeModulesImage is set
- public registry is enabled

A VirtualService and a ServiceEntry are created that point to the host where the image is stored. If this is not the case, the injection into a pod would not work: everything got stuck because the CSI Driver was not able to retrieve the image.
I placed the reconciliation of the istioReconciler right before the injection reconciliation, please give me feedback if you are fine with this.
I placed the reconciliation after the codeModules versionReconciler is finished.
Currently I am only checking if the ImageID is set in the status of the CodeModule, this should cover both cases (codeModulesImage set or public registry enabled).
Respective ticket: https://dt-rnd.atlassian.net/browse/K8S-9622
How can this be tested?
- Deploy the operator with istio, do not forget to label the to be injected namespace accordingly (istio-injection:enabled)
- Apply a dynakube with istio enabled and
  - codeModulesImage set
  - public registry feature flag enabled
- Also check for VirtualServices and ServiceEntries and look if they are created correctly.