
unable to pull images from private registry inside vcluster #33

Closed
eranbibi opened this issue Aug 13, 2020 · 7 comments
Assignees
Labels
kind/bug Something isn't working

Comments

@eranbibi

eranbibi commented Aug 13, 2020

Hi

I am trying to deploy an application whose image is hosted in a private ACR registry, but the deployment fails because of an image pull issue.

I set up the Kubernetes secret as follows:
kubectl -n aqua create secret docker-registry registry-creds --docker-server=#####.azurecr.io --docker-username=##### --docker-password=##### --docker-email=#####

and the application's deployment is configured with
imagePullSecrets:
- name: registry-creds

and with a service account that uses the same image pull secret.
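For reference, a minimal sketch of how these pieces fit together, using the names that appear in this thread (aqua-sa, aqua-db) and a placeholder for the redacted registry host; this is illustrative only, not the exact manifests:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: aqua-sa                  # service account name mentioned later in this thread
  namespace: aqua
imagePullSecrets:
- name: registry-creds           # the docker-registry secret created above
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aqua-db
  namespace: aqua
spec:
  replicas: 1
  selector:
    matchLabels:
      app: aqua-db
  template:
    metadata:
      labels:
        app: aqua-db
    spec:
      serviceAccountName: aqua-sa
      imagePullSecrets:
      - name: registry-creds     # same pull secret referenced directly on the pod spec
      containers:
      - name: database
        image: REGISTRY.azurecr.io/database:5.0.0   # REGISTRY stands in for the redacted ACR name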

The deployment failed with:
aqua-db-c988798b4-j8vsn 0/1 ImagePullBackOff 0 15s

Events from kubectl describe pod:

Type Reason Age From Message


Normal Scheduled 13m default-scheduler Successfully assigned aqua/aqua-db-c988798b4-j8vsn to gke-gke8010-default-pool-9eb8eb33-pnbd
Warning SyncError 13m pod-syncer Error updating pod: Operation cannot be fulfilled on pods "aqua-db-c988798b4-j8vsn": the object has been modified; please apply your changes to the latest version and try again
Normal BackOff 11m (x6 over 13m) kubelet, gke-gke8010-default-pool-9eb8eb33-pnbd Back-off pulling image "#####.azurecr.io/database:5.0.0"
Normal Pulling 11m (x4 over 13m) kubelet, gke-gke8010-default-pool-9eb8eb33-pnbd Pulling image "#####.azurecr.io/database:5.0.0"
Warning Failed 11m (x4 over 13m) kubelet, gke-gke8010-default-pool-9eb8eb33-pnbd Error: ErrImagePull
Warning Failed 11m (x4 over 13m) kubelet, gke-gke8010-default-pool-9eb8eb33-pnbd Failed to pull image "#####.azurecr.io/database:5.0.0": rpc error: code = Unknown desc = Error response from daemon: Get https://*****.azurecr.io/v2/database/manifests/5.0.0: unauthorized: authentication required, visit https://aka.ms/acr/authorization for more information.
Warning Failed 2m55s (x43 over 13m) kubelet, gke-gke8010-default-pool-9eb8eb33-pnbd Error: ImagePullBackOff

The above works well on any “regular” cluster.

@FabianKramm
Member

FabianKramm commented Aug 14, 2020

@eranbibi thanks for reporting this issue! That's actually a bug in the virtual cluster: the imagePullSecrets names are not translated into the real existing physical ones, which results in the error message because the secret couldn't be found in the host cluster. I'll fix this and make a new release.
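To illustrate what the fix needs to do (a sketch only; the exact translated name format is an assumption based on the secret name that shows up later in this thread):

# imagePullSecrets as specified on the pod in the virtual cluster (namespace aqua):
imagePullSecrets:
- name: registry-creds

# imagePullSecrets as they should appear on the synced pod in the host namespace,
# rewritten to the physical secret name (hypothetical, following the
# NAME-x-NAMESPACE-x-VCLUSTER pattern seen later in this thread):
imagePullSecrets:
- name: registry-creds-x-aqua-x-vc1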

@FabianKramm FabianKramm added the kind/bug Something isn't working label Aug 14, 2020
@FabianKramm FabianKramm self-assigned this Aug 14, 2020
@FabianKramm
Member

@eranbibi this issue should be fixed with loft version v0.3.7! The easiest way is to just create a new virtual cluster for the fix to take effect. If you want to upgrade an existing virtual cluster, you have to modify the CRD settings, which can be done in the UI through the Show Yaml button: add

chart:
  version: 0.0.1-beta.20

and press 'Update' as seen in this screenshot:

[Screenshot: Bildschirmfoto 2020-08-14 um 09 52 53, showing the chart version update via the Show Yaml editor]
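In context, the edited virtual cluster resource would look roughly like this; the apiVersion and surrounding fields are assumptions and may differ on your install, and only the chart.version addition is taken from the comment above:

apiVersion: storage.loft.sh/v1     # assumed; the thread only mentions virtualclusters.storage.loft.sh
kind: VirtualCluster
metadata:
  name: csp3-vc                    # vcluster name from the error message later in this thread
  namespace: dev-space1
spec:
  # added block pinning the fixed vcluster chart release:
  chart:
    version: 0.0.1-beta.20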

@eranbibi
Author

Hi @FabianKramm

First, thank you for the super quick response and new release.

I created a new loft environment with the latest loft v0.3.7 (deployed using your Helm chart).

I created a new space and a new vcluster, and when I deployed my app I faced the exact same Error: ErrImagePull issue.

I double-checked that I am indeed using v0.3.7, so it seems the fix is not there.

Then I tried to follow your other suggestion and added the

chart:
  version: 0.0.1-beta.20

to the vc yaml file.

But when I hit “Save” I got an error message:

Failed to save state in cluster real-cluster-gke1
Error: Operation cannot be fulfilled on virtualclusters.storage.loft.sh "csp3-vc": the object has been modified; please apply your changes to the latest version and try again (Conflict)

What do you suggest?

@FabianKramm
Member

@eranbibi Regarding the first issue, that is odd; I tested it on my install and it worked for me. Can you check, in the host cluster where the vcluster was created, how the pod yaml looks (kubectl get pods -n vcluster-NAME -o yaml) and whether the referenced secret under imagePullSecrets exists in that namespace? It would be good if you could post the yamls here.

Regarding the second issue: that usually occurs if there was an update to the CRD between your modification and pressing Update on the resource. We should probably change that to a patch instead of an update; however, this is usually solved by just refreshing the table and reapplying the update to the resource.

@FabianKramm FabianKramm reopened this Aug 15, 2020
@eranbibi
Author

eranbibi commented Aug 15, 2020

Hi @FabianKramm
I ran the following command in the host cluster context:

kubectl get pod aqua-db-7c878cfc5b-fs29c-x-aqua-x-vc1 -n dev-space1 -o yaml

See the attached vs1_db_pod.txt.

It contains:
imagePullSecrets:
- name: aqua-registry-x-aqua-x-vc1

And I confirmed the secret exists in the namespace using the following command:
kubectl get secrets -n dev-space1

Let me know what additional information I should provide; I could even give you access to my cluster if needed.

Edit: @FabianKramm I think this is related to a service account configuration that isn't propagated to the real cluster.

In the virtual cluster I am creating the deployment with an sa called "aqua-sa" (and the pull secret is set on this sa).

ubuntu@gke8040-3064:~$ kubectl get sa -n aqua
NAME SECRETS AGE
default 1 48m
aqua-sa 1 48m

On the host cluster I don't see it:

ubuntu@gke8040-3064:~$ kubectl get sa -n dev-space1
NAME SECRETS AGE
default 1 52m
vc-vc1 1 52m

Can you check my assumption?

Thanks,
Eran

@FabianKramm
Member

FabianKramm commented Aug 17, 2020

@eranbibi I investigated a little bit more, and it seems the secret type was sometimes not synced correctly, which caused your issue. I fixed it and this time it should work (loft v0.3.9); I tested it with multiple configurations.
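For anyone verifying the v0.3.9 fix, a hedged sketch of what the synced pull secret in the host namespace should look like; the translated name is hypothetical, but the type field is the standard one for docker-registry secrets and is what must be preserved for image pulls to work:

apiVersion: v1
kind: Secret
metadata:
  name: registry-creds-x-aqua-x-vc1   # hypothetical translated name in the host namespace
  namespace: dev-space1
type: kubernetes.io/dockerconfigjson  # the secret type that must survive syncing
data:
  .dockerconfigjson: <base64-encoded registry credentials, elided>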

Regarding the service accounts: we don't need to sync them, because only the service account from the virtual cluster is used, not the one from the host cluster, which is why they are not needed there. Only the service account's token secret is required, since it is mounted as a volume, and that secret is also synced to the host cluster. Pull secrets specified in the virtual service account are automatically applied to the pod configuration and are then correctly translated to the host secrets, so they will work.

@eranbibi
Author

Hi @FabianKramm

I was able to confirm that the issue was resolved. Thank you.
