
ggcr: Multiple k8s pull secrets for same registry doesn't work #1431

Open
dprotaso opened this issue Aug 15, 2022 · 6 comments
Labels
bug (Something isn't working), lifecycle/frozen

Comments

@dprotaso (Contributor)

Describe the bug

See: knative/serving#13126

To Reproduce

See the linked issue.

@dprotaso added the bug label on Aug 15, 2022
@dprotaso (Contributor, Author)

cc @jwcesign

@imjasonh (Collaborator)

#1280 was supposed to handle the case where auth is correctly configured for registry.com/foo and registry.com/bar and possibly also just registry.com -- if your request is for /foo, you should get creds for foo, i.e., as specific as possible. If your request is for /baz you get creds for registry.com. This assumes that auth is configured correctly in all cases, and we should just be as specific as possible in selecting among those valid creds.
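For illustration, here is a minimal sketch of that "most specific match wins" selection. `pickAuth` is a hypothetical helper, not ggcr's actual code; the map keys stand in for registry or registry/repo auth entries:

```go
package main

import (
	"fmt"
	"strings"
)

// pickAuth returns the credential whose key is the longest
// registry/repo prefix of the requested repository.
func pickAuth(auths map[string]string, repo string) (string, bool) {
	best, bestLen := "", -1
	for key, cred := range auths {
		// Match on path boundaries: "registry.com/foo" matches
		// "registry.com/foo/app" but not "registry.com/foobar".
		if repo == key || strings.HasPrefix(repo, key+"/") {
			if len(key) > bestLen {
				best, bestLen = cred, len(key)
			}
		}
	}
	return best, bestLen >= 0
}

func main() {
	auths := map[string]string{
		"registry.com":     "creds-root",
		"registry.com/foo": "creds-foo",
		"registry.com/bar": "creds-bar",
	}
	fmt.Println(pickAuth(auths, "registry.com/foo/app")) // creds-foo true
	fmt.Println(pickAuth(auths, "registry.com/baz"))     // creds-root true
}
```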

If there's a bug in that logic though, we should fix it -- the kaniko issue that inspired that change (GoogleContainerTools/kaniko#687) was reopened because the behavior persisted.

This doesn't account for the case where there are two creds configured both for the same registry(+repo), but one is invalid and the other works. It sounds like kubelet will collect all matching auths and try them in a loop, and ggcr (currently) doesn't -- it will choose one at ~random and use it, even if it fails and the other would have worked.

It will be hard, but not impossible, for pkg/v1/remote to handle this fallback scenario, and try creds for /foo, then fall back to creds for / if it gets a 403 response. That should match kubelet's behavior linked from the Knative issue above.
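For instance, a minimal sketch of such a fallback loop using plain ggcr APIs. `headWithFallback` and the hard-coded credentials are hypothetical, not anything ggcr currently does:

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"net/http"

	"github.com/google/go-containerregistry/pkg/authn"
	"github.com/google/go-containerregistry/pkg/name"
	v1 "github.com/google/go-containerregistry/pkg/v1"
	"github.com/google/go-containerregistry/pkg/v1/remote"
	"github.com/google/go-containerregistry/pkg/v1/remote/transport"
)

// headWithFallback tries each authenticator in order and treats 401/403
// as "try the next credential", approximating kubelet's loop over
// matching pull secrets.
func headWithFallback(ref name.Reference, auths []authn.Authenticator) (*v1.Descriptor, error) {
	var lastErr error
	for _, a := range auths {
		desc, err := remote.Head(ref, remote.WithAuth(a))
		if err == nil {
			return desc, nil
		}
		var terr *transport.Error
		if errors.As(err, &terr) &&
			(terr.StatusCode == http.StatusUnauthorized || terr.StatusCode == http.StatusForbidden) {
			lastErr = err
			continue // this credential was rejected; try the next one
		}
		return nil, err // non-auth error: no point retrying other creds
	}
	return nil, fmt.Errorf("all credentials were rejected: %w", lastErr)
}

func main() {
	ref, err := name.ParseReference("cesign/secret-test:latest")
	if err != nil {
		log.Fatal(err)
	}
	auths := []authn.Authenticator{ // placeholder credentials
		authn.FromConfig(authn.AuthConfig{Username: "jwcesign", Password: "xxx"}),
		authn.FromConfig(authn.AuthConfig{Username: "cesign", Password: "xxx"}),
	}
	desc, err := headWithFallback(ref, auths)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(desc.Digest)
}
```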

@jwcesign commented Aug 15, 2022

I tested as follows. First, create the secrets:

root@cesign [12:38:20 AM] [+47.0°C] [~/git/group-service-acceleration] [main *]
-> # kubectl create secret docker-registry jwcesign \
--docker-server=https://index.docker.io/v1/ \
--docker-username=jwcesign \
--docker-password=xxx \
--docker-email=jwcesign@163.com

root@cesign [12:40:14 AM] [+45.0°C] [~/git/group-service-acceleration] [main *]
-> # kubectl create secret docker-registry cesign \
--docker-server=https://index.docker.io/v1/ \
--docker-username=cesign \
--docker-password=xxx \
--docker-email=jwcesign@gmail.com

Then create the Service:

root@cesign [12:38:59 AM] [+47.0°C] [~/git/group-service-acceleration/config] [main *]
-> # cat group-service-c.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: group-service-c
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"
        autoscaling.knative.dev/maxScale: "1"
    spec:
      containerConcurrency: 1
      timeoutSeconds: 30
      imagePullSecrets:
      - name: jwcesign
      - name: cesign
      containers:
        - image: cesign/secret-test:latest
          env:
            - name: NOW_SERVICE
              value: "group-service-c"
            - name: NEXT_SERVICE
              value: ""

Resolving will fail with a 401, but with registries-skipping-tag-resolving it works. So from this testing, I think the fix is just to loop over all matching creds when resolving; the sketch below shows how the keychain currently yields only a single credential to try.
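For context, a minimal sketch, assuming the two secrets created above exist in the default namespace, of how a ggcr keychain built by k8schain is consulted. The point is that Resolve returns a single Authenticator for the repository, so there is no list of candidates to retry on 401/403; the in-cluster wiring here is illustrative:

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/google/go-containerregistry/pkg/authn/k8schain"
	"github.com/google/go-containerregistry/pkg/name"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	ctx := context.Background()
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}
	// Build a keychain from both pull secrets, like the Service above.
	kc, err := k8schain.New(ctx, client, k8schain.Options{
		Namespace:        "default",
		ImagePullSecrets: []string{"jwcesign", "cesign"},
	})
	if err != nil {
		log.Fatal(err)
	}
	ref, err := name.ParseReference("cesign/secret-test:latest")
	if err != nil {
		log.Fatal(err)
	}
	// Resolve returns exactly one Authenticator for this repository;
	// if it picks the wrong secret, the pull fails with 401.
	auth, err := kc.Resolve(ref.Context())
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(auth)
}
```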

> It will be hard, but not impossible, for pkg/v1/remote to handle this fallback scenario, and try creds for /foo, then fall back to creds for / if it gets a 403 response. That should match kubelet's behavior linked from the Knative issue above.

I think even with this implementation, the problem can't be fixed.

cc @imjasonh @dprotaso

@Wouter0100

> It sounds like kubelet will collect all matching auths and try them in a loop, and ggcr (currently) doesn't -- it will choose one at ~random and use it, even if it fails and the other would have worked.

Correct, see here.

@github-actions

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Keep fresh with the 'lifecycle/frozen' label.

@jwcesign

lifecycle/frozen
