[Umbrella] Considerations around a mirror registry #1369

Closed
justaugustus opened this issue Dec 2, 2020 · 28 comments
Labels
  • kind/feature: Categorizes issue or PR as related to a new feature.
  • lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
  • needs-priority
  • sig/release: Categorizes an issue or PR as relevant to SIG Release.

Comments

@justaugustus
Member

justaugustus commented Dec 2, 2020

What would you like to be added:

Mirroring for popular non-Kubernetes community images that are used across the project.

Why is this needed:

Keying off of some recent discussions, specifically around Docker Hub changes, some community members have requested mirrored images, or that we build our own, for critical release and testing components.

Some recent discussions are linked in the refs at the end of this description.

I've created a staging repository for this in kubernetes/k8s.io#1441, but how to approach implementation is undecided at the moment.

For now, I'd like to collect some feedback on what contributor requirements are before we do anything else.

cc: @kubernetes/sig-release @kubernetes/k8s-infra-team @kubernetes/sig-testing

ref:

  • https://kubernetes.slack.com/archives/CCK68P2Q2/p1605906060178100
  • kubernetes/kubernetes#95567
  • kubernetes/test-infra#19477
  • https://www.docker.com/pricing/resource-consumption-updates
  • https://cloud.google.com/container-registry/docs/pulling-cached-images

@justaugustus added the kind/feature and sig/release labels Dec 2, 2020
@BenTheElder
Member

I think @spiffxp filed a similar issue the other day.

Ideally I'd like to move things to primary hosting on a registry we're comfortable with, instead of configuring mirrors (this may be what you meant already, but I want to be clear on that point).

e.g. for e2e.test we want to get all of the images we use into k8s.gcr.io, even if that means just copying them over and updating the references. cc @claudiubelu @wilsonehusin

@justaugustus self-assigned this Dec 2, 2020
@justaugustus
Member Author

> I think @spiffxp filed a similar issue the other day.

Very likely; just wanted to tie a bunch of threads together in a SIG Release context, so RelEng can start chunking the work in the new year.

> Ideally I'd like to move things to primary hosting on a registry we're comfortable with, instead of configuring mirrors (this may be what you meant already, but I want to be clear on that point).
>
> e.g. for e2e.test we want to get all of the images we use into k8s.gcr.io, even if that means just copying them over and updating the references.

Agreed. I think there will likely be a few categories:

  • images that we push, but to Google infra, and need to transfer over to the Community (out of scope here, but in scope for kubernetes/k8s.io#1458, "Registries used in k/k CI should be on Kubernetes Community infra")
  • images that are further "upstream", e.g., on Docker Hub / Quay, for which we need to be resilient to pull failures (candidates for mirroring; see the sketch below)
  • images that we depend on, but could maybe eventually build ourselves, e.g., Golang (mirror first, then create our own)
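
As a concrete illustration of the "candidates for mirroring" bullet above, here is a minimal sketch (not the actual RelEng tooling) of copying an upstream image into a community-owned staging repository with go-containerregistry's crane package; the destination repository name is hypothetical, not the staging repo created in kubernetes/k8s.io#1441:

```go
package main

import (
	"log"

	"github.com/google/go-containerregistry/pkg/crane"
)

func main() {
	// Upstream image the project depends on, and a hypothetical
	// community-owned staging destination for its mirror.
	src := "docker.io/library/golang:1.15"
	dst := "gcr.io/k8s-staging-mirror/golang:1.15" // hypothetical repo name

	// crane.Copy copies the manifest and blobs as-is, so the digest is
	// preserved and consumers can keep (or start) pinning by digest.
	if err := crane.Copy(src, dst); err != nil {
		log.Fatalf("mirroring %s to %s: %v", src, dst, err)
	}
	log.Printf("mirrored %s -> %s", src, dst)
}
```

Because the copy is digest-preserving, switching a consumer over is then just a reference update in whatever manifest or test code pulls the image.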

@justaugustus added this to the v1.21 milestone Dec 2, 2020
@justinsb
Member

justinsb commented Dec 2, 2020

@hakman pointed out to me that containerd doesn't release ARM artifacts (https://github.com/containerd/containerd/releases/tag/v1.3.9) and, as I understand it, points users to Docker for ARM builds. I can see a lot of things being in the 3rd category ("we build it to normalize it").

@justaugustus
Member Author

@justinsb -- absolutely! I think another great example along the arch side is distroless: GoogleContainerTools/distroless#583.

When multi-arch images were introduced, they were only for arm64, which broke us.
@dims and @mattmoor did some great work to get us back into a good state.

Not necessarily saying that distroless is on the list of images needing to be mirrored, but it is an example of an assumption causing a break in our workflow.
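
For the multi-arch failure mode described above, one cheap guard is to check the image index of anything we mirror or depend on for the architectures the project actually builds for. A hedged sketch, assuming go-containerregistry; the reference and architecture list are illustrative:

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"

	"github.com/google/go-containerregistry/pkg/crane"
	v1 "github.com/google/go-containerregistry/pkg/v1"
)

func main() {
	ref := "gcr.io/distroless/static:latest" // illustrative reference

	// Fetch the raw manifest; for a multi-arch image this is an OCI
	// image index / Docker manifest list.
	raw, err := crane.Manifest(ref)
	if err != nil {
		log.Fatalf("fetching manifest for %s: %v", ref, err)
	}
	var index v1.IndexManifest
	if err := json.Unmarshal(raw, &index); err != nil {
		log.Fatalf("parsing manifest for %s as an index: %v", ref, err)
	}

	// Architectures we care about (illustrative list). If the reference is
	// a single-platform manifest, Manifests is empty and every arch is
	// reported missing, which is the conservative outcome.
	want := map[string]bool{"amd64": false, "arm64": false, "ppc64le": false, "s390x": false}
	for _, m := range index.Manifests {
		if m.Platform == nil {
			continue
		}
		if _, ok := want[m.Platform.Architecture]; ok {
			want[m.Platform.Architecture] = true
		}
	}
	for arch, found := range want {
		if !found {
			fmt.Printf("%s is missing architecture %s\n", ref, arch)
		}
	}
}
```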

@justaugustus
Member Author

Sent a note to k-dev to get additional feedback here: https://groups.google.com/g/kubernetes-dev/c/198cwXYDtjc

@sftim
Contributor

sftim commented Dec 2, 2020

An OCI image index seems like a nice idea: “I don't care where you get it from, but the manifest had better have exactly this hard-to-collide-with checksum”.
So, I'd like an approach that works for now and leaves room for more improvements.
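
One possible shape of that idea, sketched here with go-containerregistry and a placeholder digest (a real pin would come from a checked-in image list): resolve the same logical image against whichever registry or mirror is reachable, and require the manifest digest to match the pinned value.

```go
package main

import (
	"fmt"
	"log"

	"github.com/google/go-containerregistry/pkg/crane"
)

func main() {
	// The same logical image, potentially served from several locations;
	// the second reference is a hypothetical alternate source.
	candidates := []string{
		"k8s.gcr.io/pause:3.5",
		"mirror.example.com/pause:3.5",
	}
	// Placeholder value; a real pin list would record the actual digest.
	const pinned = "sha256:0000000000000000000000000000000000000000000000000000000000000000"

	for _, ref := range candidates {
		got, err := crane.Digest(ref)
		if err != nil {
			log.Printf("skipping %s: %v", ref, err)
			continue
		}
		if got == pinned {
			fmt.Printf("%s matches the pinned digest\n", ref)
			return
		}
		log.Printf("%s resolved to %s, which does not match the pin", ref, got)
	}
	log.Fatal("no candidate matched the pinned digest")
}
```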

@smarterclayton
Contributor

Great topic - I'll add kubernetes/kubernetes#93510, which is what OpenShift is now using to support taking all of the images used by a test framework offline for disconnected users.

I am very incentivized to work with folks to make this easier for everyone.

@smarterclayton
Contributor

Another use case that isn't mentioned here exactly - downstream distros sometimes can't trust upstream community images (or need a process to rebuild them years or decades later), so being able to control the source of images as part of a unified process is also desirable, which requires mirroring and an audit list of every image used.

@justaugustus
Member Author

> downstream distros sometimes can't trust upstream community images (or need a process to rebuild them years or decades later), so being able to control the source of images as part of a unified process is also desirable, which requires mirroring and an audit list of every image used.

Love this, @smarterclayton!
Artifact management is high on my priority list for 2021 and I think as we continue to build a cohesive story around it, authorization/attestation definitely comes into play.

@mcluseau

mcluseau commented Dec 2, 2020

Loosely connected to the problem of isolating ourselves from upstream issues: I recently had to solve the problem of reducing the bandwidth used by image pulls across clusters, every time an image moves or a DaemonSet is deployed. It also allows for faster restarts, since the blobs are locally cached. Both of these issues were important for my customers.

Most of that bandwidth is blob traffic, so I wrote a pull-through cache that records those blobs and serves them directly. It supports parallel pulls of the same blob, and peer-checking to avoid hitting the upstream while still allowing high availability.

https://github.com/mcluseau/docker-registries-mirror

Since it is referred to as a "mirror" in containerd terminology, it may or may not be a use case here ^^
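
For readers unfamiliar with the pattern, here is a stripped-down sketch of the blob-caching idea (not the linked project's actual code): an HTTP handler that serves /v2/<name>/blobs/<digest> from a local directory and, on a miss, fetches from an assumed upstream and caches the bytes. Blobs are content-addressed, so caching them by digest is safe; a production cache would also verify the sha256 of fetched data and handle registry auth tokens.

```go
package main

import (
	"io"
	"log"
	"net/http"
	"os"
	"path/filepath"
	"strings"
)

const (
	upstream = "https://registry.example.com" // assumed upstream registry
	cacheDir = "/var/cache/blob-mirror"       // illustrative cache location
)

func blobHandler(w http.ResponseWriter, r *http.Request) {
	// Only handle blob requests, e.g. /v2/library/golang/blobs/sha256:abc...
	parts := strings.Split(r.URL.Path, "/blobs/")
	if len(parts) != 2 || !strings.HasPrefix(parts[1], "sha256:") {
		http.NotFound(w, r)
		return
	}
	cached := filepath.Join(cacheDir, parts[1])

	// Cache hit: serve the blob from disk.
	if f, err := os.Open(cached); err == nil {
		defer f.Close()
		io.Copy(w, f)
		return
	}

	// Cache miss: fetch from the upstream and tee the bytes into the cache.
	resp, err := http.Get(upstream + r.URL.Path)
	if err != nil {
		http.Error(w, "upstream fetch failed", http.StatusBadGateway)
		return
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		http.Error(w, "upstream returned "+resp.Status, http.StatusBadGateway)
		return
	}

	tmp, err := os.CreateTemp(cacheDir, "blob-*")
	if err != nil {
		io.Copy(w, resp.Body) // still serve the client if caching fails
		return
	}
	defer tmp.Close()
	io.Copy(io.MultiWriter(w, tmp), resp.Body)
	os.Rename(tmp.Name(), cached) // publish into the cache
}

func main() {
	if err := os.MkdirAll(cacheDir, 0o755); err != nil {
		log.Fatal(err)
	}
	http.HandleFunc("/v2/", blobHandler)
	log.Fatal(http.ListenAndServe(":5000", nil))
}
```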

@thockin
Member

thockin commented Dec 4, 2020

WRT "be able to rebuild years later" - this roughly means forking every source repo for every image we depend on transitively, ad infinitum, right? Otherwise we simply do not have that ability. GitHub repos DO disappear occasionally. I made the same argument about godeps a long time back, and the discussion swirled around how much energy it takes to maintain such a beast.

I do think that we, as a project used by a lot of people, owe our users our best efforts around sanity here. Having deps on images we didn't build is bad. Having deps on infra we don't control is worse.

That said, our project pays for this stuff and serving billions of image pulls "ain't cheap". So we need to be careful not to position our "mirror" as a free (as in beer) dockerhub. It needs to be scoped to JUST things we need for the project to operate (tests, etc) and for it to be installed - a "standard" k8s install should not need to touch dockerhub, but if a user puts their own images there, that's on them.

I do not think we have funding, staffing, or mandate to operate a free "mirror" of any significant fraction of dockerhub.

@Goclipse27

@justaugustus - Is this feature specifically addressing e2e/test and other identifiable pull-and-run cases, or anywhere possible, while still avoiding a significant fraction of Docker Hub, as @thockin mentioned?

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Mar 5, 2021
@justaugustus removed the lifecycle/stale label Mar 7, 2021
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Jun 5, 2021
@eddiezane
Member

/remove-lifecycle stale
/cc @hh

@k8s-ci-robot removed the lifecycle/stale label Jun 5, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Sep 3, 2021
@justaugustus removed the lifecycle/stale label Sep 17, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Dec 17, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Jan 16, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Artifact Management (SIG Release) automation moved this from To Do to Done (1.21) Feb 15, 2022
@riaankleinhans

/remove-lifecycle rotten

@k8s-ci-robot removed the lifecycle/rotten label Feb 15, 2022
@riaankleinhans

/reopen

@k8s-ci-robot
Contributor

@Riaankl: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot reopened this Feb 15, 2022
Artifact Management (SIG Release) automation moved this from Done (1.21) to In Progress Feb 15, 2022
@BenTheElder
Member

I believe this is about ready, or already ready, to be closed. We've largely already mirrored the images we use into k8s.gcr.io, such that we're not directly dependent on any other registry for release builds etc.

This is not to be confused with related discussions about what backs our registry and potentially using mirrors for that; there are other tracking issues for that topic.

AFAICT SIG Testing and SIG Release already made the motion to move build and e2e to consume images from our own registry only.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label May 16, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Jun 15, 2022
@riaankleinhans

> I believe this is about ready, or already ready, to be closed. We've largely already mirrored the images we use into k8s.gcr.io, such that we're not directly dependent on any other registry for release builds etc.
>
> This is not to be confused with related discussions about what backs our registry and potentially using mirrors for that; there are other tracking issues for that topic.
>
> AFAICT SIG Testing and SIG Release already made the motion to move build and e2e to consume images from our own registry only.

/close

@k8s-ci-robot
Contributor

@Riaankl: Closing this issue.

In response to this:

I believe this is about ready, or already ready, to be closed. We've largely already mirrored the images we use into k8s.gcr.io, such that we're not directly dependent on any other registry for release builds etc.

This is not to be confused with related discussions about what backs our registry and potentially using mirrors for that; there are other tracking issues for that topic.

AFAICT SIG Testing and SIG Release already made the motion to move build and e2e to consume images from our own registry only.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Artifact Management (SIG Release) automation moved this from In Progress to Done (1.21) Jun 15, 2022