Support proxy registries for each package type #21223

Open
OverkillGuy opened this issue Sep 20, 2022 · 13 comments
Labels
  • topic/packages
  • type/feature: Completely new functionality. Can only be merged if feature freeze is not active.
  • type/proposal: The new feature has not been accepted yet but needs to be discussed first.

Comments

@OverkillGuy

Feature Description

Spinning off #19270 (comment) into its own ticket as recommended

I wish Gitea supported "remote", or "proxy" repositories.

These are package repositories that proxy an external source of packages, hence configured with a proxy URL, but are otherwise the same as local package repositories, in that they can be pulled from as usual.

Example: a local PyPI.org proxy. A local build system would be configured to use the private package registry for "internal" (private) packages, and would now also fetch dependencies from PyPI.org through the local Gitea.

Advantages:

  • Shorter round-trip to fetch packages = faster build times
  • Improved auditability of dependencies (one place for all $internet_stuff)
  • Offline-able build systems help with disaster recovery, privacy...
  • Mitigate bad/rogue updates by having solid caching

This feature in Docker repositories would remove any need for a Dockerhub ECR mirror, which many have to set up to avoid Dockerhub's recent rate-limiting.

The canonical example of the feature is in JFrog's Artifactory.

Effectively, Gitea would, for these proxy repositories, become a local package cache. The biggest technical decision is about when to invalidate cache (docker image's "latest" tag moves pretty quickly, but if you already have a local copy, do you serve it as-is? even if you got it 2 years ago?)
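
To make the invalidation question concrete, here is a minimal sketch in Python of one possible policy: references assumed to be immutable (digests, pinned versions) are served from cache indefinitely, while floating tags such as "latest" are revalidated once a time-to-live has passed. The names `CachedEntry`, `is_mutable`, `needs_revalidation`, the set of floating tags, and the one-hour default TTL are all assumptions made for illustration; none of this is existing Gitea behaviour.

```python
from dataclasses import dataclass

# Hypothetical: tags treated as "moving" references that can change upstream.
MUTABLE_TAGS = {"latest", "stable", "edge"}


@dataclass
class CachedEntry:
    name: str          # e.g. "library/nginx"
    reference: str     # tag, version, or digest such as "sha256:..."
    fetched_at: float  # Unix timestamp of the last successful upstream fetch


def is_mutable(reference: str) -> bool:
    """Digests never change; well-known floating tags can.

    Simplification (assumption): anything not listed in MUTABLE_TAGS is
    treated as immutable, although tags like "1.25" can also move upstream.
    """
    if reference.startswith("sha256:"):
        return False
    return reference in MUTABLE_TAGS


def needs_revalidation(entry: CachedEntry, now: float, ttl_seconds: float = 3600.0) -> bool:
    """Return True if the proxy should ask upstream before serving the cached copy."""
    if not is_mutable(entry.reference):
        return False  # assumed immutable: serve the cached copy as-is
    return (now - entry.fetched_at) > ttl_seconds
```

With a policy like this, a two-year-old copy of a pinned version would still be served as-is, while a two-year-old "latest" would trigger an upstream check first. If upstream is unreachable at that point, falling back to the stale copy is one way to keep builds offline-able.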

Pushing this feature to its extreme, Artifactory provides Virtual Repositories that aggregate both remote (public proxies) and local (private to org) repositories into one place.

I understand this feature can be a big investment, and acknowledge that there may be no particular need for it. I mostly envy the feature, and wish for Gitea to succeed by out-executing Artifactory, given the new Package Registry is already encroaching on that a bit.

Screenshots

[Screenshot: Artifactory remote repository]
[Screenshot: Artifactory cache advanced settings]

@OverkillGuy OverkillGuy added type/feature Completely new functionality. Can only be merged if feature freeze is not active. type/proposal The new feature has not been accepted yet but needs to be discussed first. labels Sep 20, 2022
@OverkillGuy
Author

Suggesting applying the label theme/package-registry, but I can't apply that on my own.

@lunny
Member

lunny commented Mar 22, 2023

About what the proxy should look like: a proxy package should have the same URL and database structures as an original one, but with a mirror column, just like repositories and mirror repositories. So this package is read-only for users, and there is an internal timer to fetch from the remote?

@kvaster
Contributor

kvaster commented May 28, 2023

Also, the proxy should cache data from the remote. That way you can be sure you'll be able to build your project even if the data is removed from the remote.

@yekanchi

Is this going to be something like Sonatype Nexus or JFrog Artifactory?

@TimberBro
Contributor

> A proxy package should have the same url and database structures as an original one but with a mirror column just like repositories and mirror repositories.

Wouldn't it be superfluous to keep a link to a remote repository for each package?

@lunny How do you feel about the idea of having mirror settings at the organization level?
As an example, for any type of registry, the owner can check whether the registry is a mirror or not, and if it is, the owner can set the remote URL.
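
A minimal sketch, in Python, of what such owner-level settings could look like as a data structure: one optional mirror entry per package type rather than one per package. The class names `RegistryMirrorSetting` and `OwnerPackageSettings`, their fields, and the example URL are hypothetical and are not part of Gitea's schema.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class RegistryMirrorSetting:
    is_mirror: bool = False
    remote_url: Optional[str] = None  # upstream registry URL when is_mirror is True


@dataclass
class OwnerPackageSettings:
    owner: str  # organization or user name
    # Hypothetical layout: package type ("pypi", "npm", ...) -> mirror setting
    by_type: Dict[str, RegistryMirrorSetting] = field(default_factory=dict)

    def set_mirror(self, package_type: str, remote_url: str) -> None:
        """Mark the registry for one package type as a mirror of remote_url."""
        if not remote_url.startswith(("http://", "https://")):
            raise ValueError("remote_url must be an http(s) URL")
        self.by_type[package_type] = RegistryMirrorSetting(True, remote_url)

    def upstream_for(self, package_type: str) -> Optional[str]:
        """Return the upstream URL, or None if this registry is not a mirror."""
        setting = self.by_type.get(package_type)
        if setting is not None and setting.is_mirror:
            return setting.remote_url
        return None
```

For example, `OwnerPackageSettings("acme").set_mirror("pypi", "https://pypi.org/simple")` would configure a single upstream for all of that owner's PyPI packages, with every other package type left as a plain local registry.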

@yekanchi

> > A proxy package should have the same url and database structures as an original one but with a mirror column just like repositories and mirror repositories.
>
> Wouldn't it be superfluous to keep a link to a remote repository for each package?
>
> @lunny How do you feel about the idea of having mirror settings at the organization level? As example, for any type of registry, the owner can check whether the registry is a mirror or not and if it is, the owner can set the remote-URL.

I think we can merge both.

  • packages could be uploaded locally without upstream source (as is now)
  • if any package or package version is requested and there is no file for the specified version the upstream sources will be checked.

So there is no need to specify an upstream source for every package.
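
The two points above can be sketched in Python as a local-first lookup with an upstream fallback. This is an illustration under assumptions, not a proposed Gitea implementation: `FallbackRegistry` is a hypothetical name, storage is an in-memory dict, and `fetch_upstream` stands in for whatever client would talk to the configured upstream.

```python
from typing import Callable, Dict, Optional, Tuple

Key = Tuple[str, str]  # (package name, version)


class FallbackRegistry:
    def __init__(self, fetch_upstream: Callable[[str, str], Optional[bytes]]):
        # Holds both locally uploaded files and files cached from upstream.
        self._local: Dict[Key, bytes] = {}
        self._fetch_upstream = fetch_upstream

    def upload(self, name: str, version: str, data: bytes) -> None:
        """Local upload without any upstream source (as is possible now)."""
        self._local[(name, version)] = data

    def get(self, name: str, version: str) -> Optional[bytes]:
        """Serve the local file if present, otherwise check upstream."""
        key = (name, version)
        if key in self._local:
            return self._local[key]
        data = self._fetch_upstream(name, version)
        if data is not None:
            self._local[key] = data  # cache so later requests skip upstream
        return data
```

Because the fallback is triggered per request, the upstream only has to be configured once for the registry and never per package.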

@PatrickHuetter

This feature would be awesome. We have been running a Nexus repository server for a few years and migrated from GitLab to Gitea. With this feature we could also get rid of Nexus and have a more all-in-one experience in our development tasks.

@KarenArzumanyan

This is a highly requested feature.
We also use Nexus now, which is very slow and has limitations in the OSS edition.

Characteristics we would expect from a registry operating in proxy mode:

  1. Caching received packages
  2. If the package is not found in the cache, then request from the external registry
  3. Periodically clean up old packages according to conditions, for example if they have not been requested for a set number of days.
  4. Cache size limit (when reached, the oldest packages are deleted)
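
Points 3 and 4 above can be sketched as a single cleanup pass in Python. The sketch assumes that "oldest" means least recently requested; the names `CacheItem` and `select_for_eviction` and both thresholds are hypothetical.

```python
from dataclasses import dataclass
from typing import List

SECONDS_PER_DAY = 86400


@dataclass
class CacheItem:
    key: str                  # e.g. "pypi/requests/2.31.0"
    size_bytes: int
    last_requested_at: float  # Unix timestamp of the most recent request


def select_for_eviction(
    items: List[CacheItem],
    now: float,
    max_idle_days: int,
    max_total_bytes: int,
) -> List[str]:
    """Return the keys to delete: idle items first, then oldest until under the cap."""
    evict: List[str] = []
    kept: List[CacheItem] = []

    # Rule 3: drop anything that has not been requested for max_idle_days.
    for item in items:
        if now - item.last_requested_at > max_idle_days * SECONDS_PER_DAY:
            evict.append(item.key)
        else:
            kept.append(item)

    # Rule 4: if still over the size limit, drop least recently requested first.
    kept.sort(key=lambda i: i.last_requested_at)
    total = sum(i.size_bytes for i in kept)
    for item in kept:
        if total <= max_total_bytes:
            break
        evict.append(item.key)
        total -= item.size_bytes

    return evict
```

A periodic job could call this with the current cache index and delete the returned keys; points 1 and 2 (cache, then fall back to the external registry) would sit in the request path instead.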

We really hope for this feature.
Thanks.

@lunny
Member

lunny commented Mar 12, 2024

I think we can have two types of proxies. One is a feature of Gitea itself, which can connect to the source packages directly and pull them. The other is an external proxy, which could be deployed in a DMZ, pull packages from outside the network, and then push them to Gitea.

@KarenArzumanyan

Yes, a good option. It is important that the registry proxy has a cache to speed up the retrieval of packages, without having to request them from the outside each time.

@josh-hemphill

I've been tracking this same thing in GitLab, and just found this here. Didn't see it mentioned, so I thought I'd link to their current implementation: https://docs.gitlab.com/ee/user/packages/package_registry/dependency_proxy/
They've only released it for Maven packages in a beta; in the issue threads, they've been running into lots of problems pulling it off and it has been pushed back several times, so if it gets added in Gitea, hopefully the issues GitLab has run into can be avoided here.

@uvulpos

uvulpos commented Jul 2, 2024

I would not enable this feature by default, so that the original URL is not inside the Gitea pull URL. Rather, as an administrator you could configure organisations like dockerhub or pip, so the URL would be something like gitea.yourcompany.com/packages/dockerhub/docker/nginx/1.0.0.

Also, for security reasons, you could define which images are approved to pull and which are not (maybe also via wildcards or something?). That would improve security and compliance.
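
A minimal sketch of such an approval list with wildcards, using shell-style patterns from Python's standard `fnmatch` module. The function name `is_pull_allowed`, the "registry/namespace/name" key format, and the deny-before-allow ordering are assumptions chosen for illustration.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def is_pull_allowed(package: str, allow: Iterable[str], deny: Iterable[str] = ()) -> bool:
    """Return True if the package may be pulled through the proxy.

    Deny patterns win over allow patterns; anything matching neither is
    rejected. Note that "*" in fnmatch patterns also matches "/" characters.
    """
    if any(fnmatchcase(package, pattern) for pattern in deny):
        return False
    return any(fnmatchcase(package, pattern) for pattern in allow)
```

For example, with `allow = ["dockerhub/library/*"]` and `deny = ["dockerhub/library/*-nightly"]`, a pull of `dockerhub/library/nginx` would be approved while `dockerhub/library/nginx-nightly` and anything outside `dockerhub/library/` would be refused.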

One thing I want to point out: in the past there were projects that just disappeared overnight, and our software relied on them, so it wasn't buildable anymore (or our infrastructure wasn't even deployable anymore; we had weird sys requirement admins ruling harbor). Harbor also has a caching mechanism, but according to their documentation they delete the cached versions as well if the main resource is no longer available.

I would disagree with that approach. In rare cases you still want to use that software. I would like the option to set custom invalidation durations (e.g. not pulled for 6 months) and flags for specific packages or package versions that I have to delete manually.

You just have to google for incidents. You'll find enough of them 🙁
https://www.darkreading.com/application-security/recent-code-sabotage-incident-latest-to-highlight-code-dependency-risks
