
idea: implement an IPFS driver #1983

Open
goern opened this issue Jun 21, 2024 · 21 comments

Comments

@goern

goern commented Jun 21, 2024

As a user of the container storage library, I want to store layers as objects on IPFS, so that I can benefit from the distributed object storage of IPFS.

Rationale for Implementing IPFS in Container Image Storage Driver

Implementing the InterPlanetary File System (IPFS) for storing container image layers offers significant advantages in a distributed computing environment. By leveraging IPFS, each container image layer can be stored as an independent object across a decentralized network. This approach allows for the assembly of container images from multiple IPFS servers using layer references, enhancing flexibility and scalability.

One of IPFS's key benefits is its pinning feature, which ensures the efficient maintenance of the location and replication of container image layers. This enhances data availability and reliability and contributes to optimized storage management. Furthermore, utilizing IPFS can lead to a more equitable distribution of storage costs, as it enables a more accurate accounting of resources used for container image storage. This feature is particularly advantageous for organizations implementing cost-effective and transparent storage solutions.

References

https://docs.ipfs.tech/concepts/

@goern
Author

goern commented Jun 21, 2024

Cc: @sallyom @rhatdan @vpavlin

@vpavlin

vpavlin commented Jun 21, 2024

Hmm, I am not sure I understand this: the container/storage repo is for storing images locally, or not? Using IPFS for a container registry makes sense to me, and there is already an implementation of that: https://github.com/ipdr/ipdr

Or is the use case you have in mind more towards a Kubernetes cluster where each k8s node also runs an IPFS node, and hence images (layers) can be distributed among them without each of them pulling from an external registry (hence potentially incurring unnecessary cost)?

@goern
Author

goern commented Jun 21, 2024

  1. I was under the impression that storage drivers also handle pulling blobs from remote locations?!
  2. ipdr is nice, but I think we could completely eliminate the requirement for having a registry, as we could have the manifest itself on ipfs
  3. ja, your last paragraph summarized one of the use cases.

@vpavlin

vpavlin commented Jun 21, 2024

  1. I was under the impression that storage drivers also handle pulling blobs from remote locations?!

Ah, maybe, I actually have no clue :D

2. ipdr is nice, but I think we could completely eliminate the requirement for having a registry, as we could have the manifest itself on ipfs

Pardon my ignorance and that I did not see this immediately - this is true! Really cool idea!

@rhatdan
Member

rhatdan commented Jun 24, 2024

Containers/storage supports additional/stores and additional Layers, which can be stored on networked base storage.

When it comes to pulling images, we use containers/image.
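For reference, additional image stores are configured in storage.conf; a sketch with illustrative paths (the additionalimagestores option is real, the paths are placeholders):

```toml
# /etc/containers/storage.conf (paths are illustrative)
[storage]
driver = "overlay"

[storage.options]
# Read-only stores consulted before pulling; these could live
# on networked base storage, as rhatdan describes.
additionalimagestores = [
  "/var/lib/shared-images",
]
```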

@rhatdan
Member

rhatdan commented Jun 24, 2024

@mtrmac @nalind @giuseppe @saschagrunert Thoughts?

@mtrmac
Collaborator

mtrmac commented Jun 24, 2024

I didn’t try to build this and I have no numbers, but I just can’t see any end-user benefit.


Users who don’t want to place permanent files on nodes, and want to somehow deal with IPFS-located data, can already mount an IPFS filesystem, or, I don’t know, have an in-memory-only IPFS client. It’s not necessary to change anything about the infrastructure for that.

So we are talking about container-image content.


Then https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing items 1-3: We would never want to entirely replace the local extracted-layer-filesystem storage by a distributed system, in general; the distributed systems (incl. operators that drive operation of the network components interconnecting the nodes) are built on top of container images running on nodes!


Replacing registries (which don’t store individual layer files separately, but as compressed tarballs) by individual compute nodes acting as distributed layer stores is not obviously beneficial either. (Compare also https://github.com/spegel-org/spegel , IPFS is not inherently the only way to do that.)

First, there’s no direct benefit to co-locating the compressed layer versions and extracted layer filesystems; the two are not direct substitutes. (containerd does co-locate them, making Spegel possible at low additional cost, but c/storage does not.)

Second, if a massively-parallel application were deployed across 100 nodes, would that mean we have 100 computers storing and serving the compressed layer over this filesystem? The 100 copies would be completely unnecessary waste of storage.

So, for distributing layers, it seems to me much better to have an “ordinary” clustered registry deployment (with admin-controllable number of replicas), on top of … well, any clustered filesystem, really. That could be IPFS, or it might be not, but either way there’s no need to involve c/storage at all.


Fine; if we forget about distributing images over nodes, and just talk about not using registries at all, and having “pull” operations directly interact with some object store (IPFS or not). Sure, that is plausible — but also an ecosystem-wide feature addition:

  • registry credentials (“pull secrets”) would now need to somehow allow carrying the distributed-object-store credentials, in parallel
  • there would need to be a mapping mechanism from an on-registry image name to an allowed/desired distributed object store, so that private clusters don’t broadcast the identity of their workload all over the internet if that’s not wanted.

And in the end, I don’t see that this is really any better than having a registry which, when asked for a blob, issues a HTTP redirect to a CDN (where the CDN can be backed by whatever filesystem you choose). That works today, and is widely deployed in practice.


What am I missing?

@sallyom
Contributor

sallyom commented Jun 24, 2024

  • With decentralized storage, where data is stored becomes irrelevant, all that matters is who can access it. Access is managed cryptographically rather than by server-side access controls. It is a fundamental shift, but decoupling location from access control has benefits - flexibility, resiliency, consistency.
  • With centralized servers, security and access are dependent on the infrastructure. If the registry server goes down, you lose access to your images. P2P networks eliminate the single point of failure.
  • If podman could access images directly from IPFS as if they were stored locally, this would eliminate the need for traditional image pulls. This is particularly interesting to me.

@mtrmac
Collaborator

mtrmac commented Jun 24, 2024

  • With decentralized storage, where data is stored becomes irrelevant

It’s not really irrelevant; failure domains are something that needs to be accounted for and designed. Unless the data were that excessively ubiquitous (similarly to how we think of internet packet loss as irrelevant because we have “good enough” hardware and ample extra bandwidth to retransmit). That excessive ubiquity might be true in the future but, almost certainly, isn’t true for container images at the moment.

all that matters is who can access it. Access is managed cryptographically rather than by server-side access controls.

I’m IPFS-ignorant and I don’t know what that means. Right now users have registry credentials. How does one go from registry credentials to “managed cryptographically” access?

Are you saying that anyone who knows the digest of a layer is assumed to be able to access a copy? That’s… not been an assumption in the current systems. It might be fine but it’s not obvious. E.g. Red Hat, on https://catalog.redhat.com , is publishing image manifest digests, AFAICS without requiring any login. Does that mean that if the images were stored on IPFS, that would make all of the image content public? (And if it does, “whose fault is that”?)

It is a fundamental shift, but decoupling location from access control has benefits - flexibility, resiliency, consistency.

I don’t know what that “flexibility, resiliency, consistency” means at a code level.

I am imagining a server with one or two outbound internet links, comparatively low-bandwidth and possibly congested, and a local network that is fully managed and much higher-bandwidth. Where does IPFS enter the picture? Over the congested link, or locally? If the former, that’s not sufficiently resilient. If the latter, that needs to be explicitly managed to contain everything necessary, and it’s basically “just a mirror“.

  • With centralized servers, security and access are dependent on the infrastructure. If the registry server goes down, you lose access to your images. P2P networks eliminate the single point of failure.

What is the economic model? With BitTorrent and illegally pirated movies, everyone involved has some interest in keeping the movies around, and there is maybe a bit of a reputation economy and a moral imperative to share alike between participants.

With container images … why would company A want to host a mirror of company B’s images? (Yes, I realize that model works for freely-distributable Linux distributions.)

  • If podman could access images directly from IPFS as if they were stored locally, this would eliminate the need for traditional image pulls.

Are you talking about replacing the compressed layer representation or the extracted-filesystem representation? This seems to be the latter; and as detailed above, that seems unnecessary (for application-managed data) or outright undesirable (for application files of infrastructure containers) to me.

@mtrmac
Collaborator

mtrmac commented Jun 24, 2024

(One possible response for the desire to avoid pulling entirely, and to access individual files from some external source, is that the Additional Layer Store is an out-of-process FUSE interface, and nothing prevents anyone from writing a new backend. Uh … except that the ABI of that interface has effectively changed in the last 2 months.)

@mtrmac
Collaborator

mtrmac commented Jun 24, 2024

Second, if a massively-parallel application were deployed across 100 nodes, would that mean we have 100 computers storing and serving the compressed layer over this filesystem? The 100 copies would be completely unnecessary waste of storage.

And, au contraire, if a cluster-critical operator is not performance-critical and runs in a single deployment, would that mean that we only have a single copy? Then the admins would still need to manage an explicit mirroring operation that ensures that every image has a sufficient number of replicas to achieve the desired HA properties.

I can see how “Nodes should just pull from each other without any need to manage mirrors and replicas” sounds attractive, but AFAICS the need to manage just isn’t avoidable.

@giuseppe
Member

Last time I looked into IPFS, I did not find a way to use files from other sources, in our case from the containers storage.

That is a big disadvantage because we'd need to keep the images around twice: one copy in the containers storage and one extra copy in the IPFS cache, so effectively it doubles the amount of storage required to store an image.

@vpavlin

vpavlin commented Jun 27, 2024

IPFS is problematic from an economic perspective, as there is no explicit incentive for anyone to hold data other than their own; that is why we get centralized pinning gateways where you pay for a node on which your CID is pinned (i.e. the content is stored). I agree that the altruistic approach does not work very well here (unlike with BitTorrent). Filecoin, and in the future Codex, solve these issues by adding a monetary incentive for node operators to host other people's data.

I'd say controlling access to data is a more complex topic in peer-to-peer networks, as access cannot simply be "gated" by an AuthN/AuthZ proxy; it must be based on encryption, i.e. you either have or do not have the right key to decrypt the data blob. There are existing solutions like various MPC (multi-party computation) networks that allow you to prove you are eligible to decrypt (based on access rules bound to some cryptographic material, e.g. proving ownership of a particular private key) and then generate decryption keys for you. But yes, it complicates the system significantly :)

I feel like this issue quickly turned away from an "open source and free software idea" into an "enterprise solution exercise" :) And I am not 100% sure whether a completely centralized enterprise entity could benefit from integrating IPFS: given the complexities of access control, the extra bandwidth consumption, and generally just the mindset and inherent centralization of everything, the extra potential resilience might not be a big enough incentive to deal with this.

On the other hand, in the open source and free software world this exploration still has significant merit: being able to access data privately, securely, and without explicitly relying on a third-party gateway is something that is increasingly gaining interest (at least in my social bubble :) ). Even curl added support for IPFS :)

It would be an interesting opt-in feature where I don't have to rely on a centralized registry, but instead communicate directly with other nodes running podman and serve the image layers I personally use as well. Maybe there are layers marked as private which are not broadcast on the p2p network because they were pulled from a private registry? Maybe if some layer is not found on the p2p network, it can be pulled from a centralized registry and then broadcast (if not private)? Maybe IPFS is not the right solution: it is the most commonly used, but with the lack of any assurance of persistence of the data in the network, it might not be feasible for any real-world deployment?

@goern
Author

goern commented Jun 27, 2024

I think Vasek is heading in the right direction; this request is not only about a new storage driver/technology and the technical use cases we should solve, it's about an update to the 'container image storage ecosystem'.

@mtrmac
Collaborator

mtrmac commented Jun 27, 2024

Ah, the C word. Have fun, but keep me out, and don’t @ me.


Container image encryption exists, but it has significant limitations right now (it doesn’t encrypt the image config, and it doesn’t really work for digest references); also deploying it is harder because the ecosystem (e.g. K8s Pod objects) is set up to distribute credentials, not keys.


I think “being able to access data privately … without explicitly relying” is completely disconnected from the nature of container images, which are opaque blobs on the order of hundreds of megabytes, practically impossible to black-box inspect WRT what the binary does, and which inherently imply a trust relationship to the producer of the images, WRT the existence, identity, and trustworthiness of that producer. (And/or reproducible builds, I guess. Still a trust relationship to whoever makes the reproducibility claim, because image users are not going to be re-building the image on every use.)

@cgwalters
Contributor

(One possible response for the desire to avoid pulling entirely, and to access individual files from some external source, is that the Additional Layer Store is an out-of-process FUSE interface, and nothing prevents anyone from writing a new backend. Uh … except that the ABI of that interface has effectively changed in the last 2 months.)

Doesn't have to be FUSE, just any mountable network filesystem, right? At least one of my employer's important customers is doing this with an AFS mounted additional image store, for reference. One notable thing here though is that what we really want with something like this is a mechanism to tell the underlying filesystem that these aren't files, they're objects and hence can be cached for any arbitrary lifetime without worrying about cache invalidation. But I think today people doing this are OK with fscache.

@vpavlin

vpavlin commented Jun 28, 2024

Ah, the C word. Have fun, but keep me out, and don’t @ me.

the C word? Cryptography? Cypherpunk? Centralised? Control? Sorry, I am lost here, can you please help me understand?

and inherently imply a trust relationship to the producer of the images

Correct, we are in agreement here - I never said you are not relying on the producer or that anything decentralized is trustless - trust is always involved. The goal though is to move the trust away from intermediaries. I said without explicitly relying on a third party gateway - i.e. I produce something, push it to a registry and then pull it elsewhere.

The thing we are talking about is to avoid pushing to a registry. And again, happy to reiterate - this is not for everyone, I did not even come up with this idea originally, but AFAU @goern's goal was to investigate if such a system would be possible, feasible and useful to some.

@mtrmac
Collaborator

mtrmac commented Jun 28, 2024

(One possible response for the desire to avoid pulling entirely, and to access individual files from some external source, is that the Additional Layer Store is an out-of-process FUSE interface

Doesn't have to be FUSE, just any mountable network filesystem, right? At least one of my employer's important customers is doing this with an AFS mounted additional image store

It’s confusing, but Additional Layer Store and Additional Image Store are two quite different c/storage features. Given e.g. func notifyReleaseAdditionalLayer(al string), it seems to me that FUSE, or some other custom not-just-content filesystem, is the only way to operate an ALS, at least without triggering warnings during normal operation. But I didn’t look too deeply.

@mtrmac
Collaborator

mtrmac commented Jun 28, 2024

Ah, the C word. Have fun, but keep me out, and don’t @ me.

the C word?

Cryptocurrency.

The goal though is to move the trust away from intermediaries.

That’s what end-to-end signatures provide. They exist today.

@phillebaba

I just want to add my two cents to the mix.

I have spent some time exploring different types of decentralized alternatives to registries, mostly because I have lived through enough problems with rate limiting and outages of critical registries while at the same time being too lazy to run my own mirrors. Some of my laziness comes from the distribution spec not setting a standard for how mirroring should work, resulting in multiple solutions from different vendors, solved either by mirror configurations or by mutating image references, none of which are compatible in a broader spectrum.

I was very hopeful when I first found the different IPFS solutions for distributing OCI artifacts. They seemed to partially solve the problems. However, they come with their own set of challenges, some of which have been raised by @mtrmac. I do want to be proven wrong, but I have not yet seen anyone make use of IPFS-based image pulling at a large scale, and I mean this both from an "enterprise" performance aspect and from a usability perspective. If someone has a public example of this, please do share.

I created Spegel as a compromise between distributing images through IPFS and using a centralized registry. In my head it is the best of both worlds. While I have tried supporting CRI-O and related projects, it is not currently possible, as has been pointed out; that is a discussion for a separate issue.

To end things, I would suggest initially going the route I did and building a gateway registry that can pull from non-traditional sources, if nothing else, to prove that it actually works. Adding alternative image-pulling logic would, as far as I understand the code, require significant work, which is why I see a gateway as the better alternative if you are after building a POC.

@goern
Author

goern commented Jul 4, 2024

I get the point about a gateway registry, especially for gating into the traditional world of OCI registries.

Nevertheless, I think that the topics of auth and cost need some research. Therefore, implementing an alternative image-pulling logic is one of two key aspects of this idea-issue; the second is finding a new economic basis for sharing containers.

From the list of commenters on this issue, I reckon that we all have a very good understanding of the infrastructure (and associated costs) required to operate registries. And from an open-source and sovereignty point of view, we need to work on a decentralized and uncensorable infrastructure to store our software.

... ja I know, I go a little out of the bounds of the tech domain with this. and I am not speaking for my employer. ;)
