[Feature Suggestion] - CID deny / allow API #7871
Comments
In the cases where a node joins a bitswap session with one peer and in the process learns about additional connections to make, its connection to the node asking for bitswap content may not directly have HTTP headers. What headers (and where have they originally come from?) are you hoping to use for making this decision? |
For context, our backend currently uses the js-ipfs-http-client for our requests to go-ipfs. With this library you can pass in custom headers: https://www.npmjs.com/package/ipfs-http-client#custom-headers The concept is that these headers could take the form of any string.
I suppose there could be a situation where two nodes have the same content and have separate policies for what's required to serve that content. In those scenarios, I wouldn't expect the host node that cannot serve the content to either:
How this would work within the context of bitswap is a little outside my scope of expertise, but those are the general flows I had imagined. |
This proposal is interesting from the perspective of IPFS hosts and other provider services on top of IPFS. I think the main benefit here is that it provides a means for an IPFS host provider to "plug in" their own Bitswap "rules" that can protect them, while also being useful for implementing private networks and other security-conscious scenarios. While I'm not yet married to any particular implementation or API spec, something that allows one to run a node with a custom Bitswap "interceptor" would be ideal. The interceptor function would then receive some context, be it the Bitswap request, some headers, whatever. Headers might not be the right wording here. In fact, it might require updates to the Bitswap protocol to accept an additional "extensions" map of some kind? |
Notes from 2021-03-22 discussion: This should maybe be two issues?
This will likely get closed and split. |
My use case is "I can't allow connections from Syria or Iran due to export control restrictions". At the moment I'm using GeoIP blocking of the entire machine, but this really should be configured per-object, as not all of them fall under export control restrictions. |
This is also required to share data between nodes of friends and own nodes, as described in ipfs/roadmap#78 |
The question comes to mind: how should a node publish this information to the DHT? If there are any restrictions, say to share a file only with one other node, but the file is very popular, the node might get a lot of requests which will be denied. Maybe add a flag to the DHT entries if a CID has any access restrictions, so that other nodes can first choose the providers which might not restrict access. |
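To make the suggestion above concrete, here is a minimal sketch of how a requesting node might order providers if provider records carried such a flag. The `ProviderRecord` type and its `restricted` field are hypothetical; nothing like this exists in current provider records.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ProviderRecord:
    peer_id: str
    # Hypothetical flag: True if the provider applies access restrictions to this CID.
    restricted: bool = False


def order_providers(records: List[ProviderRecord]) -> List[ProviderRecord]:
    """Put unrestricted providers first.

    sorted() is stable, so providers keep their original relative order
    within the unrestricted and restricted groups.
    """
    return sorted(records, key=lambda record: record.restricted)
```

A requester would then try peers in the returned order, so restricted providers are only contacted once the unrestricted ones have been tried.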
Yep, @BigLep I'm pretty sure this needs to be broken out into separate issues. For example,
Sure, adding more features to provider records makes sense. Although it's mostly independent from this issue since we already need more information. For example, what if someone only has part of a large DAG instead of the whole thing? We might want to mark that in the provider record entry. IMO it's related to, and would likely occur at the same time as, libp2p/go-libp2p-kad-dht#584. Note: I've also seen requests (although I'm having trouble locating the issue ATM) for client-side deny lists. For example, users or public gateways that just want to avoid downloading certain content even as a transitive dependency of another graph. That would be a separate feature request, this one seems to be focused on server-side filtering. |
The ability to have a hook between request of a CID and serving the data out for that CID is something that would help us a lot at Microsoft, as well as others in the Decentralized Identity Foundation. Even something as simple as an inbound async hook where you can get the CID being requested, do some eval in a function, and return true/false as to whether it should be released would be a big help. |
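A rough sketch of the kind of inbound async hook described above, written in Python purely for illustration. The names `serve_block`, `AllowHook`, and `example_hook`, and the dict standing in for a blockstore, are all assumptions and not part of any IPFS implementation.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

# Hypothetical hook type: receives the requested CID and resolves to
# True (release the block) or False (withhold it).
AllowHook = Callable[[str], Awaitable[bool]]


async def serve_block(
    cid: str, blockstore: Dict[str, bytes], allow: AllowHook
) -> Optional[bytes]:
    """Return the block for `cid` only if it is stored locally and the hook approves."""
    block = blockstore.get(cid)
    if block is None:
        return None
    if not await allow(cid):
        # Withheld: the caller gets the same result as for a missing block.
        return None
    return block


async def example_hook(cid: str) -> bool:
    # Placeholder policy; a real hook might await a database or remote service here.
    await asyncio.sleep(0)
    return cid in {"cid-public"}
```

For example, `asyncio.run(serve_block("cid-public", store, example_hook))` returns the stored bytes, while any CID the hook rejects yields `None`.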
We've thought through how we would use this in Peergos more now, and we're pretty sure we can get full post-quantum fine-grained capability-based access control to ciphertext with a single (cid specific) auth string that bitswap sends with each request. The receiver then passes (cid, requestor nodeId, auth string) to the API as above. This should also be pretty general as long as the length limit on the auth string isn't too low (you could always encode multiple headers into it if you wanted). |
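As an illustration of the last point, several headers could be folded into one auth string with ordinary URL encoding. This is only a sketch: the `MAX_AUTH_LEN` value is an assumed limit (the thread does not specify one), and `pack_auth` / `unpack_auth` are hypothetical helper names.

```python
from typing import Dict
from urllib.parse import parse_qsl, urlencode

MAX_AUTH_LEN = 256  # assumed limit, for illustration only


def pack_auth(headers: Dict[str, str]) -> str:
    """Encode several header-like key/value pairs into a single auth string."""
    auth = urlencode(sorted(headers.items()))
    if len(auth) > MAX_AUTH_LEN:
        raise ValueError("auth string exceeds the assumed length limit")
    return auth


def unpack_auth(auth: str) -> Dict[str, str]:
    """Recover the key/value pairs on the receiving side."""
    return dict(parse_qsl(auth, keep_blank_values=True))
```

The receiver would call `unpack_auth` on the string it gets alongside the CID and requestor node ID, then apply whatever policy it likes to the individual values.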
@momack2 Would Protocol Labs be interested in funding this? If so, we could do the work to extend bitswap, and add an external allow(cid, nodeId, auth) call. @csuwildcat Would Microsoft be interested in funding it? |
Thanks for the ping, Ian! Adding @autonome on that question. Note the Cloudflare team recently OSS'd their gateway operations tooling for handling allow/deny lists on their infra, which may also be pushing in similar directions |
Thank you, @momack2 Yep, I've looked at Cloudflare's work, though I think it's mostly orthogonal to this work. |
@momack2 I also took a look at that part of Cloudflare's stack and it's not really the same thing as what this API would be. What we want here is a simple, native IPFS API that runs every incoming CID resolution through an async function that can run user defined logic to determine whether or not it wants to respond with the data that backs a given CID. It's a rather fundamental config/core hook that would enable the creation of an unlimited number of boundaries between logical groups of CIDs in a node and the outside world. For our personal datastore use, it's an absolute must that we have a way to virtually group and filter the decision to service requests for CIDs/data to the wider network. |
@momack2 @autonome Don't worry about this any more. We've finished implementing it - including the bitswap extension to add an auth string and the customisable allow(cid, peerID, auth) function. The bitswap extension happens to be backwards and forwards compatible with existing instances too. Finally, block level privacy on IPFS! This (used correctly) removes the biggest legitimate criticism of private storage applications on IPFS compared to centralised services. |
@ianopolous will you be doing a PR that includes this? |
Amazing to hear, Ian! Adding @aschmahmann to help get this upstreamed!
Definitely still happy to help retroactively reward great work that many
folks benefit from. Do you want to estimate a dev grant amount in an issue
here? https://github.com/ipfs/devgrants
|
@ianopolous thanks for your work here! Would definitely love to have an implementation of Bitswap that works well with auth/token based access beyond just peerIDs. If there's some open source repos I can look at, or generally a proposal for how you'd like to modify the Bitswap wire protocol that'd be great.
I'm a little concerned about what this means. For example, if you're taking advantage of protobuf optional fields for your extensions then compatibility requires agreement on which field numbers are reserved and what they mean. I wouldn't want there to be some collision in a future iteration of Bitswap that makes upgrading painful for users of your fork. If we can figure out an upstream then we won't have to worry too much about that though 😄. |
Hi @momack2 , sorry for the delay - I have a newborn to look after. Thank you for the generous offer! I think it makes sense to fully integrate it into Peergos first to make sure it 100% satisfies our needs before considering the extra work and re-licensing necessary to upstream it. @aschmahmann , My understanding is that there aren't any major changes planned for bitswap on the protobuf level, and that the plan is rather to migrate to graph-sync? If there are changes then we can just agree not to overlap each other's protobuf indices as you say. It's also not the end of the world if we just end up in a fork either. We have also changed some of the core interfaces around getting blocks (so we can use the type system to make some security guarantees) which would be much harder to integrate into go-ipfs than ipfs-nucleus (our super-minimal drop-in ipfs replacement), which took only 3 hours. |
I don't think it's a good model to think of graphsync as "the next gen bitswap" but as a different data transfer protocol focused on moving DAGs around rather than blocks. Both types of protocols have their utility. As a simple (if slightly esoteric) example block based transfer can work without the server side understanding the IPLD codecs of the data, whereas DAG based transfer cannot.
I don't know what "major" means. There are currently no PRs to the specs repo proposing modifications to the Bitswap protocol, but as you've noticed in this issue, and some previous ones, having authentication within the Bitswap protocol is something that gets requested. So having a protocol change here seems like fair game.
I'm less worried about this. The important thing is the protocol change; if plumbing through the existing libraries is hard then that's ok, we're allowed to have multiple implementations of the same protocol 😄. I suspect the spec changes here are pretty easy/straightforward to discuss in a spec PR. A little bit of iteration on this with some folks last week generated some simple options like:
and adding new responses to the |
@aschmahmann Isn't the degenerate case of graph-sync exactly bitswap? The case where the selector is just for a single block? We've gone with the first option, which is to include an optional auth string with each want request (happy to share the protobuf). This is the most powerful and allows us to keep the whole thing totally stateless, so that requests can be authorised or denied with nothing but the block itself. This means we can also maintain the auto-scaling properties of IPFS where anyone who has a block can also serve it up, applying the same auth scheme. We decided not to add any new return types, so as not to leak whether or not we have a block (this clearly needs to be coupled with only providing the roots). There are some subtle vulnerabilities specific to the bitswap architecture that need to be guarded against too. |
We've fully integrated this into Peergos now, and it works great! I've submitted a corresponding spec change (after discussion with @aschmahmann) to what we settled on here: Our auth is 89 bytes, post-quantum and capability-based using S3 V4 signatures. Note that our allow function ended up requiring the block data as a parameter as well, though you could make that optional for other auth schemes. |
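To show why an allow function might need the block data, here is a minimal sketch of a stateless check that uses nothing but the block. It is not the Peergos scheme: it substitutes a plain HMAC-SHA256 for the S3 V4 signatures mentioned above, and the block layout (a 32-byte per-block secret at the start of the block) is an assumption made up for this example.

```python
import hashlib
import hmac

KEY_LEN = 32  # assumed layout: each block begins with a 32-byte per-block secret


def make_auth(secret: bytes, cid: str, peer_id: str) -> str:
    """Requester side: derive an auth string bound to one CID and one requesting peer."""
    message = f"{cid}|{peer_id}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def allow(cid: str, peer_id: str, auth: str, block: bytes) -> bool:
    """Server side: decide using only the request fields and the block itself."""
    if len(block) < KEY_LEN:
        return False
    expected = make_auth(block[:KEY_LEN], cid, peer_id)
    # Constant-time comparison avoids leaking how much of the auth string matched.
    return hmac.compare_digest(expected.encode(), auth.encode())
```

Because the check needs no server-side state, any node holding the block can apply it. Binding the auth string to the requesting peer ID means a string observed in transit cannot simply be replayed by a different peer.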
A minimal implementation of IPIP-383 from #10161 landed in master branch and is scheduled to be released in Kubo 0.24-rc1 for feedback. More details in |
For other people looking for this, it is also implemented in Nabu now (both authed bitswap and the http allow/deny API). |
During the "IPFS / IPLD Security & Encryption Workshop" that came out of ipfs/roadmap#65, it was discussed that many projects would benefit from a generic API that IPFS can call before serving a block over Bitswap. The purpose of this API would be to allow projects to define their own privacy / permission controls within IPFS.
The proposed solution would simply be an option within IPFS that allows the node owner to provide an API to call before each block gets served (possibly also advertised). This API would simply return "true" or "false" depending on whether or not the block should be served. This API could be local or external depending on the node's performance needs.
The API would take in the following as parameters:
It would be up to the node owner to figure out how to return "true" or "false" to this request.
The idea here is that with a generic API, the IPFS dev team doesn't have to dedicate time / resources to determining an IPFS content permission system that works for every possible use case. They can instead let teams decide what works best for them and keep the complexity out of the go-ipfs repo.
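As a sketch of what a node owner's external API could look like, the following uses only the Python standard library. The query parameter names (`cid`, `peer`, `auth`), the `/allow`-style endpoint, and the deny-list policy are all assumptions for illustration; the proposal above does not fix a wire format.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

DENIED_CIDS = {"cid-blocked"}  # placeholder policy data


def decide(cid: str, peer_id: str, auth: str) -> bool:
    """Placeholder policy: serve any named CID that is not on the deny list."""
    return bool(cid) and cid not in DENIED_CIDS


class AllowHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)

        def first(key: str) -> str:
            return query.get(key, [""])[0]

        verdict = decide(first("cid"), first("peer"), first("auth"))
        body = json.dumps(verdict).encode()  # b"true" or b"false"
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet


def make_server(port: int = 0) -> HTTPServer:
    """Bind to localhost only; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), AllowHandler)
```

Running `make_server(8080).serve_forever()` would start the service, and the node would then issue something like `GET /allow?cid=...&peer=...&auth=...` before serving each block. Keeping `decide` separate from the handler makes it easy to swap in a different policy or to call it in-process when an external round trip is too slow.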
Created at the request of @willscott in: ipfs/roadmap#65. cc @aschmahmann
Additional Resources: