Query hash of CIDs to protect content privacy #706

Closed
willscott opened this issue Jan 21, 2021 · 4 comments

Comments

@willscott
Contributor

Currently, queries to find providers are made for a CID directly. This means any participant in the DHT may learn about the request, and it is trivial for an observer to learn which content is being requested by which nodes.

We can do better (referenced in recent security discussions).

  • When a client makes a query for a CID (or, more generally, whenever a CID hits the DHT interface), the CID should be hashed, and queries should be made and answered for the hash of the CID rather than the CID itself.
  • The response should be encrypted using a key of the un-hashed CID.

A node that does not already know a CID will not be able to decrypt the response to a query, so it cannot itself learn who the providers for that CID are, nor directly generate a list of popular CIDs.
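
A minimal sketch of the two steps above, not an implementation from this thread. It assumes the provider encrypts its record before publishing it, since only a party that knows the CID can derive the key. The function names, the `lookup:`/`encrypt:`/`mac:` labels, and the SHAKE-256 keystream with an HMAC tag are all hypothetical stand-ins chosen so the example runs on the Python standard library alone; a real design would use a vetted AEAD cipher.

```python
import hashlib
import hmac
import os

NONCE_LEN = 16
TAG_LEN = 32


def lookup_key(cid: bytes) -> bytes:
    """Identifier sent over the DHT in place of the CID (hypothetical label)."""
    return hashlib.sha256(b"lookup:" + cid).digest()


def _derive(cid: bytes, label: bytes) -> bytes:
    # Separate keys for separate purposes, all derived from the un-hashed CID.
    return hashlib.sha256(label + cid).digest()


def _xor_keystream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Illustrative stream cipher only: XOR the data with SHAKE-256 output.
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt_providers(cid: bytes, providers: bytes) -> bytes:
    """Encrypt a provider record so that only holders of the CID can read it."""
    nonce = os.urandom(NONCE_LEN)
    ciphertext = _xor_keystream(_derive(cid, b"encrypt:"), nonce, providers)
    tag = hmac.new(_derive(cid, b"mac:"), nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def decrypt_providers(cid: bytes, blob: bytes) -> bytes:
    """Verify and decrypt a record; raises ValueError if the CID does not match."""
    nonce = blob[:NONCE_LEN]
    ciphertext = blob[NONCE_LEN:-TAG_LEN]
    tag = blob[-TAG_LEN:]
    expected = hmac.new(_derive(cid, b"mac:"), nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong CID or corrupted record")
    return _xor_keystream(_derive(cid, b"encrypt:"), nonce, ciphertext)
```

A DHT server in this sketch stores and returns the encrypted blob under `lookup_key(cid)` and never sees the CID; a client that knows the CID computes the same lookup key and calls `decrypt_providers` on the response.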

@hsn10

hsn10 commented Aug 3, 2021

I do not think it is a good idea to implement CPU-intensive features now. Large IPFS nodes already have fairly high CPU demands, which limits their scalability. The IPFS use case is to serve as public data storage; there is no need to add anonymization.

@bertrandfalguiere

I guess a compromise would be to make that opt-in, with nodes advertising at protocol-negotiation time whether they support hashed CID serving?
I also guess that nodes choosing to do so can theoretically hash each CID only once and store a table of correspondence between the hash and the CID, limiting CPU use at the expense of a bit of memory. The impact on memory is probably minimal, as a hash is very small compared to the data represented by the CID.
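
A rough sketch of that correspondence table, under the assumption that SHA-256 is the hash and that CIDs are handled as raw bytes. The `HashedCidIndex` class and its method names are hypothetical and only illustrate the "hash once, look up afterwards" idea.

```python
import hashlib
from typing import Dict, Optional


class HashedCidIndex:
    """Keeps hash(CID) <-> CID mappings so each CID is hashed only once."""

    def __init__(self) -> None:
        self._hash_by_cid: Dict[bytes, bytes] = {}
        self._cid_by_hash: Dict[bytes, bytes] = {}

    def add(self, cid: bytes) -> bytes:
        """Return the hash for a CID, computing it only on first sight."""
        hashed = self._hash_by_cid.get(cid)
        if hashed is None:
            hashed = hashlib.sha256(cid).digest()
            self._hash_by_cid[cid] = hashed
            self._cid_by_hash[hashed] = cid
        return hashed

    def resolve(self, hashed: bytes) -> Optional[bytes]:
        """Map an incoming hashed query back to a locally known CID, if any."""
        return self._cid_by_hash.get(hashed)
```

Answering a hashed query then costs one dictionary lookup rather than a hash per stored CID; the memory cost is one 32-byte digest plus the CID bytes per entry.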

@hsn10

hsn10 commented Aug 5, 2021

The response does not need to be encrypted. A node asking the DHT will need the real CID to download the content.

@guillaumemichel
Contributor

Tracked in ipfs/specs#373
