
Handle multiple advertisements on the same context ID #216

Open
gmelodie opened this issue Apr 15, 2022 · 5 comments

@gmelodie

The multihash lister that is set up by calling RegisterMultihashLister should allow new CIDs to be appended to the advertisement chain. As it works right now, it creates a chain of advertisements (each context ID is its own chain), but this chain only allows one advertisement to go through.

@masih
Member

masih commented Apr 15, 2022

Currently, the MultihashLister abstraction only supports immutable contextID->multihash-list mapping. This is documented in the interface here.

The rationale is that the MultihashLister abstraction is used to avoid double-storing advertised multihashes in the index provider engine. Instead, the lister provides a "hook", if you like, to look up which multihashes were associated with a context ID and regenerate the chain of multihash entries on the fly whenever an indexer asks for them.

If the list of multihashes changes, so does the CID link to them, and that would break the hash verification on the receiving end. Hence the requirement that the mapping be immutable.
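To make the immutability requirement concrete, here is a minimal sketch in Go. It does not use the index-provider API: the `lister` type and the `entriesDigest` helper are hypothetical, and a single SHA-256 digest stands in for the CID link to the entries chain. It only illustrates that a link regenerated from a changed list no longer matches the link that was advertised.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// lister is a hypothetical stand-in for the registered callback:
// given a context ID, it returns the multihashes for that ID.
type lister func(contextID string) [][]byte

// entriesDigest is a stand-in for the CID link to the entries chain.
// The real chain is built differently; one SHA-256 over the
// length-prefixed list is enough to show the dependency on content.
func entriesDigest(mhs [][]byte) string {
	h := sha256.New()
	for _, mh := range mhs {
		h.Write([]byte{byte(len(mh))}) // assumes entries shorter than 256 bytes
		h.Write(mh)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	store := map[string][][]byte{
		"ctx-1": {[]byte("mh-a"), []byte("mh-b")},
	}
	var l lister = func(id string) [][]byte { return store[id] }

	// Link recorded when the advertisement was first published.
	advertised := entriesDigest(l("ctx-1"))

	// Regenerating from an unchanged list reproduces the same link.
	fmt.Println("unchanged list matches:", entriesDigest(l("ctx-1")) == advertised)

	// Appending to the list changes the regenerated link, so a
	// receiver verifying against the advertised link would reject it.
	store["ctx-1"] = append(store["ctx-1"], []byte("mh-c"))
	fmt.Println("changed list matches:", entriesDigest(l("ctx-1")) == advertised)
}
```
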

Having said that, the indexing protocol does support appending multihashes to a context ID. In the index-provider library as it stands today, this is achievable via an explicit call to Publish. I do, however, recommend not mixing Publish and NotifyPut calls in the same application.

It would be fantastic to dig a bit deeper into this specific use case to construct the right abstraction to satisfy it.

Cc @willscott

@gammazero
Collaborator

The way to add more multihashes to the same context ID is to create a new advertisement that has the desired context ID, which was used in a previous advertisement, and supply the additional multihashes in the entries chain associated with the new advertisement.

The result in the indexer is the same as if all the multihashes from both advertisements were ingested as part of one advertisement with that context ID.
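As a rough illustration of that equivalence, the sketch below models ingestion with plain Go maps. The `ad` struct and the `ingest` function are hypothetical simplifications, not the indexer's actual data model: they only show that two advertisements sharing a context ID leave the same set of multihashes as one advertisement carrying all of them.

```go
package main

import (
	"fmt"
	"sort"
)

// ad is a simplified, hypothetical advertisement: a context ID plus
// the multihashes carried by its entries chain.
type ad struct {
	ContextID   string
	Multihashes []string
}

// ingest applies advertisements in order and returns, per context ID,
// the set of multihashes the index ends up holding.
func ingest(ads []ad) map[string]map[string]struct{} {
	index := make(map[string]map[string]struct{})
	for _, a := range ads {
		set, ok := index[a.ContextID]
		if !ok {
			set = make(map[string]struct{})
			index[a.ContextID] = set
		}
		for _, mh := range a.Multihashes {
			set[mh] = struct{}{}
		}
	}
	return index
}

// sortedKeys returns the members of a set in sorted order so that
// two sets can be compared or printed deterministically.
func sortedKeys(set map[string]struct{}) []string {
	keys := make([]string, 0, len(set))
	for k := range set {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Two advertisements that reuse the same context ID.
	twoAds := ingest([]ad{
		{ContextID: "ctx-1", Multihashes: []string{"mh-a", "mh-b"}},
		{ContextID: "ctx-1", Multihashes: []string{"mh-c"}},
	})
	// One advertisement carrying all of the multihashes.
	oneAd := ingest([]ad{
		{ContextID: "ctx-1", Multihashes: []string{"mh-a", "mh-b", "mh-c"}},
	})
	fmt.Println(sortedKeys(twoAds["ctx-1"]))
	fmt.Println(sortedKeys(oneAd["ctx-1"]))
}
```
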

@gmelodie
Author

I think the problem with this mechanism is just how practical it is (I don't think it's very practical). What I generally have is the new CIDs with addresses. It'd be nice if the publish method could just do the diff between this and the previous advertisement and do all the change/add work for us

@gammazero
Collaborator

Previously, we had the ability to remove specific multihashes by creating a removal advertisement that had a set of multihashes with it. Such an ad would remove only the specified multihashes and not the entire context ID. We removed this capability since it was not being used and because it constrains ad processing to be done in strictly chronological order.

For your use case, it seems like this would be close to what you are looking for. You would split the delta into two parts, multihashes to remove and multihashes to add, and these would be represented by two separate advertisements. Perhaps it is worth considering bringing back this capability if it solves an important problem.
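The delta split itself is straightforward to compute on the provider side. Here is a minimal sketch, assuming the previous and current multihash lists for a context ID are available as string slices without duplicates; `splitDelta` is a hypothetical helper, not part of index-provider, and it only computes the two sets without publishing anything.

```go
package main

import "fmt"

// splitDelta compares the previous and current multihash lists for one
// context ID. toRemove holds what a removal advertisement would carry,
// toAdd what an addition advertisement would carry.
func splitDelta(prev, curr []string) (toRemove, toAdd []string) {
	prevSet := make(map[string]struct{}, len(prev))
	for _, mh := range prev {
		prevSet[mh] = struct{}{}
	}
	currSet := make(map[string]struct{}, len(curr))
	for _, mh := range curr {
		currSet[mh] = struct{}{}
	}
	// Present before, absent now: remove.
	for _, mh := range prev {
		if _, ok := currSet[mh]; !ok {
			toRemove = append(toRemove, mh)
		}
	}
	// Absent before, present now: add.
	for _, mh := range curr {
		if _, ok := prevSet[mh]; !ok {
			toAdd = append(toAdd, mh)
		}
	}
	return toRemove, toAdd
}

func main() {
	prev := []string{"mh-a", "mh-b", "mh-c"}
	curr := []string{"mh-b", "mh-c", "mh-d"}
	toRemove, toAdd := splitDelta(prev, curr)
	fmt.Println("remove:", toRemove)
	fmt.Println("add:", toAdd)
}
```

Multihashes that appear in both lists end up in neither set, so unchanged content would not need to be re-advertised.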

@ischasny Would such a capability offer any significant advantage for indexing IPFS content?

@ischasny
Contributor

@gammazero I think yes. At the moment, CIDs expire uniformly across all advertisements, as index-provider isn't aware of the IPFS DAGs. So on busy nodes, for every snapshot, a lot of "contextIDs" will have to be first removed and then republished. Being able to selectively remove CIDs would certainly be beneficial.

Hopefully, going forward, IPFS nodes will be able to DFS-sort CID snapshots before publishing them, which should reduce the churn.
