Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ipfs blockservice is incompatible with sessions #34

Open
Jorropo opened this issue Jan 12, 2024 · 19 comments · May be fixed by #35
Open

ipfs blockservice is incompatible with sessions #34

Jorropo opened this issue Jan 12, 2024 · 19 comments · May be fixed by #35

Comments

@Jorropo
Copy link

Jorropo commented Jan 12, 2024

// Blockstore returns the underlying Blockstore.
func (nbs *BlockService) Blockstore() blockstore.Blockstore {
return nbs.bs.Blockstore()
}
// Exchange returns the underlying Exchange.
func (nbs *BlockService) Exchange() exchange.Interface {
return nbs.bs.Exchange()
}

See how the code does not apply any blocking on sessions.
Theses are used by boxo/blockservice.NewSession when a session is used.

This also cause issues with new ContextWithSession feature because it's not possible to unwrap a nopfs blockservice and access the real one underneath (so the context value key becomes the nopfs blockservice which isn't what we want because nopfs doesn't do session redirection based on context).

We should do blocking on the blockstore.Blockstore and exchange.Exchange API arguments before they are passed to the blockservice.

@hsanjuan
Copy link
Collaborator

We should do blocking on the blockstore.Blockstore and exchange.Exchange API arguments before they are passed to the blockservice.

What if it returns a blocking Exchange and blocking Blockstore when those methods are called ?

@Jorropo
Copy link
Author

Jorropo commented Jan 14, 2024

This would be incompatible with this new boxo patch:

ipfs/boxo@22fa8b1#diff-c3167a605aeb6a2f9430f5730ee6508f88e15022a746998ff6ec8ba013d0d6dc

The problem is that in the wrapping code I use the incoming blockservice as context key:
ipfs/boxo@22fa8b1#diff-c3167a605aeb6a2f9430f5730ee6508f88e15022a746998ff6ec8ba013d0d6dcR494-R497

So github.com/ipfs-shipyard/nopfs/ipfs.(*Blockservice) is used when wrapping the context.
But unwrapping is implemented in github.com/ipfs/boxo/blockservice.(*blockservice) which use it's receiver as key into the context. So the two don't match and it does not work.

github.com/ipfs/boxo/blockservice.(*blockservice).GetBlock has this check:

	block, err := blockstore.Get(ctx, c)
	switch {
	case err == nil:
		return block, nil
	case ipld.IsNotFound(err):
		break
	default:
		return nil, err
	}

Which would allows us to do filtering only in the blockstore and have it also filter the exchange thx to:

	default:
		return nil, err

I'll fix that in GetBlocks and then nopfs can implement a wrapping filtering blockstore instead of blockservice while having the same effects.

Other solutions exists but I think it's the least bad one.

@hsanjuan
Copy link
Collaborator

That sounds like relying on an implementation detail that causes exchange-blocking by chance (only because blockstore is checked first?) so I'm not very sold... better to wrap both.

I think conceptually BlockService is the right layer... it feels weird that we have to work around the fact that implementation for a BlockService session bypasses the layer and uses underlying exchange/blockstore directly. Can't we wrap sessions with blocking code directly by making NewSession part of the BlockService interface, and making sure we pass the underlying blockservice to the Blockservice.NewSession so that grabSessionFromContext works?

@Jorropo
Copy link
Author

Jorropo commented Jan 15, 2024

That sounds like relying on an implementation detail that causes exchange-blocking by chance (only because blockstore is checked first?) so I'm not very sold... better to wrap both.

Fair.

Can't we wrap sessions with blocking code directly by making NewSession part of the BlockService interface, and making sure we pass the underlying blockservice to the Blockservice.NewSession so that grabSessionFromContext works?

That would work but that require breaking the interface and I don't know if anyone else has custom blockservice implementation.

Else we could have something like:

func WithContentFilter(b func(cid.Cid) error) Option {/* ... */}

In boxo/blockservice this is already what we are doing with verifcid.

This also means nops only needs to provide 1 small function instead of many plumbing methods.

Jorropo added a commit to ipfs/boxo that referenced this issue Jan 15, 2024
Jorropo added a commit to ipfs/boxo that referenced this issue Jan 15, 2024
@hsanjuan
Copy link
Collaborator

What about a new SessionBlockstore interface that has NewSession ?

@Jorropo
Copy link
Author

Jorropo commented Jan 15, 2024

I think #35 is cleaner.
It replace ~80 slocs with:

func ToBlockServiceBlocker(blocker *nopfs.Blocker) blockservice.Blocker {
	return func(c cid.Cid) error {
		err := blocker.IsCidBlocked(c).ToError()
		if err != nil {
			logger.Warnf("blocked blocks for blockservice: (%s) %s", c, err)
		}
		return err
	}
}

While not adding as much complexity in boxo/blockservice.

What about a new SessionBlockstore interface that has NewSession ?

I don't get how this would work.
Today you can't wrap Session.
So NewSession is called, it checks if nopfs implements SessionBlockstore.
It does.
Then in nopfs.Blockservice.NewSession it call boxo/blockservice.NewSession on the underlying blockservice, at this point we have a true boxo blockservice session.
But you can't wrap this session since session isn't an interface (which might be an issue in itself). So I don't see where the filtering would happen.

@hsanjuan
Copy link
Collaborator

I think #35 is cleaner.

It is, but relies on the fact that a specific blockservice implementation exists that provides content-blocking capabilities.

But you can't wrap this session since session isn't an interface (which might be an issue in itself)

Session should be an interface (or a blockservice itself) and we wrap it... similar to merkledag.NewSession which returns an ipld.NodeGetter?

Sorry for resisting here a bit, just aiming to keep the conceptual layer model based on interfaces so that nopfs stays as purely plug-in (I think is nice that things can be like this). If we have a content-blocking BlockService, we should also think of doing the same in Namesys and Resolver and removing wrapping for general consistency.

@hacdias
Copy link
Member

hacdias commented Jan 15, 2024

@Jorropo how much work would it be to make session an interface?

I did look at (ipfs/boxo#556) and it seems to do what it says it does, but I'd like us to have a consensus here on what to do next.

That would work but that require breaking the interface and I don't know if anyone else has custom blockservice implementation.

Maybe breaking the interface isn't the worst idea if the outcome is better and cleaner. We can always provide with a changelog that explains clearly what needs to be updated in someone else's blockservices. What do you think @Jorropo @hsanjuan?

@Jorropo
Copy link
Author

Jorropo commented Jan 15, 2024

The current situation is pretty bad because it's neither a struct pointer nor an interface, it's both.

Imo we should remove the blockservice interface and put all these kinds of ad-hoc things in boxo/blockservice because we can contain cross ad-hoc feature interactions and easily solve future ones.

I'm also fine making newsession a method of blockservice and make session an interface.

We just need to decide which one of the two way forward.

@hacdias
Copy link
Member

hacdias commented Jan 15, 2024

Imo we should remove the blockservice interface and put all these kinds of ad-hoc things in boxo/blockservice because we can contain cross ad-hoc feature interactions and easily solve future ones.

I need a more concrete example to understand what you mean by that 😅

@Jorropo
Copy link
Author

Jorropo commented Jan 15, 2024

I'll submit both PR then we can choose, aint much work.

@Jorropo
Copy link
Author

Jorropo commented Jan 15, 2024

While implementing the no interface solution I found yet an other bug.
The IsCached call from the gateway bypass the blockservice and thus verifcid and nopfs. It's probably fine since it doesn't send content, just a boolean state saying if we have or not the block, but I think this shows why layerying more and more layers when the scope is very wide and loosely defined leads to bug.

Jorropo added a commit to ipfs/boxo that referenced this issue Jan 15, 2024
…r` option

The idea is to have a THE blockservice object, this means we wont ever have an issue where multiple competitive features play poorly with each other due to how nested blockservices could do.

Let's say we add multi-exchange based on content routing, do we want to have to handle the mess this could create with nested blockservices ?

It implements features for ipfs-shipyard/nopfs#34.
@Jorropo
Copy link
Author

Jorropo commented Jan 15, 2024

Jorropo added a commit to ipfs/boxo that referenced this issue Jan 15, 2024
Jorropo added a commit to ipfs/boxo that referenced this issue Jan 15, 2024
…r` option

The idea is to have a THE blockservice object, this means we wont ever have an issue where multiple competitive features play poorly with each other due to how nested blockservices could do.

Let's say we add multi-exchange based on content routing, do we want to have to handle the mess this could create with nested blockservices ?

It implements features for ipfs-shipyard/nopfs#34.
@hsanjuan
Copy link
Collaborator

Thanks! please give me a couple of days to look into the proposals.

@hsanjuan
Copy link
Collaborator

In principle I'm ok with #563 in that it is aligned with how we normally do things... (granted this might not be the best way of doing things all the time, but seems straightforward at least).

@Jorropo
Copy link
Author

Jorropo commented Jan 15, 2024

It's how we usually do things indeed, however I think it might be too complex for it's own good.
ipfs/boxo#561 is simpler when considering the total system.

The tricky thing is that blockservice.Blockservice is really a session factory when you think about it. And we don't have factories usually.

@hsanjuan
Copy link
Collaborator

What is the motivation for embedding sessions in the context? I may have missed it but I didn't see what prompted that change in the first place.

@Jorropo
Copy link
Author

Jorropo commented Jan 15, 2024

@hsanjuan this is a thing a couple of peoples including me had in mind. ipfs/boxo#94 is the oldest written trace I know about.

We are quite bad at passing around sessions everywhere.
For example boxo/gateway need ~15 sessions to resolve the path from dist.ipfs.tech and download 25MB kubo .zip file, even tho we should only use one.
This leads to extra costs and latency because each new session start broadcasting to every peer from scratch.

This constant session reset prevent us to do other optimizations (like setting the initial provider delay on bitswap to 0s) because this generate DHT load at each reset.

The main reason we are bad at passing sessions outside of context is that people usually have a blockgetter argument in their New... function, they instantiate some children services they need (like fetcher, dagservice, ...) and store that in the struct. Then when handling requests they call the services they need.

However to do that properly you need to do initialization per request, not per application instance, which is fine, it's just way more work to update that in all the data stack of kubo and rainbow.

That also mean we can't use fx for anything between blockservice and final consumers (or we would need an fx instance per request).

Passing the session in the context is cheap, it works, and I don't think it makes the code worst (incentivize to pass everything in the context at the cost of compiletime safety) since the unwrapping only happen on existing blockservice object, so you still need to pass the blockservice which let you analyze and follow the code statically.

Jorropo added a commit to ipfs/boxo that referenced this issue Jan 15, 2024
…r` option

The idea is to have a THE blockservice object, this means we wont ever have an issue where multiple competitive features play poorly with each other due to how nested blockservices could do.

Let's say we add multi-exchange based on content routing, do we want to have to handle the mess this could create with nested blockservices ?

It implements features for ipfs-shipyard/nopfs#34.
@hsanjuan
Copy link
Collaborator

I understand. Honestly, now that mostly everything is in boxo, it might be worth to streamline session handling one day. Passing things in context is a hack so that we don't pass them as explicit arguments after all... But as you say, it is a cheap approach and solves the issue for a low price, compared to the non-trivial re-wiring otherwise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants