Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Extract provider logic from BitSwap #4333

Closed
wants to merge 4 commits into from

Conversation

magik6k
Copy link
Member

@magik6k magik6k commented Oct 21, 2017

<WIP>
Fixes #4311

This is not even remotely close to done Getting there, PR mainly for comments / tests / progress tracking

@ghost ghost assigned magik6k Oct 21, 2017
@ghost ghost added the status/in-progress In progress label Oct 21, 2017
@magik6k magik6k force-pushed the feat/provider-strategies branch 6 times, most recently from 561d784 to d571f6b Compare October 24, 2017 14:41
// If checkFirst is true then first check that a block doesn't
// already exist to avoid republishing the block on the exchange.
checkFirst bool
}

// NewBlockService creates a BlockService with given datastore instance.
func New(bs blockstore.Blockstore, rem exchange.Interface) BlockService {
func New(bs blockstore.Blockstore, rem exchange.Interface, p providers.Interface) BlockService {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The blockservice should not have to know anything about the providers system.

@@ -167,7 +176,7 @@ var bitswapStatCmd = &cmds.Command{
}
buf := new(bytes.Buffer)
fmt.Fprintln(buf, "bitswap status")
fmt.Fprintf(buf, "\tprovides buffer: %d / %d\n", out.ProvideBufLen, bitswap.HasBlockBufferSize)
fmt.Fprintf(buf, "\tprovides buffer: %d / %d\n", out.ProvideBufLen, provider.HasBlockBufferSize)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

future note: we might want to have a separate command for the providers subsystem, and remove this information from here (since its no longer relevant to bitswap itself).

sizeBatchRequestChan = 32
// kMaxPriority is the max priority as defined by the bitswap protocol
kMaxPriority = math.MaxInt32

self = peer.ID("")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eh?

@@ -72,3 +73,19 @@ func (e *offlineExchange) GetBlocks(ctx context.Context, ks []*cid.Cid) (<-chan
func (e *offlineExchange) IsOnline() bool {
return false
}

type offlineProviders struct{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can get away with not even having this at all

@@ -0,0 +1,87 @@
package providers
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should probably not live in the exchange subdirectory, maybe make a new top level directory (ideally, we can move it out of the go-ipfs repo, but that can wait)

@magik6k magik6k force-pushed the feat/provider-strategies branch 2 times, most recently from f57e691 to 9f3ff87 Compare October 25, 2017 18:05
@magik6k magik6k force-pushed the feat/provider-strategies branch 5 times, most recently from 63e7a42 to 52925ad Compare November 22, 2017 00:37
@magik6k
Copy link
Member Author

magik6k commented Nov 22, 2017

This PR get's quite noisy with the providers interface propagation, I'm going to defer the strategies to the next PR while this will focus on extracting provider logic from bitswap.

@magik6k magik6k changed the title [WIP] Provider strategies when adding blocks [WIP] Extract provider logic from BitSwap Nov 22, 2017
path/resolver.go Outdated

ResolveOnce func(ctx context.Context, ds dag.DAGService, nd node.Node, names []string) (*node.Link, []string, error)
ResolveOnce func(ctx context.Context, ds dag.DAGService, prov providers.Interface, nd node.Node, names []string) (*node.Link, []string, error)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/ipfs/go-ipfs/pull/4333/files#diff-cb03f3339782273ac5936e321903f907R16 needs this, though it's probably safe to not pass the interface here and use offline.Providers there.

@@ -174,7 +178,7 @@ func (bsnet *impl) FindProvidersAsync(ctx context.Context, k *cid.Cid, max int)

// Provide provides the key to the network
func (bsnet *impl) Provide(ctx context.Context, k *cid.Cid) error {
return bsnet.routing.Provide(ctx, k, true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this call here actually send out a message?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably does, I'll try to write a test for that

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably want to move away from that now. The equivalent of setting the bsnet.routing.Provide final parameter to 'false'

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, though it may be worth making this a configurable thing.

It would be best to have the strategies apply here, though it would require a lot of refactoring to make it fit into the code base nicely.

@@ -20,7 +20,7 @@ type Stat struct {

func (bs *Bitswap) Stat() (*Stat, error) {
st := new(Stat)
st.ProvideBufLen = len(bs.newBlocks)
st.ProvideBufLen = -1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its probably fine to remove this entirely

@magik6k magik6k force-pushed the feat/provider-strategies branch 2 times, most recently from 630ceb2 to dc9ed4d Compare November 30, 2017 12:59
Copy link
Member

@frrist frrist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than my 2 nits - LGTM

@@ -509,6 +509,13 @@ Available templates:
res.SetError(err, cmdkit.ErrNormal)
return
}

err = n.Providers.Provide(k)
if err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you have done in other files, these error checks should be condensed to a single line. This will keep err in the scope of the if statement.

@@ -110,7 +110,11 @@ func (n *UnixfsNode) AddChild(child *UnixfsNode, db *DagBuilderHelper) error {
return err
}

_, err = db.batch.Add(childnode)
_, err = db.batch.Add(childnode) //TODO: do we need to provide here?
err = db.prov.Provide(childnode.Cid())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor - We should be consistent with how we pass the values returned from Add to Provide. Here we are ignoring the returned cid since it is available on the childnode. Elsewhere we use a separate variable to store the cid when calling Provide. I am not sure which way is more appropriate, what do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use node.Cid(), Add won't be returning cid after refactor in ipfs/go-ipld-format#8 happens, good catch

@magik6k
Copy link
Member Author

magik6k commented Mar 26, 2018

Squashed most of the commits to make it less painful to resolve conflicts.

I think it's ready for a review. 2 things to note:

  • The point of this PR is to extract providing logic from bitswap, it shouldn't change what gets provided
  • ipfs add will do providing after adding stuff, this can take a while and could use a separate progress bar, but it's quite likely to be removed when provider strategies get implemented. I think that this ux can be somewhat improved by looking if we run add on a daemon and do the providing background.

@magik6k magik6k added the need/review Needs a review label Mar 26, 2018
Provide(k *cid.Cid) error
ProvideRecursive(ctx context.Context, n ipld.Node, serv ipld.NodeGetter) error

FindProviders(ctx context.Context, k *cid.Cid) error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this method exists so we can connect to providers and then automatically send them our wantlists? Really, we want to stop doing that. We'd rather send our wantlist to a select set of peers.

  1. If we do keep this, let's call it "ConnectToProviders"
  2. I'd rather not keep it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also does dht querying. I guess what you mean that instead of this method bitswap should use FindProvidersAsync and only send wantlists to those peers?

For now, I'd rename the method, and remove it as a part of other PR

}

// Interface defines providers interface to libp2p routing system
type Interface interface {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interface?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its a pretty common go thing. ends up being used as providers.Interface. see: https://golang.org/pkg/sort/#Interface

// Interface defines providers interface to libp2p routing system
type Interface interface {
Provide(k *cid.Cid) error
ProvideRecursive(ctx context.Context, n ipld.Node, serv ipld.NodeGetter) error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should document if these are async or synchronous.

Ctx context.Context
}

// Interface defines providers interface to libp2p routing system
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you expand on this?

return nil
}

func (p *providers) provideRecursive(ctx context.Context, n ipld.Node, serv ipld.NodeGetter, done *cid.Set) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll have to be careful not to pass an online dag service in here.

@Stebalien
Copy link
Member

What is the end goal here? Is the end goal to have the provider service remember which blocks we should be providing? That is, are we planning on supporting options like ipfs add --provide=[root|files|...]?

As implemented, this patch forces the programmer to remember to call the provider system every time they add a block they want to provide to the network. Unless we're going to allow the programmer to make provider decisions at that point, this feels like a step backward. If this is the case, I'd rather add some form of EventedBlockstore interface and have a provider service subscribe to events from this service.

On the other hand, providing a way for users to specify how/when they want to provide blocks is useful and could further be used for access control.

@whyrusleeping
Copy link
Member

@Stebalien

That is, are we planning on supporting options like ipfs add --provide=[root|files|...]?

Yes, we're planning on adding that.

Here's an excerpt from another doc i'm writing:

Content Routing, (most importantly the providing side of things) should not be inside bitswap, and bitswap should not ever block on some external service (like the DHT) being slow. This is currently a problem as for every block bitswap receives, it pushes that block to the provider service. The provider service is slow, and therefore must either buffer infinitely, or apply backpressure. That backpressure is significantly slowing down bitswap at the moment. There is already a PR open by @magik6k that separates these pieces out, but does not change the behavior. The next step after that PR is merged is to change the way we reason about providing. Every block added through the blockservice should not cause a separate providing event. Higher layers should inform the provider subsystem that they wish to provide a given graph, in a certain way. Then it is up to the provider subsystem to traverse the graph as needed and send out the appropriate provider messages. This way, the provider subsystem does not get one independent call for every node in the graph (requiring O(n) memory to buffer and track), it gets a single call for the root of the graph. This will help us avoid needing to apply backpressure on bitswap to avoid using too much memory there.

@whyrusleeping
Copy link
Member

In general, having 'write data to the local store' also mean 'broadcast the fact that we have that piece of data' is pretty bad, both in terms of privacy and performance. 'forcing' the programmer to remember to call Provide is the right thing to do IMO.

@Stebalien
Copy link
Member

In general, having 'write data to the local store' also mean 'broadcast the fact that we have that piece of data' is pretty bad, both in terms of privacy and performance. 'forcing' the programmer to remember to call Provide is the right thing to do IMO.

As long as we bring the reprovider in line with this. I just don't want to end up in an inconsistent situation where users don't explicitly provide but then we end up providing in the background anyways.

@whyrusleeping
Copy link
Member

Yeah, the reprovider gets folded into the 'provider' as part of this

License: MIT
Signed-off-by: Łukasz Magiera <magik6k@gmail.com>
License: MIT
Signed-off-by: Łukasz Magiera <magik6k@gmail.com>
License: MIT
Signed-off-by: Łukasz Magiera <magik6k@gmail.com>
License: MIT
Signed-off-by: Łukasz Magiera <magik6k@gmail.com>
@@ -560,6 +565,11 @@ func (i *gatewayHandler) deleteHandler(w http.ResponseWriter, r *http.Request) {
return
}

if err := i.node.Providers.Provide(c); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@magik6k I have a question: why c here? It seems like this will try to provide the same Cid for each iteration of the loop. Is that intentional? If so, I'm curious why. I'm using this PR to inform a similar change I'm making and am caught up on this detail. Thanks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is wrong. This should be either newnode.Cid() or more likely not at all here (Provide below should handle this, not 100% sure)

@Stebalien
Copy link
Member

@magik6k given the work in #5774, should we close this?

@magik6k magik6k closed this Dec 18, 2018
@ghost ghost removed the status/in-progress In progress label Dec 18, 2018
@magik6k
Copy link
Member Author

magik6k commented Dec 18, 2018

Yep, don't delete the branch yet though

@hacdias hacdias deleted the feat/provider-strategies branch May 9, 2023 10:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
need/review Needs a review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

respect provider strategies on initial adds
5 participants