Added subscription specific configurations to PubSub #171

Closed
wants to merge 3 commits into from

Conversation

aschmahmann
Contributor

There's still work to be done here, both in documentation and some design work. It'd be great to get some feedback on the direction this is going though.


Overall the additions are:

  • Added the ability to have subscription based options
  • Added some configurable parameters to gossipsub
  • Added a gossipsub last-writer-wins-style persistence configuration

Added simple configurability to gossipsub
Added a gossipsub last-writer-wins persistence configuration
@raulk
Member

raulk commented Apr 8, 2019

Hey @aschmahmann, thanks for the contribution! 🎉 Could you link to an issue where the need and rationale behind this was discussed? Thanks!

Collaborator

@vyzo vyzo left a comment

Some very passing comments from first reading.
What's the rationale for these changes?

gossipsub.go Outdated
for _, topic := range msg.GetTopicIDs() {
config, ok := gs.topicConfigs[topic]
Collaborator

that's just backwards; can't have a config object handling the most important function of the router!

Contributor Author

Isn't that the whole premise of every inversion of control system? We could create new routers and pubsub instances for every configuration option instead of using one pubsub instance.

However, that means

  1. having multiple heartbeat tickers running at the same time
  2. not reusing any of the gossipsub structures across implementations
  3. ending up with weird multi-tenancy issues unless we change PubSub more. For example, if you currently create a FloodSub and a GossipSub instance on the same node, you end up with excessive message passing because GossipSub is backwards compatible with FloodSub and will handle both sets of messages. Either one router needs to handle multiple configurations, or PubSub needs to be able to handle backwards-compatible protocols; the approach in this PR (one router, multiple configurations) involves less protocol work.

gossipsubimpl.go Outdated
cancelCh chan<- *Subscription
err error
topic string
topicProtocol protocol.ID
Collaborator

I don't think this belongs here.

Contributor Author

I disagree. PubSub declares type SubOpt func(sub *Subscription) error, which is currently almost unusable because Subscription has no fields to work with.

I assume the pattern used for both Option (for PubSub) and SubOpt (for Subscriptions) is designed to allow adding fields to the struct, and functionality as option wrappers, without breaking the struct contract/interface. Is that incorrect? If so, how is SubOpt supposed to be used, and where would you put the relevant topic/Subscription information?
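
For illustration, here is a minimal sketch of how a SubOpt could populate a field such as topicProtocol. The SubOpt shape is taken from the declaration quoted above; ProtocolID, the trimmed-down Subscription, the Subscribe helper, and the example protocol string are stand-ins invented for this sketch and are not the library's actual API.

```go
package main

import (
	"errors"
	"fmt"
)

// ProtocolID stands in for libp2p's protocol.ID so the sketch is self-contained.
type ProtocolID string

// Subscription is a trimmed-down stand-in; topicProtocol mirrors the field
// proposed in this PR.
type Subscription struct {
	topic         string
	topicProtocol ProtocolID
}

// SubOpt has the same shape as the declaration quoted in the comment above.
type SubOpt func(sub *Subscription) error

// WithProtocol is a hypothetical option that records which protocol
// (and therefore which configuration) the topic should use.
func WithProtocol(proto ProtocolID) SubOpt {
	return func(sub *Subscription) error {
		if proto == "" {
			return errors.New("protocol ID must not be empty")
		}
		sub.topicProtocol = proto
		return nil
	}
}

// Subscribe applies each option in order and stops at the first error.
func Subscribe(topic string, opts ...SubOpt) (*Subscription, error) {
	sub := &Subscription{topic: topic}
	for _, opt := range opts {
		if err := opt(sub); err != nil {
			return nil, err
		}
	}
	return sub, nil
}

func main() {
	sub, err := Subscribe("ipns-records", WithProtocol("/example/lww/1.0.0"))
	if err != nil {
		fmt.Println("subscribe failed:", err)
		return
	}
	fmt.Println(sub.topic, sub.topicProtocol)
}
```

The point of the sketch is only that an option function needs some field on Subscription to write to; without one, a SubOpt has nothing to configure.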

Contributor Author

@aschmahmann aschmahmann left a comment

The main rationale for having a persistent pubsub is at #42. I also briefly give some reasons for needing this at #161.

gossipsub.go Outdated
for p := range tmap {
if gs.peers[p] == FloodSubID {
tosend[p] = struct{}{}
if withFloodSubPeers{
Contributor Author

The various configurations that I've implemented so far could likely all be done in floodsub, episub, etc. Two caveats are 1) we'd need to allow the layer above the router implementations to have more access to the internals (e.g. which peers I'm syncing/sending messages to for a given topic, some control over limiting or manually triggering a broadcast) 2) The existing less performant implementations (e.g. floodsub) would do extremely poorly for some of these configuration options (e.g. persistence) anyway.

Additionally, in future implementations of say episub the configs used by gossipsub should be reusable.

Contributor Author

Given that we're ok breaking the router implementations/interfaces a bit, do you think I should try to set up some of the configurations (e.g. persistence) to work across all router implementations, or is leaving the configurations as something to be plugged into future routers (e.g. episub) sufficient?

Collaborator

that's fine, although I don't know how useful it will be for floodsub.

gossipconfig.go Outdated
Put(msg *pb.Message)
}

type ClassicGossipSubConfiguration struct {
Contributor Author

Up for a better name if you've got one

@vyzo
Collaborator

vyzo commented Apr 8, 2019

Just a note: you are changing very carefully thought-out code, don't go doing stuff willy-nilly!

@aschmahmann
Contributor Author

> Just a note: you are changing very carefully thought-out code, don't go doing stuff willy-nilly!

Nothing here has been done "willy-nilly". Everything I've done so far has been targeted at making sure "user space" around PubSub and gossipsub has been minimally affected (which is true for anyone using the GossipSub, FloodSub, ... constructors), while at the same time allowing for changes that are needed by consumers of PubSub (e.g. IPNS).

I see from some of your comments that we're actually ok with some breaking changes within the PubSub system which should certainly help clean up the code.

I respect that a lot of thought went into the PubSub system, and that is part of why I'd like to enable users to tinker with it from user space as much as possible so they don't end up trying to copy-paste or fork pubsub. I think it'd be great to have some document discussing how we'd like to enable people to extend PubSub, and also to consider whether the configuration options I'm putting in are sufficient in general or only sufficient for the cases that I'm currently considering.

@raulk
Member

raulk commented Apr 9, 2019

@aschmahmann could you open an RFC discussion in libp2p/specs? Please outline context, motivation, rationale, goals, etc. Then link to this PR as a sample implementation. Inferring all that from various issues is less than ideal and is subject to bias and interpretation. Also, not all libp2p implementers watch all go repos, so we elevate protocol discussions to the specs repo. libp2p is a specs-driven project so any changes introduced here would have to be preambled by changes in specs.

@aschmahmann
Contributor Author

@raulk There are 3 potential specs I could add, and would like some advice on which ones to go with. There's: 1) PubSub patch 2) GossipSub patch 3) Something about Last-Writer-Wins persistence

  1. There doesn't appear to be anywhere in the PubSub spec to add anything interesting from my work, although I'm up for any suggestions. I haven't touched any wire protocols and there's no mention in the spec on the Join function or on any of the options in the go implementation (e.g. Blacklist).
  2. I can patch the GossipSub spec to allow for Joining a topic with a particular protocol ID and forwarding certain functions to the particular configurations. However, I need confirmation from @vyzo that the inversion of control mechanism I'd like to use is ok. While IoC is an implementation detail it affects whether the change occurs at the PubSub or GossipSub layers. Given that there's nowhere in the PubSub spec for this to go it seems to imply that either the PubSub spec is missing info or the GossipSub spec has too much implementation specific info.
  3. Happy to write this up, although I need to do a little thinking first: since @vyzo suggested it was possible for me to break the PubSubRouter interfaces, it might be useful to implement LWW at the PubSub level instead of the GossipSub level. Note that if it's added at the PubSub level it's unclear where to put the spec changes, as described above.

@raulk
Member

raulk commented Apr 9, 2019

@aschmahmann I would start with an issue in libp2p/specs theorising context (what is the current situation), description of change (what you want to change), motivation/rationale (why you want to change it), proposed design (how you want to change it), proposed next steps.

Then the community can gauge if this is something of interest across implementations, and can advise how to best integrate it.

@aschmahmann
Contributor Author

@raulk @vyzo, I published a high level overview of why and what I want to change at the issue above (#175). As you'll notice there's limited spec change proposals (although if my LWW implementation is going to be part of libp2p it may want to get its own spec proposal at some point).

…d be re-emitted to (i.e. not initial message propagation, but rebroadcasting messages)

Moved configuration changes to gossipsub into relevant interfaces instead of Options (makes code much cleaner at the expense of breaking the interfaces)
@aschmahmann
Contributor Author

@vyzo while PubSub's Subscribe function takes in Options (and as a result can be set up for configuration), the Publish function does not. I was thinking of adding a PublishWithOptions function to accommodate this without breaking anyone who is using a default Publish currently. What do you think?

@vyzo
Collaborator

vyzo commented Apr 23, 2019

what options do you want to pass to Publish?

@aschmahmann
Contributor Author

For now likely just WithProtocol(proto protocol.ID) as I am with Subscribe

@vyzo
Collaborator

vyzo commented Apr 23, 2019

that won't work, the protocol ID is a property of the stream.

@aschmahmann
Contributor Author

Yes, it is a property of the stream. On master, the way you currently set up a routing system with a different configuration (e.g. adding persistence) is by creating a new PubSubRouter with a new protocol ID.

The implementation here is similar except that instead of having one PubSub instance per router I am effectively putting multiple routers into a single PubSub instance and associating each topic with a particular router. If you look at what I'm doing with Subscribe and Join it's effectively this. However, because you can Publish to a topic without first subscribing I should be able to set the router that a topic is using from the Publish (or PublishWithOptions) function as well.
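
To make the idea concrete, here is a rough sketch of a single router that keeps several strategies and binds each topic to one of them on first use, whether that first use is a Subscribe or a PublishWithOptions. Every name here (Router, Strategy, bind, the protocol strings) is hypothetical and chosen for the sketch; it does not reproduce the code in this PR.

```go
package main

import "fmt"

// ProtocolID stands in for libp2p's protocol.ID.
type ProtocolID string

// Strategy is a hypothetical per-topic behaviour hook.
type Strategy interface {
	Name() string
}

type classic struct{}

func (classic) Name() string { return "classic" }

type lww struct{}

func (lww) Name() string { return "last-writer-wins" }

// Router holds every registered strategy and remembers which protocol each
// topic has been bound to.
type Router struct {
	strategies map[ProtocolID]Strategy
	topics     map[string]ProtocolID
	fallback   ProtocolID
}

func NewRouter(fallback ProtocolID, strategies map[ProtocolID]Strategy) *Router {
	return &Router{strategies: strategies, topics: map[string]ProtocolID{}, fallback: fallback}
}

// bind associates a topic with a protocol the first time the topic is seen.
// An empty proto means "whatever the topic already uses, else the fallback".
// Rebinding a topic to a different protocol is rejected.
func (r *Router) bind(topic string, proto ProtocolID) (Strategy, error) {
	bound, seen := r.topics[topic]
	switch {
	case proto == "" && seen:
		proto = bound
	case proto == "":
		proto = r.fallback
	case seen && bound != proto:
		return nil, fmt.Errorf("topic %q is already bound to %q", topic, bound)
	}
	s, ok := r.strategies[proto]
	if !ok {
		return nil, fmt.Errorf("unknown protocol %q", proto)
	}
	r.topics[topic] = proto
	return s, nil
}

// Subscribe and PublishWithOptions both go through bind, so whichever call
// happens first decides the topic's strategy.
func (r *Router) Subscribe(topic string, proto ProtocolID) (Strategy, error) {
	return r.bind(topic, proto)
}

func (r *Router) PublishWithOptions(topic string, data []byte, proto ProtocolID) (Strategy, error) {
	return r.bind(topic, proto)
}

func main() {
	r := NewRouter("/classic", map[ProtocolID]Strategy{"/classic": classic{}, "/lww": lww{}})
	s, _ := r.PublishWithOptions("T", []byte("hello"), "/lww")
	fmt.Println("publish bound T to:", s.Name())
	s, _ = r.Subscribe("T", "")
	fmt.Println("later subscribe reuses:", s.Name())
}
```

Rejecting a rebind is one possible policy; the thread does not settle what should happen when Publish and Subscribe disagree about a topic's protocol.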

gossipsub.go Outdated
if !ok || ok {
gs.pushGossip(p, &pb.ControlIHave{TopicID: &topic, MessageIDs: mids})
}
emitPeers := config.GetEmitPeers(gs.wrapGetPeers(topic), peers)
Contributor Author

@vyzo adding an EmittingStrategy to the configuration was one way to allow developers to choose which peers messages will get rebroadcast to.
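
As a rough illustration of what a pluggable emitting strategy could look like, the sketch below defines a hypothetical EmitStrategy interface with two implementations. The interface shape, the type names, and the use of plain strings for peer IDs are assumptions made for the sketch; the PR's actual GetEmitPeers takes different arguments.

```go
package main

import (
	"fmt"
	"sort"
)

// EmitStrategy is a hypothetical interface: given the peers known for a topic,
// return the ones that should receive rebroadcast gossip.
type EmitStrategy interface {
	GetEmitPeers(candidates []string) []string
}

// EmitToAll rebroadcasts to every candidate.
type EmitToAll struct{}

func (EmitToAll) GetEmitPeers(candidates []string) []string {
	// Copy so callers can modify the result without touching the input.
	return append([]string(nil), candidates...)
}

// EmitToAtMost rebroadcasts to at most N candidates. Real gossip selection is
// typically randomized; sorting here only keeps the sketch deterministic.
type EmitToAtMost struct{ N int }

func (e EmitToAtMost) GetEmitPeers(candidates []string) []string {
	if e.N < 0 {
		return nil
	}
	picked := append([]string(nil), candidates...)
	sort.Strings(picked)
	if len(picked) > e.N {
		picked = picked[:e.N]
	}
	return picked
}

func main() {
	peers := []string{"B3", "B1", "B2"}
	for _, s := range []EmitStrategy{EmitToAll{}, EmitToAtMost{N: 2}} {
		fmt.Println(s.GetEmitPeers(peers))
	}
}
```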

@vyzo
Collaborator

vyzo commented Apr 24, 2019

> I am effectively putting multiple routers into a single PubSub instance and associating each topic with a particular router.

You are killing gossip piggybacking with this most likely.

@vyzo
Collaborator

vyzo commented Apr 24, 2019

Also, can we call Config something else? It's not an apt name when it encompasses all the fundamental actions of the protocol. Perhaps call it Strategy?

@aschmahmann
Contributor Author

> You are killing gossip piggybacking with this most likely.

I actually think I'm doing the opposite. If we had to make a new PubSub instance for every configuration, we'd be stuck with multiple heartbeats, each sending out its own piggybacked data. That means that if A and B were connected by two routers, we'd end up with two messages instead of one message with the data from both piggybacked onto it.
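
A small sketch of the counting argument: with one shared heartbeat, gossip for several topics bound for the same peer fits in one envelope, while one router instance per configuration sends one envelope per topic. The RPC and Gossip types below are simplified stand-ins invented for this sketch, and it assumes one pending entry per (peer, topic) pair.

```go
package main

import "fmt"

// RPC is a simplified envelope that can carry IHAVE message IDs for several topics.
type RPC struct {
	To    string
	IHave map[string][]string // topic -> message IDs
}

// Gossip is one pending announcement for a peer on a topic.
type Gossip struct {
	Peer   string
	Topic  string
	MsgIDs []string
}

// sharedHeartbeat batches pending gossip per peer, so each peer gets at most
// one RPC per tick no matter how many topics it shares with us.
func sharedHeartbeat(pending []Gossip) []RPC {
	byPeer := map[string]*RPC{}
	var order []string // preserves first-seen peer order
	for _, g := range pending {
		rpc, ok := byPeer[g.Peer]
		if !ok {
			rpc = &RPC{To: g.Peer, IHave: map[string][]string{}}
			byPeer[g.Peer] = rpc
			order = append(order, g.Peer)
		}
		rpc.IHave[g.Topic] = append(rpc.IHave[g.Topic], g.MsgIDs...)
	}
	out := make([]RPC, 0, len(order))
	for _, p := range order {
		out = append(out, *byPeer[p])
	}
	return out
}

// separateHeartbeats models one router instance per topic configuration, each
// with its own ticker: every pending entry becomes its own RPC.
func separateHeartbeats(pending []Gossip) []RPC {
	out := make([]RPC, 0, len(pending))
	for _, g := range pending {
		out = append(out, RPC{To: g.Peer, IHave: map[string][]string{g.Topic: g.MsgIDs}})
	}
	return out
}

func main() {
	pending := []Gossip{
		{Peer: "B", Topic: "T1", MsgIDs: []string{"m1"}},
		{Peer: "B", Topic: "T2", MsgIDs: []string{"m2"}},
	}
	fmt.Println("shared heartbeat RPCs:", len(sharedHeartbeat(pending)))
	fmt.Println("separate heartbeat RPCs:", len(separateHeartbeats(pending)))
}
```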

@aschmahmann
Contributor Author

aschmahmann commented Apr 24, 2019

> Also, can we call Config something else? It's not an apt name when it encompasses all the fundamental actions of the protocol. Perhaps call it Strategy?

I don't think it encompasses all the fundamental actions of the protocol (I'd argue that graft + prune is the main component of gossipsub). However, as I don't have strong feelings about the naming here, I'm fine with Strategy or something more specific like XStrategy where X is a little more informative (especially since I've introduced an EmitterStrategy already).

}

func (gs *GossipSubRouter) AddGossipPeers(tosend map[peer.ID]struct{}, topics []string, withFloodSubPeers bool) {
for _, topic := range topics {
// any peers in the topic?
tmap, ok := gs.p.topics[topic]
if !ok {
continue
Contributor Author

@aschmahmann aschmahmann Apr 24, 2019

@vyzo is it intentional that during a Publish we do not initialize gs.fanout[topic] if there are no peers in the topic (and we're not subscribed/there's no gs.mesh[topic])? By not doing this we ensure that we will not rebroadcast (via emitGossip) our published message even once we come across new peers. This is less relevant in the current implementation since the republishing window is relatively short (5 heartbeat intervals which is 500ms), however it might still be worth correcting this even for our existing implementation.

Collaborator

They won't be added to fanout unless there is a new message published.
Also, the heartbeat is 1 second, so 5 intervals are 5s.

Contributor Author

@aschmahmann aschmahmann Apr 24, 2019

My bad on the interval timing, was looking at heartbeat initial delay by accident. Ok, so the following sequence of events in the current implementation is intentional?

Scenario 1:

  1. A publishes M to GossipSub topic T, but has no peers
  2. B subscribes to T
  3. B connects to A
  4. Nothing happens, B never receives M (even if within the cache history period)

Scenario 2:

  1. A publishes M to GossipSub topic T, but has no peers
  2. B1...B20 subscribe to T (important that it's more than GossipSubD=6, as mentioned above)
  3. B1...B20 connect to A
  4. Some of the B's receive M and some do not

Scenario 3:

  1. A publishes M to GossipSub topic T, but has no peers
  2. Wait 6 seconds
  3. B1...B20 subscribe to T
  4. B1...B20 connect to A
  5. Nothing happens, none of the B's receive M
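
The timing in Scenario 3 can be pictured with a sliding history window. Below is a minimal sketch, assuming a cache that keeps 5 heartbeat windows and shifts once per heartbeat (the "5 intervals" figure mentioned above); MessageCache and its methods are written for this illustration and are not the library's actual cache.

```go
package main

import "fmt"

// MessageCache keeps message IDs in per-heartbeat windows; window 0 is the newest.
type MessageCache struct {
	windows [][]string
}

// NewMessageCache creates a cache that remembers n heartbeat windows (at least 1).
func NewMessageCache(n int) *MessageCache {
	if n < 1 {
		n = 1
	}
	return &MessageCache{windows: make([][]string, n)}
}

// Put records a message ID in the newest window.
func (mc *MessageCache) Put(id string) {
	mc.windows[0] = append(mc.windows[0], id)
}

// Shift is called once per heartbeat: every window ages by one slot and the
// oldest window falls out of the cache.
func (mc *MessageCache) Shift() {
	copy(mc.windows[1:], mc.windows[:len(mc.windows)-1])
	mc.windows[0] = nil
}

// Has reports whether the ID is still in any window.
func (mc *MessageCache) Has(id string) bool {
	for _, w := range mc.windows {
		for _, cached := range w {
			if cached == id {
				return true
			}
		}
	}
	return false
}

func main() {
	mc := NewMessageCache(5)
	mc.Put("M")
	for beat := 1; beat <= 6; beat++ {
		mc.Shift()
		fmt.Printf("after heartbeat %d: cached=%v\n", beat, mc.Has("M"))
	}
}
```

Under these assumptions M is still cached after four heartbeats and gone after the fifth, so peers that connect after a 6 second wait have nothing left to be told about.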

Collaborator

Yes, that's fine.

Contributor Author

@aschmahmann aschmahmann Apr 24, 2019

That pseudo-use of a history buffer seems pretty suspect to me, but that's fine; I can preserve that behavior and just modify the persistence version.

To clarify, I don't really see what's gained by having B not receive M in scenario 1.

Collaborator

It's not designed to be a reliability mechanism; it's designed to help the network connect when the mesh is not fully connected for one reason or the other.

Renamed configurations to strategies
Added OnGraft function to GossipStrategies
LWW strategy now requests the latest topic data during OnGraft
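
For readers unfamiliar with the term, a last-writer-wins store can be sketched as below. This is only an illustration: ordering by a numeric Seqno, the Message fields, and the Latest helper are assumptions made for the sketch, and the PR's own strategy (including what its OnGraft hook exchanges) may differ.

```go
package main

import "fmt"

// Message is a simplified stand-in for a pubsub message.
type Message struct {
	Topic string
	Seqno uint64 // assumed ordering key: higher means newer
	Data  []byte
}

// LWWStore keeps, per topic, only the message with the highest sequence number.
type LWWStore struct {
	latest map[string]Message
}

func NewLWWStore() *LWWStore {
	return &LWWStore{latest: map[string]Message{}}
}

// Put stores msg if it is newer than what is held for its topic and reports
// whether it replaced the stored value. Ties keep the existing message.
func (s *LWWStore) Put(msg Message) bool {
	cur, ok := s.latest[msg.Topic]
	if ok && msg.Seqno <= cur.Seqno {
		return false
	}
	s.latest[msg.Topic] = msg
	return true
}

// Latest returns the newest known message for a topic. A graft hook could use
// it to decide what to offer to, or request from, a newly grafted peer.
func (s *LWWStore) Latest(topic string) (Message, bool) {
	m, ok := s.latest[topic]
	return m, ok
}

func main() {
	store := NewLWWStore()
	store.Put(Message{Topic: "T", Seqno: 1, Data: []byte("old")})
	store.Put(Message{Topic: "T", Seqno: 3, Data: []byte("new")})
	store.Put(Message{Topic: "T", Seqno: 2, Data: []byte("late arrival")})
	if m, ok := store.Latest("T"); ok {
		fmt.Println(m.Seqno, string(m.Data))
	}
}
```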
@vyzo
Collaborator

vyzo commented Mar 23, 2020

Closing as this is stale.

@vyzo vyzo closed this Mar 23, 2020
@libp2p libp2p deleted a comment from bitcard Feb 28, 2021