
dht mode toggling (modulo dynamic switching) #350

Merged: 10 commits, Jun 26, 2019
Conversation

@whyrusleeping (Contributor):

closes #349

First stab at this, trying to make it as minimally invasive as possible.

Please suggest tests to write (or please help me test).

I think we can avoid the extra message type I mentioned in the issue and rely on protocol negotiation to get peers to drop us from their routing tables. Old peers will be a little messed up by this, but they already have such a hard time dialing and connecting anyway that I don't think anyone will notice a difference.

@vyzo (Contributor) left a comment:

this looks good for the most part.

dht.go (resolved review thread)
@raulk (Member) left a comment:

Shaping up well.

dht.go (outdated diff):
@@ -41,6 +41,11 @@ var logger = logging.Logger("dht")
// collect members of the routing table.
const NumBootstrapQueries = 5

const (
    ModeServer = 1
Member:

Can use a custom type definition and iota here.
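
For example, a minimal sketch (names illustrative; iota + 1 keeps ModeServer at 1 as in the diff above):

    // mode captures whether the DHT is acting as a server or a client.
    type mode int

    const (
        ModeServer mode = iota + 1
        ModeClient
    )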

dht.go (resolved review thread)

// hacky... also closes both inbound and outbound streams
for _, c := range dht.host.Network().Conns() {
    for _, s := range c.GetStreams() {
Member:

Re: backwards-compatibility. This is where the tricky part comes in. This drops streams for peers in our routing table. It will not invite the peer to drop us from theirs. So they’ll keep querying us, and we’ll “na” all their negotiations. We’ll basically become an unhelpful peer taking up a slot in their table, unless we disconnect to trigger the removal.

On a related note, there seems to be a race between identify and the DHT notifee. Even if we disconnect and reconnect, if the DHT notifee runs before identify finishes, we might be deciding on stale protocols: https://github.com/libp2p/go-libp2p-kad-dht/blob/master/notif.go#L27

@whyrusleeping (Contributor, Author):

Yeah... the backwards-compatibility bit kinda sucks. I don't know that a nice backwards-compatible solution exists, aside from hard-disconnecting from those peers.

Member:

I'd much rather track streams manually.
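
One way that could look (a sketch, assuming we register inbound DHT streams as they open instead of walking every connection; this is not what the PR currently does):

    import (
        "sync"

        "github.com/libp2p/go-libp2p-core/network"
    )

    // streamTracker remembers inbound DHT streams so a mode switch can
    // reset exactly those, leaving unrelated streams untouched.
    type streamTracker struct {
        mu      sync.Mutex
        streams map[network.Stream]struct{}
    }

    func newStreamTracker() *streamTracker {
        return &streamTracker{streams: make(map[network.Stream]struct{})}
    }

    func (t *streamTracker) add(s network.Stream) {
        t.mu.Lock()
        t.streams[s] = struct{}{}
        t.mu.Unlock()
    }

    func (t *streamTracker) resetAll() {
        t.mu.Lock()
        defer t.mu.Unlock()
        for s := range t.streams {
            _ = s.Reset() // abrupt reset; interested peers will reopen
            delete(t.streams, s)
        }
    }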

dht.go (resolved review thread)
@bigs previously approved these changes (Jun 13, 2019)
dht.go (resolved review thread)
@whyrusleeping (Contributor, Author):

Alright, I'm only closing inbound DHT streams now, and I'm resetting them instead of just closing them. Anyone have ideas for good tests to write to exercise this?

@bigs (Contributor) commented Jun 14, 2019:

@whyrusleeping maybe just use mocknet to make two hosts and check the behavior of IdentifyService before and after the toggle for the most basic tests.
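
Something along these lines, perhaps (a sketch: dht.New is the real constructor, but the sleep-based identify wait and the assertions are placeholders for whatever this PR settles on):

    import (
        "context"
        "testing"
        "time"

        dht "github.com/libp2p/go-libp2p-kad-dht"
        mocknet "github.com/libp2p/go-libp2p/p2p/net/mock"
    )

    func TestDHTProtocolVisibility(t *testing.T) {
        ctx, cancel := context.WithCancel(context.Background())
        defer cancel()

        mn := mocknet.New(ctx)
        hA, err := mn.GenPeer()
        if err != nil {
            t.Fatal(err)
        }
        hB, err := mn.GenPeer()
        if err != nil {
            t.Fatal(err)
        }

        // hA runs the DHT; hB is a plain host observing via identify.
        if _, err := dht.New(ctx, hA); err != nil {
            t.Fatal(err)
        }

        if err := mn.LinkAll(); err != nil {
            t.Fatal(err)
        }
        if err := mn.ConnectAllButSelf(); err != nil {
            t.Fatal(err)
        }

        // Give identify a moment to exchange protocol lists.
        time.Sleep(100 * time.Millisecond)

        protos, err := hB.Peerstore().GetProtocols(hA.ID())
        if err != nil {
            t.Fatal(err)
        }
        t.Log(protos) // expect the DHT protocol while hA is serving;
        // after toggling hA to client mode (plus an identify push),
        // it should disappear from this list.
    }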

@raulk (Member) commented Jun 17, 2019:

@bigs feel like taking a stab at writing those tests?


@whyrusleeping:

they already have such a hard time dialing and connecting anyways

With the relay infrastructure, the network is, in theory, very well connected.


We need to implement the logic that's going to dynamically switch from one to another. Without it, this PR is equivalent to the existing DHTClientOption option.

@whyrusleeping (Contributor, Author):
@raulk exactly. The point of this is just giving some other code the ability to switch between the two.

The next step is to start all IPFS nodes in client mode, and only switch to server mode once AutoNAT detects that the node is publicly dialable.
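
For instance (a hypothetical controller loop in go-ipfs; autonat.AutoNAT and its NATStatus constants existed at the time, but SetMode is a stand-in for whatever switch API this PR ends up exposing):

    // manageDHTMode polls AutoNAT and toggles the DHT accordingly.
    func manageDHTMode(ctx context.Context, an autonat.AutoNAT, kdht *dht.IpfsDHT) {
        tick := time.NewTicker(time.Minute)
        defer tick.Stop()
        for {
            select {
            case <-tick.C:
                switch an.Status() {
                case autonat.NATStatusPublic:
                    kdht.SetMode(dht.ModeServer) // assumed method name
                case autonat.NATStatusPrivate:
                    kdht.SetMode(dht.ModeClient)
                }
            case <-ctx.Done():
                return
            }
        }
    }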

@raulk (Member) commented Jun 17, 2019:

@whyrusleeping do you want to implement the controller in IPFS (since I believe it has access to autonat), or should we do it in the DHT via one of the options here: #349 (comment), and then switch to the event bus when it’s ready?

@whyrusleeping (Contributor, Author) commented Jun 17, 2019 via email

@Stebalien (Member) left a comment:

We still need to wire this through to identify:

  1. We need to know when the peer starts speaking the DHT protocol.
  2. We need to know when the peer stops speaking the DHT protocol.

(mostly 1).


    return nil
}

func (dht *IpfsDHT) moveToClientMode() error {
Member:

This should be fine but I'd feel safer if we checked if we were in client mode in the stream handler, just in case.
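
A sketch of that guard in the stream handler (in package dht; handleNewStream is the existing handler, while the modeLk mutex is an assumption):

    func (dht *IpfsDHT) handleNewStream(s network.Stream) {
        dht.modeLk.Lock()
        mode := dht.mode
        dht.modeLk.Unlock()
        if mode == ModeClient {
            // We detached the handler, but a stream raced in anyway.
            s.Reset()
            return
        }
        // ... normal message handling ...
    }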

@whyrusleeping (Contributor, Author):
Alright, current plan of attack:

  • Give DHT ability to listen to changes in identify push
    • When we hear about our peers adding DHT protocol handlers, put them into our routing table
    • When we hear about our peers removing DHT protocol handlers, remove them from our routing table

Old peers will be slow to add new peers to their routing tables, as we currently rely on the up-front protocol probe to detect whether peers are in client mode. If older peers are informed about new peers in server mode whom they previously rejected as clients, they will attempt to open a new stream to those peers, so eventually this will propagate around.

New peers will include old peers in their routing table if they start in server mode.


We thought about adding an extra DHT message to signal whether we are in server or client mode, but the problem there is that it requires keeping the listeners on. I'm not actually sure that's that bad of a problem.
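
Roughly what that identify-push wiring could look like (a sketch: EvtPeerProtocolsUpdated with its Peer/Added/Removed fields is the event subscribed to in the diff further down, dht.Update and routingTable.Remove are existing helpers, and the protocol check is simplified):

    func (dht *IpfsDHT) watchProtocolChanges(ctx context.Context, ch <-chan event.EvtPeerProtocolsUpdated) {
        for {
            select {
            case evt := <-ch:
                var add, drop bool
                for _, p := range evt.Added {
                    if dht.isDHTProto(p) {
                        add = true
                    }
                }
                for _, p := range evt.Removed {
                    if dht.isDHTProto(p) {
                        drop = true
                    }
                }
                switch {
                case add && !drop:
                    dht.Update(ctx, evt.Peer) // admit to routing table
                case drop && !add:
                    dht.routingTable.Remove(evt.Peer)
                }
            case <-ctx.Done():
                return
            }
        }
    }

    func (dht *IpfsDHT) isDHTProto(p protocol.ID) bool {
        for _, dp := range dht.protocols {
            if dp == p {
                return true
            }
        }
        return false
    }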

@bigs (Contributor) commented Jun 19, 2019:

Added an extremely basic test demonstrating the change in behavior from client -> server -> client.

@whyrusleeping (Contributor, Author):
Thanks @bigs !

I pushed changes that add handling for identify push protocol changes from other peers.

The next (and hopefully final) step is to decide whether we want to version-bump the DHT. It will be pretty painful, but should make the resulting DHT muuuuch better.

@whyrusleeping (Contributor, Author):
How this could pan out:

  • we add a dht/1.1.0 protocol version
  • we continue listening on dht/1.0.0
  • new peers only add peers supporting 1.1.0 to their routing tables
  • old peers will still add new peers to their routing table
  • queries by new peers will be limited to the 'new' subset of the DHT, and at first will likely fail to resolve many requests
    • however, they should fail much faster than on the old dht
  • old nodes should not notice much degradation, as the DHT they are part of consists of all nodes (though new nodes will only route new nodes, so if the 'multipath kademlia routing' is implemented correctly, all should be fine).

I think this is a good idea and can implement it very easily. Thoughts? @raulk @vyzo @Stebalien @Kubuxu
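
For illustration, that proposal might look like this ("/ipfs/kad/1.0.0" is the existing protocol ID; the 1.1.0 string and both helpers are hypothetical):

    const (
        protoOld protocol.ID = "/ipfs/kad/1.0.0"
        protoNew protocol.ID = "/ipfs/kad/1.1.0" // hypothetical
    )

    // Serve both protocols so old peers can still query us...
    func enableDualProtocols(h host.Host, d *IpfsDHT) {
        h.SetStreamHandler(protoOld, d.handleNewStream)
        h.SetStreamHandler(protoNew, d.handleNewStream)
    }

    // ...but only admit 1.1.0-speakers to our routing table.
    func speaksNewProto(ps peerstore.Peerstore, p peer.ID) bool {
        protos, _ := ps.SupportsProtocols(p, string(protoNew))
        return len(protos) > 0
    }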

dht.go (outdated diff):

if !cfg.Client {
    dht.mode = ModeServer
Member:

Just call modeToServer?

@Stebalien (Member) commented Jun 20, 2019:

Do it. Peer routing should still work as long as the old node is connected to at least one new node.

Note: we don't want to add support for the new protocol and use both. If we do that, the DHTs will "join", and turning off the old protocol won't break them apart (until everyone upgrades).

dht.go (outdated diff):
ch := make(chan event.EvtPeerProtocolsUpdated, 8)
cancel, err := dht.host.EventBus().Subscribe(ch)
if err != nil {
    return nil, err
Member:

Does this compile?

dht.go (outdated diff):

if add && drop {
    // TODO: discuss how to handle this case
    log.Warning("peer adding and dropping dht protocols? odd")
Member:

The way this is implemented, in practice it's very, very, very unlikely that the event will contain (a) multiple protocols, and/or (b) multiple operations.

@whyrusleeping (Contributor, Author):

Right, but it's possible, so what do we do when it happens?

Member:

This is tricky: the data model allows it to happen, but the implementation doesn't. Maybe we should revisit these event structs to eliminate the * cardinality.

dht_net.go (resolved review threads)
@raulk dismissed bigs's stale review (June 20, 2019 16:24):

This is WIP and there are still items being discussed. Removing this review to avoid confusion.

@raulk (Member) commented Jun 20, 2019:

@whyrusleeping that sounds fair, and I arrive at the same conclusion re: the tradeoff. But I'd like us to evaluate these alternate scenarios:

  1. we preserve the protocol, therefore peers have no way to signal they support dynamic switching.
  2. we preserve the protocol, and peers signal to us if they support dynamic switching via a bitflag on the DHT protobufs.
  3. we change the protocol but add legacy peers to our routing table unless our connection to them is via a relay.

whyrusleeping and others added 2 commits June 21, 2019 13:51
Co-Authored-By: Raúl Kripalani <raul.kripalani@gmail.com>
@whyrusleeping (Contributor, Author):
@raulk I moved the routing-table segregation code to a different PR, #356. I think everything here is ready to go once the deps are right.


if add && drop {
    // TODO: discuss how to handle this case
    logger.Warning("peer adding and dropping dht protocols? odd")
@whyrusleeping (Contributor, Author):

Still need to figure out what to do here, @Kubuxu and @raulk any thoughts?

@whyrusleeping (Contributor, Author):

Proposal: "Peer is bad, we don't want them in our routing table"

Member:

We can either noop (since both operations cancel each other out), or discard the peer altogether.
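
One way the discard option could look (illustrative, building on the event-handling diff above):

    if add && drop {
        // Contradictory signal: the peer both added and removed DHT
        // protocols in one event. Rather than guess, evict the peer;
        // a later clean event can re-admit them.
        dht.routingTable.Remove(evt.Peer)
        return
    }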

@whyrusleeping (Contributor, Author):
Alright, ready for some review. @raulk @Kubuxu

@raulk changed the base branch from master to stabilize (June 26, 2019 16:45)
@raulk changed the title from "WIP: dht mode toggling" to "dht mode toggling (modulo dynamic switching)" (Jun 26, 2019)