Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: update the dht to fix yggdrasil #7186

Merged
merged 3 commits into from
Apr 21, 2020
Merged

fix: update the dht to fix yggdrasil #7186

merged 3 commits into from
Apr 21, 2020

Conversation

Stebalien
Copy link
Member

Specifically:

  1. Don't add peers with IPv6 addresses outside the 2000::/3 range to the WAN DHT. These peers are not on the public internet.
  2. Do add these peers to the LAN DHT. Really, that's the "catch all other" DHT.

Specifically:

1. Don't add peers with IPv6 addresses outside the 2000::/3 range to the WAN
DHT. These peers are not on the public internet.
2. Do add these peers to the LAN DHT. Really, that's the "catch all other" DHT.
@Stebalien
Copy link
Member Author

@willscott could you investigate this? The error looks real.

@willscott
Copy link
Contributor

the gotest failure makes sense - the wan-lan integration test is using WAN IPs in 100:: space (https://github.com/ipfs/go-ipfs/blob/master/test/integration/wan_lan_dht_test.go#L53) - i can push an update to this branch to make that be in the public space.

@Stebalien
Copy link
Member Author

Let's do that.

@willscott
Copy link
Contributor

wan-lan DHT fix pushed.

may need to also track down the failure in go-ipfs-http-client

--- FAIL: TestHttpApi/Dht/TestDhtFindProviders (1.23s)
            dht.go:104: got wrong provider:  != QmRHGMS7zps4zrXCskD8fRSti1GmnSPJNf13j6XDjy2sNK
        --- FAIL: TestHttpApi/Dht/TestDhtProvide (0.84s)
            dht.go:157: got wrong provider:  != Qma6y9fb3xdqnAEVGJY2rCrdCtxjQRTt2RYKcq3Fy21CE8

@Stebalien
Copy link
Member Author

I believe we need a sleep. I think this is happening because we're waiting until identify to add peers to our routing table.

@willscott
Copy link
Contributor

The http-client test is unfortunately over in https://github.com/ipfs/interface-go-ipfs-core/blob/master/tests/dht.go

@Stebalien
Copy link
Member Author

Which has a breaking change, which is why I haven't updated it :(.

@ipfs ipfs deleted a comment from willscott Apr 21, 2020
@Stebalien
Copy link
Member Author

Apparently, the test change still wasn't enough. I wonder if we need to sleep before adding/providing as well.

@Stebalien
Copy link
Member Author

But I'm going to merge this for now.

@Stebalien Stebalien merged commit 6269e9b into master Apr 21, 2020
@Stebalien Stebalien deleted the fix/yggdrasil branch April 21, 2020 21:06
@willscott
Copy link
Contributor

This may need to revert as the HTTP API DHT test is failing since it went in.

I added ipfs config --json Routing.Type dhtserver into the makeAPISwarm of a local instance and can still reproduce. All of the nodes in the swarm are on 127.0.0.1 - the ipv6 change shouldn't have changed dual behavior 🤔

@Stebalien
Copy link
Member Author

This was already a problem before this patch so I agree. The node putting the record could be using the WAN DHT while the other node is using the LAN DHT? Could you keep looking into this?

@willscott
Copy link
Contributor

willscott commented Apr 22, 2020

I tested against a modified ipfs binary where the dual code was modified to:

  1. always publish to the LAN as well as WAN dht
  2. remove the routing table filter from the LAN dht entirely.
    This yielded the same failures.

I'm suspecting either that the peers are not connected at the point that the DHT is getting exercised (and/or that the 127.0.0.1 addresses used in the testbed swarm are being considered invalid by something)

The call to FindProviders is hitting https://github.com/libp2p/go-libp2p-kad-dht/blob/master/query.go#L144 which is in line with the DHT being queried not having peers.

Looking at the Swarm().Peers of the nodes, they are swarm connected, and the lan protocol DHT is visible between them.

@willscott
Copy link
Contributor

It looks that like the error case in FindProviders occurs by a QueryEvent being sent over the implicit pub/sub event channel. https://github.com/libp2p/go-libp2p-core/blob/0a54f307255fc0e4ccb8569d437f0ee91a0f7a3d/routing/query.go#L95

The DualDHT probably should play some ctx games to intercept these events as another side effect of the queries. Opening an issue in kad-dht to track

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants