Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bootstrapping IPFS takes over 1 minute on a fresh repository #6658

Closed
meyer9 opened this issue Sep 22, 2019 · 2 comments
Closed

Bootstrapping IPFS takes over 1 minute on a fresh repository #6658

meyer9 opened this issue Sep 22, 2019 · 2 comments
Labels
kind/bug A bug in existing code (including security flaws)

Comments

@meyer9
Copy link

meyer9 commented Sep 22, 2019

Version information:

go-ipfs v0.4.22

Description:

First, I clear the IPFS repo by running: rm -rf ~/.ipfs and ipfs init.

I wrote the following script to run a node and retrieve a file with a 60 second context deadline:

package main

import (
	"context"
	"fmt"
	"io/ioutil"
	"log"
	"path/filepath"
	"time"

	files "github.com/ipfs/go-ipfs-files"
	"github.com/ipfs/go-ipfs/core"
	"github.com/ipfs/go-ipfs/core/coreapi"
	"github.com/ipfs/go-ipfs/plugin/loader"
	"github.com/ipfs/go-ipfs/repo/fsrepo"
	coreiface "github.com/ipfs/interface-go-ipfs-core"
	"github.com/ipfs/interface-go-ipfs-core/path"
)

func retrieveFile(ctx context.Context, api coreiface.CoreAPI, p path.Path) ([]byte, error) {
	fileNode, err := api.Unixfs().Get(ctx, p)
	if err != nil {
		return nil, err
	}

	if file, success := fileNode.(files.File); success {
		return ioutil.ReadAll(file)
	}

	return nil, fmt.Errorf("hash %s is a directory", p.String())
}

func main() {

	ctx := context.Background()

	plugins, err := loader.NewPluginLoader(filepath.Join("~/.ipfs/plugins"))
	if err != nil {
		log.Fatalf("error loading plugins: %s", err)
	}

	if err := plugins.Initialize(); err != nil {
		log.Fatalf("error initializing plugins: %s", err)
	}

	if err := plugins.Inject(); err != nil {
		log.Fatalf("error initializing plugins: %s", err)
	}

	// Basic ipfsnode setup
	r, err := fsrepo.Open("~/.ipfs")
	if err != nil {
		log.Fatalf("error initializing fsrepo: %s", err)
	}

	cfg := &core.BuildCfg{
		Repo:   r,
		Online: true,
	}

	nd, err := core.NewNode(ctx, cfg)

	if err != nil {
		log.Fatalf("error initializing new node: %s", err)
	}

	api, err := coreapi.NewCoreAPI(nd)
	if err != nil {
		log.Fatalf("error initializing core api: %s", err)
	}

	start := time.Now()

	retrieveCtx, _ := context.WithTimeout(ctx, time.Second*60)
	log.Printf("requesting file")

	fileBytes, err := retrieveFile(retrieveCtx, api, path.New("QmRAQB6YaCyidP37UdDnjFY5vQuiBrcqdyoW1CuDgwxkD4"))
	if err != nil {
		log.Fatalf("could not retrieve file: %s", err)
	}
	log.Printf("time passed: %s", time.Since(start))
	log.Printf("length: %d", len(fileBytes))
}

The context deadline is always exceeded and does not connect to a valid peer in time. There should be a better system for connecting to peers (shorter timeouts? address sanity checks? banning support for peers who send invalid nodes?) as the current system takes too long to find any peers.

22:57:06.576 WARNI    p2pnode: unknown or duplicate muxer in LIBP2P_MUX_PREFS: %s /mplex/6.7.0 smux.go:39
22:57:06.588  INFO    p2pnode: Swarm listening at: [/p2p-circuit /ip4/127.0.0.1/tcp/4001 /ip4/192.168.15.110/tcp/4001 /ip6/::1/tcp/4001 /ip6/2601:644:8400:5516::4b4/tcp/4001 /ip6/fd50:323d:26f3::4b4/tcp/4001 /ip6/fd50:323d:26f3:0:181a:f4dd:cf81:c0a4/tcp/4001 /ip6/2601:644:8400:5516:ed4d:7c55:4517:7ad1/tcp/4001] addrs.go:114
22:57:06.590  INFO        dht: starting bootstrap query (0/1) to random ID (routing table size was 0) dht_bootstrap.go:151
22:57:06.590  INFO        dht: finished bootstrap query (0/1) to random ID (routing table size is now 0) dht_bootstrap.go:154
22:57:06.590 WARNI        dht: error bootstrapping: failed to find any peer in table dht_bootstrap.go:86
22:57:07.256  INFO      relay: new relay stream from: QmSoLV4Bbm51jM9C4gDYZQ9Cy3U6aXMJDAbzgu2fzaDs64 relay.go:234
22:57:07.471  INFO      relay: new relay stream from: QmSoLV4Bbm51jM9C4gDYZQ9Cy3U6aXMJDAbzgu2fzaDs64 relay.go:234
22:57:07.474  INFO  bootstrap: bootstrapped with QmSoLV4Bbm51jM9C4gDYZQ9Cy3U6aXMJDAbzgu2fzaDs64 bootstrap.go:187
22:57:07.873  INFO  bootstrap: bootstrapped with QmSoLSafTMBsPKadTEgaXctDQVcqN88CNLHXMkTNwMKPnu bootstrap.go:187
2019/09/21 22:57:07 requesting file
22:57:07.874  INFO    bitswap: want blocks: [QmRAQB6YaCyidP37UdDnjFY5vQuiBrcqdyoW1CuDgwxkD4] wantmanager.go:73
22:57:08.874  INFO    bitswap: want blocks: [QmRAQB6YaCyidP37UdDnjFY5vQuiBrcqdyoW1CuDgwxkD4] wantmanager.go:73
22:57:09.440  INFO     swarm2: got error on dial: dial tcp4 222.185.37.52:46890: connect: connection refused swarm_dial.go:419
22:57:09.874  INFO    bitswap: want blocks: [QmRAQB6YaCyidP37UdDnjFY5vQuiBrcqdyoW1CuDgwxkD4] wantmanager.go:73
22:57:11.579  INFO  basichost: DiscoverNAT error: no NAT found natmgr.go:96
22:57:11.874  INFO    bitswap: want blocks: [QmRAQB6YaCyidP37UdDnjFY5vQuiBrcqdyoW1CuDgwxkD4] wantmanager.go:73
22:57:13.988  INFO     swarm2: got error on dial: dial tcp4 0.0.0.0:4001->192.168.0.104:4001: i/o timeout swarm_dial.go:419
22:57:13.988  INFO     swarm2: got error on dial: dial tcp4 0.0.0.0:4001->172.17.0.1:4001: i/o timeout swarm_dial.go:419
22:57:13.989  INFO     swarm2: got error on dial: dial tcp4 0.0.0.0:4001->147.75.106.163:4001: i/o timeout swarm_dial.go:419
22:57:13.989  INFO     swarm2: got error on dial: failed to dial : all dials failed
  * [/ip4/147.75.106.163/tcp/4001] dial tcp4 0.0.0.0:4001->147.75.106.163:4001: i/o timeout swarm_dial.go:419
22:57:13.989  INFO     swarm2: got error on dial: failed to dial : all dials failed
  * [/ip4/147.75.106.163/tcp/4001] dial tcp4 0.0.0.0:4001->147.75.106.163:4001: i/o timeout swarm_dial.go:419
22:57:13.989  INFO     swarm2: got error on dial: failed to dial : all dials failed
  * [/ip4/147.75.106.163/tcp/4001] dial tcp4 0.0.0.0:4001->147.75.106.163:4001: i/o timeout swarm_dial.go:419
22:57:13.989  INFO     swarm2: got error on dial: failed to dial : all dials failed
  * [/ip4/147.75.106.163/tcp/4001] dial tcp4 0.0.0.0:4001->147.75.106.163:4001: i/o timeout swarm_dial.go:419
22:57:13.989  INFO     swarm2: got error on dial: failed to dial : all dials failed
  * [/ip4/147.75.106.163/tcp/4001] dial tcp4 0.0.0.0:4001->147.75.106.163:4001: i/o timeout swarm_dial.go:419
22:57:13.989  INFO     swarm2: got error on dial: failed to dial : all dials failed
  * [/ip4/147.75.106.163/tcp/4001] dial tcp4 0.0.0.0:4001->147.75.106.163:4001: i/o timeout swarm_dial.go:392
22:57:13.989  INFO     swarm2: got error on dial: failed to dial : all dials failed
  * [/ip4/147.75.106.163/tcp/4001] dial tcp4 0.0.0.0:4001->147.75.106.163:4001: i/o timeout swarm_dial.go:392
22:57:13.989  INFO     swarm2: got error on dial: failed to dial : all dials failed
  * [/ip4/147.75.106.163/tcp/4001] dial tcp4 0.0.0.0:4001->147.75.106.163:4001: i/o timeout swarm_dial.go:392
22:57:13.989  INFO     swarm2: got error on dial: failed to dial : all dials failed
  * [/ip4/147.75.106.163/tcp/4001] dial tcp4 0.0.0.0:4001->147.75.106.163:4001: i/o timeout swarm_dial.go:392
22:57:13.989  INFO     swarm2: got error on dial: failed to dial : all dials failed
  * [/ip4/147.75.106.163/tcp/4001] dial tcp4 0.0.0.0:4001->147.75.106.163:4001: i/o timeout swarm_dial.go:392
22:57:13.989  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:392
22:57:13.989  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:419
22:57:13.990  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:419
22:57:13.990  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:419
22:57:13.990  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:419
22:57:13.990  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:419
22:57:13.990  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:392
22:57:13.990  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:419
22:57:13.990  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:419
22:57:13.990  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:392
22:57:13.990  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:419
22:57:13.990  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:419
22:57:13.990  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:392
22:57:13.990  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:419
22:57:13.990  INFO     swarm2: got error on dial: dial tcp6 [::1]:10001: connect: connection refused swarm_dial.go:419
22:57:13.991  INFO     swarm2: got error on dial: dial tcp4 127.0.0.1:10001: connect: connection refused swarm_dial.go:419
22:57:13.991  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:419
22:57:13.991  INFO     swarm2: got error on dial: dial tcp4 127.0.0.1:10001: connect: connection refused swarm_dial.go:419
22:57:13.991  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:419
22:57:13.991  INFO     swarm2: got error on dial: dial tcp6 [::1]:10001: connect: connection refused swarm_dial.go:419
22:57:13.991  INFO     swarm2: got error on dial: dial tcp6 [::1]:10001: connect: connection refused swarm_dial.go:419
22:57:13.991  INFO     swarm2: got error on dial: dial tcp4 127.0.0.1:10001: connect: connection refused swarm_dial.go:419
22:57:13.991  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:419
22:57:13.991  INFO     swarm2: got error on dial: dial tcp4 127.0.0.1:10001: connect: connection refused swarm_dial.go:392
22:57:13.991  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:419
22:57:13.991  INFO     swarm2: got error on dial: dial tcp6 [::1]:10001: connect: connection refused swarm_dial.go:419
22:57:13.991  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:419
22:57:13.991  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:392
22:57:14.388  INFO     swarm2: got error on dial: dial tcp4 180.119.157.82:1268: connect: connection refused swarm_dial.go:419
22:57:14.875  INFO    bitswap: want blocks: [QmRAQB6YaCyidP37UdDnjFY5vQuiBrcqdyoW1CuDgwxkD4] wantmanager.go:73
22:57:15.475  INFO     swarm2: got error on dial: dial tcp4 117.91.119.89:28171: connect: connection refused swarm_dial.go:419
22:57:15.780  INFO    bitswap: cant open message sender to peer QmW9m57aiBDHAkKj9nmFSEn7ZqrcF1fZS4bipsTCHburei: protocol not supported messagequeue.go:192
22:57:16.783  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:419
22:57:16.783  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:392
22:57:17.187  INFO     swarm2: got error on dial: dial tcp6 [fdaf:9692:1e7e:0:a5df:7b3f:c17:8a37]:4001: connect: permission denied swarm_dial.go:419
22:57:18.875  INFO    bitswap: want blocks: [QmRAQB6YaCyidP37UdDnjFY5vQuiBrcqdyoW1CuDgwxkD4] wantmanager.go:73
22:57:23.875  INFO    bitswap: want blocks: [QmRAQB6YaCyidP37UdDnjFY5vQuiBrcqdyoW1CuDgwxkD4] wantmanager.go:73
22:57:24.970  INFO    connmgr: tried to remove tag from untracked peer:  QmQKYBThzDA1n3soJfFnaAs96xmWRvQEAAz96JxQintYz7 connmgr.go:332
22:57:29.876  INFO    bitswap: want blocks: [QmRAQB6YaCyidP37UdDnjFY5vQuiBrcqdyoW1CuDgwxkD4] wantmanager.go:73
22:57:36.723  INFO      relay: new relay stream from: QmSoLPppuBtQSGwKDZT2M73ULpjvfd3aZ6ha4oFGL1KrGM relay.go:234
22:57:36.762  INFO      relay: new relay stream from: QmSoLPppuBtQSGwKDZT2M73ULpjvfd3aZ6ha4oFGL1KrGM relay.go:234
22:57:36.797  INFO  bootstrap: bootstrapped with QmSoLPppuBtQSGwKDZT2M73ULpjvfd3aZ6ha4oFGL1KrGM bootstrap.go:187
22:57:36.876  INFO    bitswap: want blocks: [QmRAQB6YaCyidP37UdDnjFY5vQuiBrcqdyoW1CuDgwxkD4] wantmanager.go:73
22:57:44.877  INFO    bitswap: want blocks: [QmRAQB6YaCyidP37UdDnjFY5vQuiBrcqdyoW1CuDgwxkD4] wantmanager.go:73
22:57:45.747  INFO    bitswap: cant open message sender to peer QmW9m57aiBDHAkKj9nmFSEn7ZqrcF1fZS4bipsTCHburei: protocol not supported messagequeue.go:192
22:57:53.877  INFO    bitswap: want blocks: [QmRAQB6YaCyidP37UdDnjFY5vQuiBrcqdyoW1CuDgwxkD4] wantmanager.go:73
22:58:03.877  INFO    bitswap: want blocks: [QmRAQB6YaCyidP37UdDnjFY5vQuiBrcqdyoW1CuDgwxkD4] wantmanager.go:73
22:58:06.612  INFO     swarm2: got error on dial: dial tcp4 127.0.0.1:10001: connect: connection refused swarm_dial.go:419
22:58:06.612  INFO     swarm2: got error on dial: dial tcp6 [::1]:47604: connect: connection refused swarm_dial.go:419
22:58:06.612  INFO     swarm2: got error on dial: dial tcp6 [::1]:10001: connect: connection refused swarm_dial.go:419
22:58:06.612  INFO     swarm2: got error on dial: dial tcp6 [::1]:10001: connect: connection refused swarm_dial.go:392
22:58:06.612  INFO     swarm2: got error on dial: dial tcp4 127.0.0.1:10001: connect: connection refused swarm_dial.go:392
22:58:06.612  INFO     swarm2: got error on dial: dial tcp4 127.0.0.1:47604: connect: connection refused swarm_dial.go:392
22:58:06.649  INFO     swarm2: got error on dial: dial tcp4 127.0.0.1:10001: connect: connection refused swarm_dial.go:419
22:58:06.649  INFO     swarm2: got error on dial: dial tcp6 [::1]:10001: connect: connection refused swarm_dial.go:419
22:58:06.649  INFO     swarm2: got error on dial: dial tcp6 [::1]:10001: connect: connection refused swarm_dial.go:419
22:58:06.649  INFO     swarm2: got error on dial: dial tcp4 127.0.0.1:10001: connect: connection refused swarm_dial.go:419
22:58:06.649  INFO     swarm2: got error on dial: dial tcp4 127.0.0.1:10001: connect: connection refused swarm_dial.go:392
22:58:06.649  INFO     swarm2: got error on dial: dial tcp6 [::1]:10001: connect: connection refused swarm_dial.go:392
22:58:06.667  INFO     swarm2: got error on dial: dial tcp4 127.0.0.1:10001: connect: connection refused swarm_dial.go:419
22:58:06.667  INFO     swarm2: got error on dial: dial tcp6 [::1]:10001: connect: connection refused swarm_dial.go:392
22:58:06.667  INFO     swarm2: got error on dial: dial tcp4 127.0.0.1:10001: connect: connection refused swarm_dial.go:419
22:58:06.667  INFO     swarm2: got error on dial: dial tcp6 [::1]:10001: connect: connection refused swarm_dial.go:419
22:58:06.667  INFO     swarm2: got error on dial: dial tcp6 [::1]:10001: connect: connection refused swarm_dial.go:392
22:58:06.667  INFO     swarm2: got error on dial: dial tcp4 127.0.0.1:10001: connect: connection refused swarm_dial.go:392
22:58:06.667  INFO     swarm2: got error on dial: dial tcp4 127.0.0.1:10001: connect: connection refused swarm_dial.go:419
22:58:06.668  INFO     swarm2: got error on dial: dial tcp6 [::1]:10001: connect: connection refused swarm_dial.go:392
22:58:06.668  INFO     swarm2: got error on dial: dial tcp4 127.0.0.1:10001: connect: connection refused swarm_dial.go:419
22:58:06.668  INFO     swarm2: got error on dial: dial tcp6 [::1]:10001: connect: connection refused swarm_dial.go:392
22:58:07.016  INFO     swarm2: got error on dial: dial tcp4 221.229.24.69:14855: connect: connection refused swarm_dial.go:419
22:58:07.021  INFO     swarm2: got error on dial: dial tcp4 117.91.119.89:7598: connect: connection refused swarm_dial.go:419
22:58:07.027  INFO     swarm2: got error on dial: dial tcp4 180.119.157.82:12007: connect: connection refused swarm_dial.go:419
22:58:07.030  INFO     swarm2: got error on dial: dial tcp4 117.91.119.89:11886: connect: connection refused swarm_dial.go:419
22:58:07.040  INFO     swarm2: got error on dial: failed to negotiate security stream multiplexer: protocol not supported swarm_dial.go:419
22:58:07.040  INFO     swarm2: got error on dial: failed to dial : all dials failed
  * [/ip4/147.75.106.163/tcp/4001] failed to negotiate security stream multiplexer: protocol not supported swarm_dial.go:419
22:58:07.041  INFO     swarm2: got error on dial: failed to dial : all dials failed
  * [/ip4/147.75.106.163/tcp/4001] failed to negotiate security stream multiplexer: protocol not supported swarm_dial.go:392
22:58:07.041  INFO     swarm2: got error on dial: failed to dial : all dials failed
  * [/ip4/147.75.106.163/tcp/4001] failed to negotiate security stream multiplexer: protocol not supported swarm_dial.go:419
22:58:07.040  INFO     swarm2: got error on dial: failed to dial : all dials failed
  * [/ip4/147.75.106.163/tcp/4001] failed to negotiate security stream multiplexer: protocol not supported swarm_dial.go:419
22:58:07.041  INFO     swarm2: got error on dial: failed to dial : all dials failed
  * [/ip4/147.75.106.163/tcp/4001] failed to negotiate security stream multiplexer: protocol not supported swarm_dial.go:392
22:58:07.041  INFO     swarm2: got error on dial: failed to dial : all dials failed
  * [/ip4/147.75.106.163/tcp/4001] failed to negotiate security stream multiplexer: protocol not supported swarm_dial.go:392
22:58:07.041  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:419
22:58:07.041  INFO     swarm2: got error on dial: dial backoff swarm_dial.go:392
22:58:07.041  INFO     swarm2: got error on dial: dial tcp4 127.0.0.1:24437: connect: connection refused swarm_dial.go:419
22:58:07.041  INFO     swarm2: got error on dial: dial tcp6 [::1]:24437: connect: connection refused swarm_dial.go:392
22:58:07.041  INFO     swarm2: got error on dial: dial tcp4 127.0.0.1:10001: connect: connection refused swarm_dial.go:419
22:58:07.041  INFO     swarm2: got error on dial: dial tcp6 [::1]:10001: connect: connection refused swarm_dial.go:419
22:58:07.041  INFO     swarm2: got error on dial: dial tcp6 [::1]:10001: connect: connection refused swarm_dial.go:419
22:58:07.042  INFO     swarm2: got error on dial: dial tcp4 127.0.0.1:10001: connect: connection refused swarm_dial.go:419
22:58:07.041  INFO     swarm2: got error on dial: dial tcp4 127.0.0.1:10001: connect: connection refused swarm_dial.go:419
22:58:07.042  INFO     swarm2: got error on dial: dial tcp4 117.91.114.168:13755: connect: connection refused swarm_dial.go:419
22:58:07.047  INFO     swarm2: got error on dial: dial tcp4 58.220.81.133:1175: connect: connection refused swarm_dial.go:419
22:58:07.048  INFO     swarm2: got error on dial: dial tcp4 58.220.81.133:22364: connect: connection refused swarm_dial.go:419
22:58:07.054  INFO     swarm2: got error on dial: dial tcp4 58.220.81.133:22552: connect: connection refused swarm_dial.go:419
22:58:07.064  INFO     swarm2: got error on dial: dial tcp6 [::1]:10001: connect: connection refused swarm_dial.go:419
22:58:07.090  INFO     swarm2: got error on dial: dial tcp4 222.139.181.174:1163: connect: connection refused swarm_dial.go:419
22:58:07.090  INFO     swarm2: got error on dial: dial tcp4 58.220.81.133:1756: connect: connection refused swarm_dial.go:419
2019/09/21 22:58:07 could not retrieve file: context deadline exceeded
exit status 1
@meyer9 meyer9 added the kind/bug A bug in existing code (including security flaws) label Sep 22, 2019
@meyer9 meyer9 changed the title Retrieving file from IPFS takes over 1 minute on a fresh repository Bootstrapping IPFS takes over 1 minute on a fresh repository Sep 22, 2019
@meyer9
Copy link
Author

meyer9 commented Sep 22, 2019

I did more investigation on this issue and it seems like it boils down to 3 main problems:

  1. DHT bootstrap is called before bootstrapping to builtin addresses and so it never queries any peers for the first run. It then has to wait a full period (5 minutes I believe) to try bootstrapping again.

  2. DHT bootstrap connects to many nonsensical addresses which sometimes trigger a timeout after a while and block dialing to other peers.

  3. Other peers are returning bad peers that they aren't connected to and that causes more i/o timeouts and connection refused errors.

@Stebalien
Copy link
Member

This is #6383.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug A bug in existing code (including security flaws)
Projects
None yet
Development

No branches or pull requests

2 participants