Improved downloader webseed performance #10715

Merged
merged 62 commits into from
Jul 1, 2024

Conversation

Contributor

@mh0lt mh0lt commented Jun 12, 2024

This contains fixes, mainly in erigontech/torrent, which improve the parallelization of the lib to reliably support download speeds over ~25MB per second.

--torrent.download.rate=256mb --torrent.download.slots=32 should now run a download to completion at a consistent ~250MB/second download rate, assuming 64GB of available memory.

If more memory is available, --torrent.download.rate=512mb --torrent.download.slots=48 will run at around ~400MB/second. It's not yet clear where the loss in performance is, but for this version of the code 400MB/second seems to be around the maximum it can support.

Outstanding Issues

The current version of the code is memory hungry at high bandwidths. The reason behind this is the way the HTTP data is dealt with under high load. The buffer model is currently:

http->hashing->mmap

Here both the http connection and the torrent hasher will retain intermediate buffers until they are finally flushed to the memory mapped file. A more memory efficient model would be to get the http connection to write directly to the memory mapped segment, which can then be directly hashed. This will require further code modification, which is outside of the scope of this change.

Downloader changes

Along with the torrent lib changes a number of changes have been made to the downloader code. The most significant are:

  • d.webseeds.DownloadAndSaveTorrentFile has been added to the post processing step after webseed torrents are downloaded. This is because it appears that if this is not done, certain scenarios will lead to a torrent's metadata never becoming available (if the existing checks are made before the download is finished).
  • mdbxPieceCompletion now has a Flushed(infoHash infohash.T, flushed *roaring.Bitmap) method which is used to commit the completion status to the db after an asynchronous flush of the mmap files has been made. This means that the completion state will only be confirmed once the data is flushed. (This may lead to re-downloading of pieces in the case of a crash.)
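The "confirm only after flush" idea in the second bullet can be sketched roughly as follows. This is a simplified in-memory model, not the mdbxPieceCompletion implementation: the type `completionTracker` and its methods are hypothetical, and plain maps stand in for the roaring bitmap and the db.

```go
package main

import (
	"fmt"
	"sync"
)

// completionTracker is a hypothetical in-memory stand-in for a piece
// completion store. A piece is reported complete only after it has been
// both hashed and flushed.
type completionTracker struct {
	mu        sync.Mutex
	pending   map[uint32]struct{} // hashed, but not yet known to be on disk
	confirmed map[uint32]struct{} // hashed and flushed
}

func newCompletionTracker() *completionTracker {
	return &completionTracker{
		pending:   make(map[uint32]struct{}),
		confirmed: make(map[uint32]struct{}),
	}
}

// MarkHashed records that a piece passed its hash check.
func (c *completionTracker) MarkHashed(piece uint32) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, done := c.confirmed[piece]; !done {
		c.pending[piece] = struct{}{}
	}
}

// Flushed promotes pieces to confirmed, but only those that were pending.
func (c *completionTracker) Flushed(pieces []uint32) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for _, p := range pieces {
		if _, ok := c.pending[p]; ok {
			delete(c.pending, p)
			c.confirmed[p] = struct{}{}
		}
	}
}

// Complete reports whether a piece has been confirmed.
func (c *completionTracker) Complete(piece uint32) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	_, ok := c.confirmed[piece]
	return ok
}

func main() {
	c := newCompletionTracker()
	c.MarkHashed(7)
	fmt.Println(c.Complete(7)) // false: hashed but not flushed
	c.Flushed([]uint32{7})
	fmt.Println(c.Complete(7)) // true
}
```

If the process stops between MarkHashed and Flushed, the piece is simply not recorded as complete, which matches the re-download trade-off noted in the bullet.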

// if we have created a torrent but it has no info assume that the
// webseed download either has not been called yet or has failed and
// try again here - otherwise the torrent will be left with no info
if t.Info() == nil {

Collaborator

select {
case <-t.GotInfo():
   continue
default:
}

is better than t.Info() from "mutex contention" perspective.

Contributor Author

Info now has its own mutex, so contention is minimal, as it is only locked for writing by set info. So I think that t.Info() should be preferred for call simplicity.
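To make the two options in this thread concrete, here is a small self-contained sketch. The `infoHolder` type is hypothetical and much simpler than the torrent lib's Torrent; it only shows a getter guarded by its own read/write mutex next to a non-blocking check on a channel that is closed once the info is set.

```go
package main

import (
	"fmt"
	"sync"
)

// torrentInfo is a placeholder for the real metadata type.
type torrentInfo struct{ Name string }

// infoHolder is a hypothetical type offering both styles of check.
type infoHolder struct {
	mu      sync.RWMutex
	info    *torrentInfo
	gotInfo chan struct{} // closed exactly once, when info is set
}

func newInfoHolder() *infoHolder {
	return &infoHolder{gotInfo: make(chan struct{})}
}

// SetInfo stores the info once and signals waiters by closing gotInfo.
func (h *infoHolder) SetInfo(info *torrentInfo) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.info != nil || info == nil {
		return
	}
	h.info = info
	close(h.gotInfo)
}

// Info takes only a read lock, so readers contend only with SetInfo.
func (h *infoHolder) Info() *torrentInfo {
	h.mu.RLock()
	defer h.mu.RUnlock()
	return h.info
}

// HasInfo checks the channel without blocking and without any mutex.
func (h *infoHolder) HasInfo() bool {
	select {
	case <-h.gotInfo:
		return true
	default:
		return false
	}
}

func main() {
	h := newInfoHolder()
	fmt.Println(h.Info() == nil, h.HasInfo()) // true false
	h.SetInfo(&torrentInfo{Name: "example"})
	fmt.Println(h.Info() == nil, h.HasInfo()) // false true
}
```

Both checks give the same answer here; the difference is only whether the reader touches a lock.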

if ok && err == nil {
_, _, err = addTorrentFile(d.ctx, ts, d.torrentClient, d.db, d.webseeds)
if err != nil {
continue

Collaborator

log.Debug the err

Contributor Author

Done

@@ -65,7 +65,7 @@ func Default() *torrent.ClientConfig {
// better don't increase because erigon periodically producing "new seedable files" - and adding them to downloader.
// it must not impact chain tip sync - so, limit resources to minimum by default.
// but when downloader is started as a separated process - rise it to max
//torrentConfig.PieceHashersPerTorrent = max(1, runtime.NumCPU()-1)
torrentConfig.PieceHashersPerTorrent = max(1, runtime.NumCPU()-2)

Collaborator

Because there is no global limiter for hashers, on non-NVMe drives it may crash with a "> 10K threads created" panic: all threads will wait for an overloaded or just slow disk, and Go creates a dedicated thread for each IO-blocked goroutine.

m.mu.Lock()
defer m.mu.Unlock()

tx, err := m.db.BeginRwNosync(context.Background())

Collaborator

Please use BeginRw instead of BeginRwNosync, otherwise you can lose the last committed transaction on power-off.

Contributor Author

Done


m.putFlushed(tx, infoHash, flushed)

tx.Commit()

Collaborator

what if commit returns error?

Collaborator

@AskAlexSharov AskAlexSharov left a comment

i like it. left couple comments. plz address.
show torrent lib diff plz. maybe i can review a bit there also.

m.mu.Lock()
defer m.mu.Unlock()
//fmt.Println("FLUSH", infoHash)
m.db.Batch(func(tx kv.RwTx) error {

Collaborator

Batch is designed for many parallel calls, but currently it's behind a mutex. In this case Update will perform better.
Or maybe it's possible to do only the map get/put behind the mutex.
Or just don't use mdbxPieceCompletionBatch if Update inside Set is not a problem anymore.
Up to you.

Contributor Author

I've removed this so now we're using the original unbatched version of the code - with the alterations you suggested.

Contributor Author

mh0lt commented Jun 26, 2024

got > 100gb RAM use. Pushed this commit to fix it 997cb9d and with DL_HASHERS=8 --torrent.download.slots=50 (on nvme) see:

[INFO] [06-17|04:08:31.889] [1/12 Snapshots] download                progress="14.86% 165.4GB/1.1TB" time-left=0hrs:5m total-time=4m0s download=3.0GB/s upload=0B/s peers=0 files=350 metadata=350/350 connections=0 alloc=12.1GB sys=38.3GB

It used a peak of 38GB RAM at 3.0GB/s.

[image: heap]

I have found that with sepolia things seem to run ok with default settings; for bor-mainnet I found I need to set DL hashers to 20 in order to avoid memory growth, because the hashers can't keep up with the data download. Too many hashers, though, and the CPUs get maxed out, although hashing itself only seems to take ~25% of the processing time. The rest is copying memory and GC.

Contributor Author

mh0lt commented Jun 26, 2024

Also did 1 run with -race (sorry for spamming :-) ):

==================
WARNING: DATA RACE
Read at 0x00c0022242f0 by goroutine 9069:
  github.com/bahlo/generic-list-go.(*List[go.shape.uint32]).Front()
      /home/ubuntu/go/pkg/mod/github.com/bahlo/generic-list-go@v0.2.0/list.go:70 +0x55
  github.com/anacrolix/torrent.(*orderedBitmap[go.shape.uint32]).Iterate()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/ordered-bitmap.go:45 +0x2b
  github.com/anacrolix/torrent.(*orderedBitmap[github.com/anacrolix/torrent/request-strategy.RequestIndex]).Iterate()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/ordered-bitmap.go:44 +0x44
  github.com/anacrolix/torrent.(*webseedPeer).requester()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/webseed-peer.go:169 +0x2e9
  github.com/anacrolix/torrent.(*Torrent).addWebSeed.gowrap2()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:3192 +0x44

Previous write at 0x00c0022242f0 by goroutine 111227:
  github.com/bahlo/generic-list-go.(*List[go.shape.uint32]).remove()
      /home/ubuntu/go/pkg/mod/github.com/bahlo/generic-list-go@v0.2.0/list.go:114 +0x316
  github.com/bahlo/generic-list-go.(*List[go.shape.uint32]).Remove()
      /home/ubuntu/go/pkg/mod/github.com/bahlo/generic-list-go@v0.2.0/list.go:138 +0x15c
  github.com/anacrolix/torrent.(*orderedBitmap[go.shape.uint32]).CheckedRemove()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/ordered-bitmap.go:56 +0x7a
  github.com/anacrolix/torrent.(*orderedBitmap[github.com/anacrolix/torrent/request-strategy.RequestIndex]).CheckedRemove()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/ordered-bitmap.go:52 +0x3e
  github.com/anacrolix/torrent.(*Peer).deleteRequest.func1()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/peer.go:924 +0x107
  github.com/anacrolix/torrent.(*Peer).deleteRequest()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/peer.go:936 +0x188
  github.com/anacrolix/torrent.(*Peer).cancel()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/peer.go:548 +0x4a
  github.com/anacrolix/torrent.(*Torrent).cancelRequest()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:3237 +0x67
  github.com/anacrolix/torrent.(*Torrent).cancelRequestsForPiece()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2614 +0x20a
  github.com/anacrolix/torrent.(*Torrent).onPieceCompleted()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2620 +0x77
  github.com/anacrolix/torrent.(*Torrent).pieceCompletionChanged()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:1636 +0x17a
  github.com/anacrolix/torrent.(*Torrent).updatePieceCompletion()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:1731 +0x546
  github.com/anacrolix/torrent.(*Torrent).pieceHashed()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2607 +0xc67
  github.com/anacrolix/torrent.(*Torrent).processHashResults.(*Torrent).processHashResults.func1.func2()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2776 +0x138
  golang.org/x/sync/errgroup.(*Group).Go.func1()
      /home/ubuntu/go/pkg/mod/golang.org/x/sync@v0.7.0/errgroup/errgroup.go:78 +0x91

Goroutine 9069 (running) created at:
  github.com/anacrolix/torrent.(*Torrent).addWebSeed()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:3192 +0x7be
  github.com/anacrolix/torrent.(*Torrent).MergeSpec()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/client.go:1483 +0x424
  github.com/anacrolix/torrent.(*Client).AddTorrentSpec()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/client.go:1456 +0x1ba
  github.com/ledgerwatch/erigon-lib/downloader._addTorrentFile()
      /home/ubuntu/erigon/erigon-lib/downloader/util.go:331 +0xa9d
  github.com/ledgerwatch/erigon-lib/downloader.addTorrentFile()
      /home/ubuntu/erigon/erigon-lib/downloader/util.go:308 +0x284
  github.com/ledgerwatch/erigon-lib/downloader.(*Downloader).AddMagnetLink()
      /home/ubuntu/erigon/erigon-lib/downloader/downloader.go:2508 +0x516
  github.com/ledgerwatch/erigon-lib/downloader.(*GrpcServer).Add()
      /home/ubuntu/erigon/erigon-lib/downloader/downloader_grpc_server.go:80 +0x954
  github.com/ledgerwatch/erigon-lib/direct.(*DownloaderClient).Add()
      /home/ubuntu/erigon/erigon-lib/direct/downloader_client.go:36 +0x5b
  github.com/ledgerwatch/erigon/turbo/snapshotsync.RequestSnapshotsDownload()
      /home/ubuntu/erigon/turbo/snapshotsync/snapshotsync.go:68 +0x86
  github.com/ledgerwatch/erigon/turbo/snapshotsync.WaitForDownloader()
      /home/ubuntu/erigon/turbo/snapshotsync/snapshotsync.go:340 +0xe50
  github.com/ledgerwatch/erigon/eth/stagedsync.DownloadAndIndexSnapshotsIfNeed()
      /home/ubuntu/erigon/eth/stagedsync/stage_snapshots.go:274 +0xcd1
  github.com/ledgerwatch/erigon/eth/stagedsync.SpawnStageSnapshots()
      /home/ubuntu/erigon/eth/stagedsync/stage_snapshots.go:158 +0x1ce
  github.com/ledgerwatch/erigon/eth/stagedsync.PipelineStages.func1()
      /home/ubuntu/erigon/eth/stagedsync/default_stages.go:281 +0x152
  github.com/ledgerwatch/erigon/eth/stagedsync.(*Sync).runStage()
      /home/ubuntu/erigon/eth/stagedsync/sync.go:513 +0x285
  github.com/ledgerwatch/erigon/eth/stagedsync.(*Sync).Run()
      /home/ubuntu/erigon/eth/stagedsync/sync.go:383 +0x554
  github.com/ledgerwatch/erigon/turbo/stages.ProcessFrozenBlocks()
      /home/ubuntu/erigon/turbo/stages/stageloop.go:126 +0xcf
  github.com/ledgerwatch/erigon/turbo/execution/eth1.(*EthereumExecutionModule).Start()
      /home/ubuntu/erigon/turbo/execution/eth1/ethereum_execution.go:313 +0x1a8
  github.com/ledgerwatch/erigon/eth.(*Ethereum).Start.gowrap1()
      /home/ubuntu/erigon/eth/backend.go:1532 +0x4f

Goroutine 111227 (running) created at:
  golang.org/x/sync/errgroup.(*Group).Go()
      /home/ubuntu/go/pkg/mod/golang.org/x/sync@v0.7.0/errgroup/errgroup.go:75 +0x124
  github.com/anacrolix/torrent.(*Torrent).processHashResults.func1()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2762 +0x3d5
  github.com/anacrolix/torrent.(*Torrent).processHashResults()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2780 +0x451
  github.com/anacrolix/torrent.(*Torrent).tryCreateMorePieceHashers.func1.gowrap2()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2666 +0x33
==================
==================
WARNING: DATA RACE
Write at 0x00c048ef1ee1 by goroutine 111668:
  github.com/anacrolix/torrent.(*Torrent).pieceHashed.func2()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2521 +0x48
  runtime.deferreturn()
      /usr/local/go/src/runtime/panic.go:602 +0x5d
  github.com/anacrolix/torrent.(*Torrent).processHashResults.(*Torrent).processHashResults.func1.func2()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2776 +0x138
  golang.org/x/sync/errgroup.(*Group).Go.func1()
      /home/ubuntu/go/pkg/mod/golang.org/x/sync@v0.7.0/errgroup/errgroup.go:78 +0x91

Previous read at 0x00c048ef1ee1 by goroutine 113352:
  github.com/anacrolix/torrent.(*Piece).uncachedPriority()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/piece.go:246 +0x5e
  github.com/anacrolix/torrent.(*Torrent).piecePriority()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:1619 +0x44
  github.com/anacrolix/torrent.(*Torrent).pieceState()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:694 +0x1a4
  github.com/anacrolix/torrent.(*Torrent).publishPieceStateChange()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:1438 +0x22a
  github.com/anacrolix/torrent.(*Torrent).pieceHashed()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2518 +0x8d8
  github.com/anacrolix/torrent.(*Torrent).processHashResults.(*Torrent).processHashResults.func1.func2()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2776 +0x138
  golang.org/x/sync/errgroup.(*Group).Go.func1()
      /home/ubuntu/go/pkg/mod/golang.org/x/sync@v0.7.0/errgroup/errgroup.go:78 +0x91

Goroutine 111668 (running) created at:
  golang.org/x/sync/errgroup.(*Group).Go()
      /home/ubuntu/go/pkg/mod/golang.org/x/sync@v0.7.0/errgroup/errgroup.go:75 +0x124
  github.com/anacrolix/torrent.(*Torrent).processHashResults.func1()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2762 +0x3d5
  github.com/anacrolix/torrent.(*Torrent).processHashResults()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2780 +0x451
  github.com/anacrolix/torrent.(*Torrent).tryCreateMorePieceHashers.func1.gowrap2()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2666 +0x33

Goroutine 113352 (running) created at:
  golang.org/x/sync/errgroup.(*Group).Go()
      /home/ubuntu/go/pkg/mod/golang.org/x/sync@v0.7.0/errgroup/errgroup.go:75 +0x124
  github.com/anacrolix/torrent.(*Torrent).processHashResults.func1()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2762 +0x3d5
  github.com/anacrolix/torrent.(*Torrent).processHashResults()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2780 +0x451
  github.com/anacrolix/torrent.(*Torrent).tryCreateMorePieceHashers.func1.gowrap2()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2666 +0x33
==================
==================
WARNING: DATA RACE
Write at 0x00c002224870 by goroutine 107244:
  github.com/bahlo/generic-list-go.(*List[go.shape.uint32]).remove()
      /home/ubuntu/go/pkg/mod/github.com/bahlo/generic-list-go@v0.2.0/list.go:114 +0x316
  github.com/bahlo/generic-list-go.(*List[go.shape.uint32]).Remove()
      /home/ubuntu/go/pkg/mod/github.com/bahlo/generic-list-go@v0.2.0/list.go:138 +0x15c
  github.com/anacrolix/torrent.(*orderedBitmap[go.shape.uint32]).CheckedRemove()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/ordered-bitmap.go:56 +0x7a
  github.com/anacrolix/torrent.(*orderedBitmap[github.com/anacrolix/torrent/request-strategy.RequestIndex]).CheckedRemove()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/ordered-bitmap.go:52 +0x3e
  github.com/anacrolix/torrent.(*Peer).deleteRequest.func1()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/peer.go:924 +0x107
  github.com/anacrolix/torrent.(*Peer).deleteRequest()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/peer.go:936 +0x188
  github.com/anacrolix/torrent.(*Peer).receiveChunk()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/peer.go:764 +0x9b2
  github.com/anacrolix/torrent.(*webseedPeer).requestResultHandler()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/webseed-peer.go:514 +0x844
  github.com/anacrolix/torrent.(*webseedPeer).doRequest.func1()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/webseed-peer.go:151 +0x172
  github.com/anacrolix/torrent.(*webseedPeer).doRequest()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/webseed-peer.go:152 +0x2f0
  github.com/anacrolix/torrent.(*webseedPeer).requester.func1()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/webseed-peer.go:184 +0x196
  github.com/anacrolix/torrent.(*orderedBitmap[go.shape.uint32]).Iterate()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/ordered-bitmap.go:46 +0xa8
  github.com/anacrolix/torrent.(*orderedBitmap[github.com/anacrolix/torrent/request-strategy.RequestIndex]).Iterate()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/ordered-bitmap.go:44 +0x44
  github.com/anacrolix/torrent.(*webseedPeer).requester()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/webseed-peer.go:169 +0x2e9
  github.com/anacrolix/torrent.(*webseedPeer).requester.gowrap2()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/webseed-peer.go:220 +0x44

Previous read at 0x00c002224870 by goroutine 12072:
  github.com/bahlo/generic-list-go.(*List[go.shape.uint32]).Front()
      /home/ubuntu/go/pkg/mod/github.com/bahlo/generic-list-go@v0.2.0/list.go:70 +0x55
  github.com/anacrolix/torrent.(*orderedBitmap[go.shape.uint32]).Iterate()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/ordered-bitmap.go:45 +0x2b
  github.com/anacrolix/torrent.(*orderedBitmap[github.com/anacrolix/torrent/request-strategy.RequestIndex]).Iterate()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/ordered-bitmap.go:44 +0x44
  github.com/anacrolix/torrent.(*webseedPeer).requester()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/webseed-peer.go:169 +0x2e9
  github.com/anacrolix/torrent.(*Torrent).addWebSeed.gowrap2()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:3192 +0x44

Goroutine 107244 (running) created at:
  github.com/anacrolix/torrent.(*webseedPeer).requester()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/webseed-peer.go:220 +0x990
  github.com/anacrolix/torrent.(*Torrent).addWebSeed.gowrap2()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:3192 +0x44

Goroutine 12072 (running) created at:
  github.com/anacrolix/torrent.(*Torrent).addWebSeed()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:3192 +0x7be
  github.com/anacrolix/torrent.(*Torrent).MergeSpec()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/client.go:1483 +0x424
  github.com/anacrolix/torrent.(*Client).AddTorrentSpec()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/client.go:1456 +0x1ba
  github.com/ledgerwatch/erigon-lib/downloader._addTorrentFile()
      /home/ubuntu/erigon/erigon-lib/downloader/util.go:331 +0xa9d
  github.com/ledgerwatch/erigon-lib/downloader.addTorrentFile()
      /home/ubuntu/erigon/erigon-lib/downloader/util.go:308 +0x284
  github.com/ledgerwatch/erigon-lib/downloader.(*Downloader).AddMagnetLink()
      /home/ubuntu/erigon/erigon-lib/downloader/downloader.go:2508 +0x516
  github.com/ledgerwatch/erigon-lib/downloader.(*GrpcServer).Add()
      /home/ubuntu/erigon/erigon-lib/downloader/downloader_grpc_server.go:80 +0x954
  github.com/ledgerwatch/erigon-lib/direct.(*DownloaderClient).Add()
      /home/ubuntu/erigon/erigon-lib/direct/downloader_client.go:36 +0x5b
  github.com/ledgerwatch/erigon/turbo/snapshotsync.RequestSnapshotsDownload()
      /home/ubuntu/erigon/turbo/snapshotsync/snapshotsync.go:68 +0x86
  github.com/ledgerwatch/erigon/turbo/snapshotsync.WaitForDownloader()
      /home/ubuntu/erigon/turbo/snapshotsync/snapshotsync.go:340 +0xe50
  github.com/ledgerwatch/erigon/eth/stagedsync.DownloadAndIndexSnapshotsIfNeed()
      /home/ubuntu/erigon/eth/stagedsync/stage_snapshots.go:274 +0xcd1
  github.com/ledgerwatch/erigon/eth/stagedsync.SpawnStageSnapshots()
      /home/ubuntu/erigon/eth/stagedsync/stage_snapshots.go:158 +0x1ce
  github.com/ledgerwatch/erigon/eth/stagedsync.PipelineStages.func1()
      /home/ubuntu/erigon/eth/stagedsync/default_stages.go:281 +0x152
  github.com/ledgerwatch/erigon/eth/stagedsync.(*Sync).runStage()
      /home/ubuntu/erigon/eth/stagedsync/sync.go:513 +0x285
  github.com/ledgerwatch/erigon/eth/stagedsync.(*Sync).Run()
      /home/ubuntu/erigon/eth/stagedsync/sync.go:383 +0x554
  github.com/ledgerwatch/erigon/turbo/stages.ProcessFrozenBlocks()
      /home/ubuntu/erigon/turbo/stages/stageloop.go:126 +0xcf
  github.com/ledgerwatch/erigon/turbo/execution/eth1.(*EthereumExecutionModule).Start()
      /home/ubuntu/erigon/turbo/execution/eth1/ethereum_execution.go:313 +0x1a8
  github.com/ledgerwatch/erigon/eth.(*Ethereum).Start.gowrap1()
      /home/ubuntu/erigon/eth/backend.go:1532 +0x4f
==================
==================
WARNING: DATA RACE
Write at 0x00c0491807d9 by goroutine 99364:
  github.com/anacrolix/torrent.(*Torrent).pieceHashed.func2()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2521 +0x48
  runtime.deferreturn()
      /usr/local/go/src/runtime/panic.go:602 +0x5d
  github.com/anacrolix/torrent.(*Torrent).processHashResults.(*Torrent).processHashResults.func1.func2()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2776 +0x138
  golang.org/x/sync/errgroup.(*Group).Go.func1()
      /home/ubuntu/go/pkg/mod/golang.org/x/sync@v0.7.0/errgroup/errgroup.go:78 +0x91

Previous write at 0x00c0491807d9 by goroutine 99379:
  github.com/anacrolix/torrent.(*Torrent).pieceHashed.func2()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2521 +0x48
  runtime.deferreturn()
      /usr/local/go/src/runtime/panic.go:602 +0x5d
  github.com/anacrolix/torrent.(*Torrent).processHashResults.(*Torrent).processHashResults.func1.func2()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2776 +0x138
  golang.org/x/sync/errgroup.(*Group).Go.func1()
      /home/ubuntu/go/pkg/mod/golang.org/x/sync@v0.7.0/errgroup/errgroup.go:78 +0x91

Goroutine 99364 (running) created at:
  golang.org/x/sync/errgroup.(*Group).Go()
      /home/ubuntu/go/pkg/mod/golang.org/x/sync@v0.7.0/errgroup/errgroup.go:75 +0x124
  github.com/anacrolix/torrent.(*Torrent).processHashResults.func1()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2762 +0x3d5
  github.com/anacrolix/torrent.(*Torrent).processHashResults()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2780 +0x451
  github.com/anacrolix/torrent.(*Torrent).tryCreateMorePieceHashers.func1.gowrap2()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2666 +0x33

Goroutine 99379 (running) created at:
  golang.org/x/sync/errgroup.(*Group).Go()
      /home/ubuntu/go/pkg/mod/golang.org/x/sync@v0.7.0/errgroup/errgroup.go:75 +0x124
  github.com/anacrolix/torrent.(*Torrent).processHashResults.func1()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2762 +0x3d5
  github.com/anacrolix/torrent.(*Torrent).processHashResults()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2780 +0x451
  github.com/anacrolix/torrent.(*Torrent).tryCreateMorePieceHashers.func1.gowrap2()
      /home/ubuntu/go/pkg/mod/github.com/erigontech/torrent@v1.54.2-alpha-16/torrent.go:2666 +0x33
==================

I have resolved all of the DATA races I see for the moment.
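The reports above show an ordered list being read by Iterate in one goroutine while another goroutine removes from it. The thread does not say how the races were fixed, so the following is only one possible pattern, sketched under that assumption: a hypothetical `orderedSet` that guards the list with a mutex and iterates over a snapshot, so that the callback may itself remove entries without deadlocking.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// orderedSet is a hypothetical insertion-ordered set of uint32 values
// that is safe for concurrent use.
type orderedSet struct {
	mu    sync.Mutex
	order *list.List
	index map[uint32]*list.Element
}

func newOrderedSet() *orderedSet {
	return &orderedSet{order: list.New(), index: make(map[uint32]*list.Element)}
}

// Add appends v if it is not already present.
func (s *orderedSet) Add(v uint32) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.index[v]; ok {
		return false
	}
	s.index[v] = s.order.PushBack(v)
	return true
}

// Remove deletes v and reports whether it was present.
func (s *orderedSet) Remove(v uint32) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	e, ok := s.index[v]
	if !ok {
		return false
	}
	s.order.Remove(e)
	delete(s.index, v)
	return true
}

// Len returns the number of stored values.
func (s *orderedSet) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.order.Len()
}

// Iterate copies the values under the lock and calls f outside it, so f
// may call Add or Remove. Values removed after the snapshot is taken
// may still be visited.
func (s *orderedSet) Iterate(f func(uint32) bool) {
	s.mu.Lock()
	snapshot := make([]uint32, 0, s.order.Len())
	for e := s.order.Front(); e != nil; e = e.Next() {
		snapshot = append(snapshot, e.Value.(uint32))
	}
	s.mu.Unlock()
	for _, v := range snapshot {
		if !f(v) {
			return
		}
	}
}

func main() {
	s := newOrderedSet()
	for i := uint32(0); i < 100; i++ {
		s.Add(i)
	}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { // iterates and removes from inside the callback
		defer wg.Done()
		s.Iterate(func(v uint32) bool { s.Remove(v); return true })
	}()
	go func() { // removes concurrently from another goroutine
		defer wg.Done()
		for i := uint32(0); i < 100; i++ {
			s.Remove(i)
		}
	}()
	wg.Wait()
	fmt.Println(s.Len()) // 0
}
```

The cost of this approach is an allocation per iteration and a possibly stale view; whether that trade-off suits the request loop depends on how hot the path is.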

Contributor Author

mh0lt commented Jun 26, 2024

got

panic: invalid reject reading "https://erigon3-v3-snapshots-mainnet.erigon.network/v2/v1-016000-016500-headers.seg" at "bytes=77594624-79691775": read tcp [2001:41d0:303:dd6a::]:44436->[2606:4700:10::6816:4fe9]:443: read: connection reset by peer for: 37

goroutine 16942 [running]:
github.com/anacrolix/torrent.(*webseedPeer).requestResultHandler(0xc037f1f608, {0x25, {0x0, 0x200000}}, {0xc039768e30?, 0xc0bd885920?})
	github.com/anacrolix/torrent@v1.52.6-0.20231201115409-7ea994b6bbd8/webseed-peer.go:508 +0x61f
github.com/anacrolix/torrent.(*webseedPeer).doRequest.func1(0xc037f1f608, {0xae5050?, {0xc0?, 0x5b0d6d48?}}, {0xc039768e30?, 0xc0bd885920?})
	github.com/anacrolix/torrent@v1.52.6-0.20231201115409-7ea994b6bbd8/webseed-peer.go:151 +0xd0
github.com/anacrolix/torrent.(*webseedPeer).doRequest(0xc037f1f608, {0x25, {0x0, 0x200000}})
	github.com/anacrolix/torrent@v1.52.6-0.20231201115409-7ea994b6bbd8/webseed-peer.go:152 +0x15a
github.com/anacrolix/torrent.(*webseedPeer).requester.func1(0x0?)
	github.com/anacrolix/torrent@v1.52.6-0.20231201115409-7ea994b6bbd8/webseed-peer.go:184 +0x16f
github.com/anacrolix/torrent.(*orderedBitmap[...]).Iterate(0x27a5b80?, 0xc090ef89e0?)
	github.com/anacrolix/torrent@v1.52.6-0.20231201115409-7ea994b6bbd8/ordered-bitmap.go:46 +0x43
github.com/anacrolix/torrent.(*webseedPeer).requester(0xc037f1f608, 0xb)
	github.com/anacrolix/torrent@v1.52.6-0.20231201115409-7ea994b6bbd8/webseed-peer.go:169 +0x13f
created by github.com/anacrolix/torrent.(*Torrent).addWebSeed in goroutine 2196
	github.com/anacrolix/torrent@v1.52.6-0.20231201115409-7ea994b6bbd8/torrent.go:3192 +0x4e5

This is resolved - it was caused by a cancelled flag being incorrectly set. I have seen several other panics - which I have investigated and resolved.

I think there may be a couple of issues left since I made the lib more active, but I think it's worth merging and investigating. The remaining issues seem to be mostly associated with peer flow, which I'll start to test this week.

@mh0lt mh0lt merged commit 1b060dd into main Jul 1, 2024
10 of 11 checks passed
@mh0lt mh0lt deleted the dl_webseed_halt_fixes branch July 1, 2024 20:05
AskAlexSharov added a commit that referenced this pull request Jul 4, 2024