This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

DAWN-138 ⁃ P2P Network Improvements #291

Closed
bytemaster opened this issue Aug 31, 2017 · 8 comments

Comments

@bytemaster
Contributor

bytemaster commented Aug 31, 2017

Easy

  • Clean up Console Spam

Stability

  1. Reconnect every 30 seconds to every peer in our config file
  2. ping pong latency measurement protocol and keep alive
  3. on connection check latency / clock skew
  4. Disable Nagle algorithm on connections (may introduce lag)
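
For items 2 and 3, one common way to get both latency and clock skew out of a single ping/pong exchange is the four-timestamp calculation used by NTP. The sketch below is only an illustration under that assumption; `ping_sample`, `link_estimate` and `estimate_link` are hypothetical names, not part of the existing net plugin, and the wire format of the ping and pong messages is left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical timestamps (microseconds) collected during one ping/pong exchange.
struct ping_sample {
    std::int64_t ping_sent;      // local clock when the ping left
    std::int64_t ping_received;  // peer clock when the ping arrived
    std::int64_t pong_sent;      // peer clock when the pong left
    std::int64_t pong_received;  // local clock when the pong arrived
};

struct link_estimate {
    std::int64_t round_trip_us;    // time spent on the wire, both directions
    std::int64_t clock_offset_us;  // positive when the peer clock is ahead of ours
};

// NTP-style estimate. Assumes the two directions take roughly the same time.
inline link_estimate estimate_link(const ping_sample& s) {
    assert(s.pong_received >= s.ping_sent);  // local clock must not run backwards
    assert(s.pong_sent >= s.ping_received);  // peer clock must not run backwards

    const std::int64_t total = s.pong_received - s.ping_sent;      // measured locally
    const std::int64_t peer_hold = s.pong_sent - s.ping_received;  // measured by the peer

    link_estimate e;
    e.round_trip_us = total - peer_hold;
    e.clock_offset_us =
        ((s.ping_received - s.ping_sent) + (s.pong_sent - s.pong_received)) / 2;
    return e;
}
```

The same exchange can double as the keep-alive: a peer whose pong does not arrive within some timeout can be treated as dead. If the two directions of the link are asymmetric, the offset estimate is off by half the difference.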

Usability

  1. RPC API for dynamically adding / removing peers and checking their status
  2. block only mode...send full blocks not pending trx (with delay for last irreversible)

Security

  1. Limit the maximum number of incoming connections
  2. Error reporting and abuse detection
  3. Authentication on handshake
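
For item 1, the cap can be as simple as a counter consulted in the accept path. A minimal sketch, assuming a single-threaded accept loop; `connection_limiter` is a hypothetical helper, not an existing class in the code base.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical guard for the accept path: hands out at most `max_incoming` slots.
class connection_limiter {
public:
    explicit connection_limiter(std::size_t max_incoming) : max_(max_incoming) {
        assert(max_incoming > 0);  // a zero cap would reject every peer
    }

    // Reserves a slot and returns true, or returns false when the cap is reached.
    bool try_accept() {
        if (active_ >= max_) return false;
        ++active_;
        return true;
    }

    // Frees a slot when an incoming connection closes.
    void release() {
        if (active_ > 0) --active_;
    }

    std::size_t active() const { return active_; }

private:
    std::size_t max_;
    std::size_t active_ = 0;
};
```

The accept handler would call `try_accept()` before keeping a new socket and close it immediately on `false`; the connection's close path calls `release()`.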

Performance

  1. Don't relay messages to peers that already sent us the message
  2. Send block recipe to reconstruct from trx previously sent
  3. For large messages (block or trx) send notice hash + query to fetch
    • query from only one peer at a time
    • timeout quickly
  4. Multi-Threading of network processing
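
Item 1 comes down to remembering, per message, which peers are already known to have it. The following is a rough sketch of that bookkeeping, assuming messages are identified by a hash string and peers by an integer; `relay_filter`, `peer_id` and `message_id` are illustrative names only.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

using peer_id = int;             // hypothetical peer handle
using message_id = std::string;  // hypothetical: hex hash of a block or trx

// Tracks which peers are already known to have each message.
class relay_filter {
public:
    // Call when `peer` sent us `msg`, so it is never echoed back to that peer.
    void mark_known(const message_id& msg, peer_id peer) {
        assert(!msg.empty());
        known_[msg].insert(peer);
    }

    // Returns the connected peers that still need `msg` and marks them as
    // known, so a second call for the same message relays nothing.
    std::vector<peer_id> targets(const message_id& msg,
                                 const std::vector<peer_id>& connected) {
        assert(!msg.empty());
        std::unordered_set<peer_id>& have = known_[msg];
        std::vector<peer_id> out;
        for (peer_id p : connected) {
            if (have.insert(p).second) out.push_back(p);  // true only on first insert
        }
        return out;
    }

private:
    std::unordered_map<message_id, std::unordered_set<peer_id>> known_;
};
```

This version never forgets a message, so a real implementation would also need to expire entries, for example once the containing block becomes irreversible.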

TODO: create per-item issues on github and link to issue number.

@pmesnier
Contributor

I'll get this in place tonight* (* for some timezone :-)

@bytemaster
Contributor Author

In terms of priority:

  1. Stability
  2. Usability
  3. Security
  4. Performance

@bytemaster
Contributor Author

Please refine this into individual tickets and mark the parts that are actually in progress as in progress.

@pmesnier
Contributor

pmesnier commented Sep 12, 2017

I've gone through all of the sub-issues and marked them as either done or pending a particular issue number. In a couple of cases I have additional questions, and there are a couple that I just feel won't get done in the next day or two.

Easy

  • Clean up Console Spam DONE

Stability

  1. Reconnect every 30 seconds to every peer in our config file DONE
  2. ping pong latency measurement protocol and keep alive DONE p2p liveness checking #395
  3. on connection check latency / clock skew DONE p2p liveness checking #395
  4. Disable Nagle algorithm on connections (may introduce lag) DONE p2p network txn handling, new constants, socket options #393
  5. [not previously listed] synchronizing across multiple peers didn't work DONE
  6. [not previously listed] synchronizing newly started nodes needs to survive a slow or crashed supplier DONE p2p sync dead peer handoff #406, code mixed with changes for p2p network txn handling, new constants, socket options #393
  7. [not previously listed] synchronizing limited to irreversible blocks only, then use notices for catching up the rest. DONE p2p finish syncing new nodes #515

Usability

  1. RPC API for dynamically adding / removing peers and checking their status Need more information
  2. block only mode...send full blocks not pending trx (with delay for last irreversible) DONE, sort of, see question

Question: with multiple producer nodes, isn't it always necessary for pending transactions to be communicated to a block producer? Or is it sufficient for transactions to be held as pending until its node becomes the current block producer?

Security

  1. Limit the maximum number of incoming connections DONE
  2. Error reporting and abuse detection
  3. Authentication on handshake Pending p2p authenticate peer on handshake #505

Performance

  1. Don't relay messages to peers that already sent us the message DONE p2p network txn handling, new constants, socket options #393
  2. Send block recipe to reconstruct from trx previously sent DONE p2p assemble a block based on previously shared transactions and summary info #282
  3. For large messages (block or trx) send notice hash + query to fetch Pending p2p perf send a notice of large blocks/txns #408
    • query from only one peer at a time
    • timeout quickly
  4. Multi-Threading of network processing

create per-item issues on github and link to issue number. DONE

@pmesnier
Contributor

updated summary, and added milestone 2 although most of it is ready for M1.

@heifner heifner assigned heifner and unassigned heifner Sep 13, 2017
pmesnier added a commit that referenced this issue Sep 18, 2017
…ding tasks identified by TODO compiler comments
bytemaster added a commit that referenced this issue Sep 26, 2017
pmesnier added a commit that referenced this issue Oct 5, 2017
…ocaition for test verifying this solution, suppressed debug output and added a config option to net plugin
pmesnier added a commit that referenced this issue Oct 5, 2017
…ocaition for test verifying this solution, suppressed debug output and added a config option to net plugin
@pmesnier
Contributor

pmesnier commented Oct 5, 2017

One more iteration of the P2P action list. In this one I've removed all "DONE" items to highlight just where effort is needed.

Usability

  1. RPC API for dynamically adding / removing peers and checking their status Need more information

Question: with multiple producer nodes, isn't it always necessary for pending transactions to be communicated to a block producer? Or is it sufficient for transactions to be held as pending until its node becomes the current block producer?

Security

  1. Error reporting and abuse detection
  2. Authentication on handshake Pending p2p authenticate peer on handshake #505

Performance

  1. For large messages (block or trx) send notice hash + query to fetch Pending p2p perf send a notice of large blocks/txns #408
    • query from only one peer at a time
    • timeout quickly
  2. Multi-Threading of network processing
  3. (new) Use a shared memory pool for creating I/O send buffers, and implement a "greedy" reading strategy that pulls as much p2p data out of the TCP buffers at a time as possible.
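
For the pending notice-and-fetch item (Performance 1), here is a minimal sketch of the "query from only one peer at a time, timeout quickly" bookkeeping. It assumes the caller supplies the current time and drives the timeout check itself; `fetch_tracker`, `no_peer` and the method names are hypothetical and not taken from #408.

```cpp
#include <cassert>
#include <chrono>
#include <deque>
#include <string>
#include <unordered_map>

using peer_id = int;           // hypothetical peer handle
constexpr peer_id no_peer = -1;  // sentinel: nothing to query right now
using time_point = std::chrono::steady_clock::time_point;

// Keeps at most one outstanding fetch per announced hash.
class fetch_tracker {
    struct entry {
        peer_id in_flight = no_peer;  // peer currently being queried
        time_point sent{};            // when that query was issued
        std::deque<peer_id> waiting;  // other peers that announced the same hash
    };

public:
    explicit fetch_tracker(std::chrono::milliseconds timeout) : timeout_(timeout) {}

    // A peer announced `hash`. Returns the peer to query now, or no_peer if a
    // query for this hash is already in flight (the peer is queued as a fallback).
    peer_id on_notice(const std::string& hash, peer_id peer, time_point now) {
        assert(peer != no_peer);
        entry& e = pending_[hash];
        if (e.in_flight == no_peer) {
            e.in_flight = peer;
            e.sent = now;
            return peer;
        }
        e.waiting.push_back(peer);
        return no_peer;
    }

    // Call periodically. If the in-flight query timed out, moves on to the next
    // announcing peer and returns it; otherwise returns no_peer.
    peer_id on_tick(const std::string& hash, time_point now) {
        auto it = pending_.find(hash);
        if (it == pending_.end()) return no_peer;
        entry& e = it->second;
        if (now - e.sent < timeout_) return no_peer;
        if (e.waiting.empty()) {
            pending_.erase(it);  // nobody left to ask; a later notice starts over
            return no_peer;
        }
        e.in_flight = e.waiting.front();
        e.waiting.pop_front();
        e.sent = now;
        return e.in_flight;
    }

    // The full block or trx arrived; stop tracking the hash.
    void on_received(const std::string& hash) { pending_.erase(hash); }

private:
    std::chrono::milliseconds timeout_;
    std::unordered_map<std::string, entry> pending_;
};
```

Passing `now` in explicitly keeps the class independent of any timer facility and makes the timeout path easy to exercise in a unit test.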

@coolspeed

Multi-Threading of network processing

Do we need multi-threaded network processing at all?

  • Network operations are I/O-bound, not CPU-bound. They cost almost no CPU.
    • Even nginx (per worker process) and Node.js process network I/O in a single thread.
  • Boost.Asio does non-blocking I/O.

So, if I got it right, is there any need to have multi-threaded network processing?