Reduce txArriveTimeout to 100ms #627
Conversation
Outstanding research! Thank you! Apart from this question, I believe that it's quite safe to accept it. From our side it'd be great to have txpool metrics as well, to check how useful the change is.
Codecov Report — Base: 56.74% // Head: 56.85% // Increases project coverage by +0.11%.
Additional details and impacted files
@@ Coverage Diff @@
## develop #627 +/- ##
===========================================
+ Coverage 56.74% 56.85% +0.11%
===========================================
Files 607 609 +2
Lines 70294 70915 +621
===========================================
+ Hits 39888 40320 +432
- Misses 26983 27152 +169
- Partials 3423 3443 +20
The fetcher is already pretty extensively instrumented; it should have enough metrics to capture the effects of this change. I do have numbers from running with the change. They don't look that different (as I would expect), since I don't capture packets that are returned when we request a refetch, only hash or direct transaction broadcasts. You can see the tail latency at the upper end does come down. This is mainly because we broadcast transactions out sooner, and thus other nodes don't send us these late broadcasts, so it doesn't accurately capture the latency drop.
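For anyone who wants to watch those counters while testing, here is a minimal sketch of one way to pull them. It assumes the node is started with metrics enabled (e.g. the --metrics / --metrics.addr / --metrics.port flags inherited from upstream geth) and exposes the Prometheus endpoint at /debug/metrics/prometheus; the flag names, the port, and the exact metric keys may differ between bor versions.

```go
// metricscheck.go — quick filter of the transaction-fetcher metrics.
// Assumes a locally running node with metrics enabled and the HTTP metrics
// server listening on 127.0.0.1:6060 (upstream geth defaults); adjust the
// URL for your setup.
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

func main() {
	resp, err := http.Get("http://127.0.0.1:6060/debug/metrics/prometheus")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		line := scanner.Text()
		// Print only the transaction-fetcher series (announces, broadcasts,
		// requests, replies), skipping Prometheus comment lines.
		if strings.Contains(line, "fetcher_transaction") && !strings.HasPrefix(line, "#") {
			fmt.Println(line)
		}
	}
	if err := scanner.Err(); err != nil {
		panic(err)
	}
}
```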
Hi @builder90210, thank you for writing such a detailed post with data and reasoning. I think it's a good idea to tune these parameters to a certain degree. However, I have a concern regarding the claim that "the burden from the p2p layer should not appreciably change" when we expect to rebroadcast a larger percentage of transactions. Without supporting data, it's difficult to know whether the traffic load of serving transaction fetches would increase significantly if all nodes in the network changed this parameter at the same time. Would you be able to run an experiment in a cluster where all nodes have 100ms configured and collect some data to support this claim? Thank you!
@cffls Here's the analysis that I based the initial claim on. I might have been a little unclear in the original post: I didn't mean to say that with the change we expect to rebroadcast much more. Within the system that already exists, the broadcasting loop outside of the TxFetcher dwarfs any increase we'd see from the fetcher changes. Let me know if you're comfortable with this. If not, what type of data would you like to see? I may already have some, as I have a number of nodes on the network that are statically peered and have run with the changes here.

For each transaction a node sees, it will broadcast that transaction (either hash or full) to every peer that has not already sent it that transaction. This means that for every transaction, for each of our peers, there must be a message sent in one direction: either us notifying the peer we have the transaction, or them notifying us that they have it. We could expect about 50% of these to be sends and 50% to be receives on average, but the actual split will vary node-by-node depending on how centrally the node is located in the network. This is a conservative model, as it doesn't account for latency, which means for some transactions we'd expect both nodes to send a message to each other.

With the TxFetcher change, we expect to do a fetch for around 13% of transactions. However, the TxFetcher will only fetch from a single peer concurrently per transaction. With the timeouts set at current levels, the vast majority of the time that means we'll only ping back a single peer, instead of every peer as with broadcasting. This behavior is covered in the fetcher unit tests, notably here: https://github.com/maticnetwork/bor/blob/develop/eth/fetcher/tx_fetcher_test.go#L641.

We can model the per-transaction network burden for each request type as (fraction of transactions that trigger it) × (peers contacted per transaction). For the standard rebroadcast, this calculates out to roughly 100% × (peer count), i.e. one message per peer per transaction. Whereas for the TxFetcher, this calculates out to roughly 13% × 1 peer, i.e. about 0.13 messages per transaction, so the load in the fetcher should be tiny compared to the existing load from broadcasting. (A worked example with concrete numbers is sketched below.)

The above analysis does ignore the size of the messages (fetches from the fetcher are larger than announcements, since they carry full transactions rather than hashes). This shouldn't affect the conclusion much, as the rebroadcast loop already sends out full transactions to roughly 6% of peers, and the burdens are orders of magnitude apart. Even if the fetcher were activating for every transaction, the additional load would still only be a fraction of the existing broadcast protocol, due to the single concurrent request per transaction in the TxFetcher.

You could repeat this analysis for send traffic and receive traffic independently. The numbers should work out the same on average, since requests are symmetric in a throughput analysis (a request into my node is a request sent by another node). I believe this analysis should hold for effects on the network as a whole, as I don't see a reason why we'd have compounding effects as more nodes adopt the change.
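Here is the worked example of the model above. The 250-peer count comes from the node described in the issue and the 13% fetch rate from the percentile data; both are illustrative inputs for this sketch, not protocol constants.

```go
package main

import "fmt"

func main() {
	// Illustrative inputs taken from the discussion: a node with 250 peers,
	// and roughly 13% of transactions arriving late enough to trigger a
	// fetch at a 100ms txArriveTimeout.
	const (
		peers         = 250.0
		fetchFraction = 0.13 // share of transactions that trigger a fetch
		fetchPeers    = 1.0  // the fetcher requests from one peer at a time
	)

	// Broadcast model: for every transaction, each peer link carries one
	// message in one direction (hash announcement or full broadcast).
	broadcastMsgsPerTx := peers

	// Fetcher model: only the fetched fraction generates an extra request,
	// and only to a single peer per transaction.
	fetchMsgsPerTx := fetchFraction * fetchPeers

	fmt.Printf("broadcast messages per tx: ~%.0f\n", broadcastMsgsPerTx)
	fmt.Printf("fetcher messages per tx:   ~%.2f\n", fetchMsgsPerTx)
	fmt.Printf("ratio: ~%.0fx\n", broadcastMsgsPerTx/fetchMsgsPerTx)
}
```

With these inputs the broadcast path carries on the order of a few hundred messages per transaction while the fetcher adds well under one, which is the "orders of magnitude apart" claim above.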
Thank you for your clear and thorough explanation, @builder90210. Your response effectively addressed my concern about the compounding effect. Really appreciate the time and effort you took to help me understand this concept. |
It's great to be able to contribute :). I have a lot of experience in this area of the code, so happy to help out in the future if you need anything. |
thank you @builder90210 ! I really appreciate your help. |
Description
I'm suggesting that the Polygon team consider lowering the TxFetcher pingback threshold from 500ms to 100ms. This change would have a number of benefits for the network, including faster transaction propagation, improved communication between validators and sentry nodes, and reduced incentives for spamming transactions on the network. Lowering the threshold to 100ms would not significantly increase network load, and it is already in use by many power users who care about receiving transactions quickly. This change would improve the overall performance and reliability of the Polygon network.
In propagating transactions through the p2p network, a node will send the full transaction directly to a small number of its peers, but only the transaction hash to the majority. The TxFetcher is responsible for handling the hash-only announcements on the receiving node. When a hash announcement is received, the TxFetcher will wait some period of time (currently 500ms), and if it does not receive a direct broadcast in that time, it will ping an announcing node to get the full transaction.
This 500ms threshold is a legacy setting from the original geth fork. It was quite appropriate for a network with longer block times and the ability to run on very low-end hardware; however, for the faster block time and higher infrastructure requirements of Polygon, I believe a faster pingback would have a number of benefits.
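For reference, the change itself is essentially a one-line constant adjustment. Below is a sketch of its shape, assuming the constant still lives in eth/fetcher/tx_fetcher.go as in upstream geth; the surrounding code in the bor tree may differ.

```go
// Sketch of the proposed change in eth/fetcher/tx_fetcher.go (not the exact
// diff from this PR; surrounding code may differ between bor versions).
package fetcher

import "time"

const (
	// txArriveTimeout is how long the fetcher waits after a hash announcement
	// for the full transaction to arrive via a direct broadcast before it
	// requests the transaction from one of the announcing peers.
	txArriveTimeout = 100 * time.Millisecond // previously 500 * time.Millisecond
)
```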
This is one of the first and most common modifications made by power users who care about receiving transactions faster. I myself have run this setting much lower than 100ms, and noticed no performance issues on my nodes whatsoever. I know several others who have done the same.
Lowering this threshold will have a number of benefits for the network:
- Faster transaction propagation, cutting off the tail latency described below.
- Improved communication between validators and sentry nodes.
- Reduced incentives for spamming transactions on the network.
I log some portion of the network traffic packets that are received by my node, and present some data below on the distribution of latency in these network packets. The 100ms figure is based on data I gathered from these logs.
Here is a histogram of the latency in milliseconds between first hash announcement vs. the first direct transaction received by my node. A negative number indicates the first direct announcement was received earlier than the first hash announcement. This is run on a 16 core machine, with 250 peers in a Hetzner datacenter, so should be fairly typical for most nodes.
Indeed, when we look at percentile latency (below), we see that the current 500ms sits at roughly the 98th percentile, meaning the TxFetcher is used on less than 2% of transactions. 100ms is around the 87th percentile, so it would get activated on around 13% of transactions and cut off the tail latency in between. This may seem like it should increase load. However, when one considers that we already rebroadcast out a far larger share of the transactions we receive, we can see that the burden on the p2p layer should not appreciably change. The p2p burden is only a concern for full / sentry nodes, as validators have a very small number of peers due to sentry setups.
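As a rough sketch of how figures like these can be computed from packet logs (the record structure and field names here are hypothetical, not the actual capture pipeline):

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// arrival records when a transaction hash was first announced to the node
// and when the full transaction was first received directly. Hypothetical
// structure for illustration only.
type arrival struct {
	firstAnnounce time.Time
	firstDirect   time.Time
}

// latencyPercentile returns the p-th percentile (0-100) of the
// announce-to-direct deltas, in milliseconds. Negative values mean the full
// transaction arrived before any hash announcement.
func latencyPercentile(arrivals []arrival, p float64) float64 {
	if len(arrivals) == 0 {
		return 0
	}
	deltas := make([]float64, 0, len(arrivals))
	for _, a := range arrivals {
		deltas = append(deltas, float64(a.firstDirect.Sub(a.firstAnnounce).Milliseconds()))
	}
	sort.Float64s(deltas)
	idx := int(p / 100 * float64(len(deltas)-1))
	return deltas[idx]
}

func main() {
	// With real data, compare e.g. the 87th and 98th percentiles against
	// the 100ms and 500ms thresholds discussed above.
	sample := []arrival{ /* populated from packet logs */ }
	if len(sample) > 0 {
		fmt.Println("p87 (ms):", latencyPercentile(sample, 87))
		fmt.Println("p98 (ms):", latencyPercentile(sample, 98))
	}
}
```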
My node has a pretty good set of static peers; I would expect this advantage to be greater for nodes that don't prune their peerset.
Changes
Checklist
Cross repository changes
Testing