Reduce pipeline backpressure #14404

Merged
merged 4 commits into from
Aug 24, 2024

Conversation

Contributor

@vusirikala vusirikala commented Aug 24, 2024

Description

Reduce pipeline backpressure for higher workloads.

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Performance improvement
  • Refactoring
  • Dependency update
  • Documentation update
  • Tests

Which Components or Systems Does This Change Impact?

  • Validator Node
  • Full Node (API, Indexer, etc.)
  • Move/Aptos Virtual Machine
  • Aptos Framework
  • Aptos CLI/SDK
  • Developer Infrastructure
  • Other (specify)

How Has This Been Tested?

Key Areas to Review

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

trunk-io bot commented Aug 24, 2024

⏱️ 5h 18m total CI duration on this PR

| Job | Cumulative Duration | Recent Runs |
| --- | --- | --- |
| execution-performance / single-node-performance | 1h 9m | 🟩🟩🟥🟩 |
| forge-framework-upgrade-test / forge | 45m | 🟩🟩🟩 |
| forge-compat-test / forge | 41m | 🟩🟩🟩 |
| forge-e2e-test / forge | 40m | 🟩🟩🟩 |
| test-target-determinator | 23m | 🟩🟩🟩 |
| execution-performance / test-target-determinator | 17m | 🟩🟩🟩🟩 |
| check | 12m | 🟩🟩🟩 |
| rust-move-tests | 9m | 🟩 |
| rust-move-tests | 9m | 🟩 |
| rust-cargo-deny | 8m | 🟩🟩🟩🟩 |
| rust-move-tests | 8m | 🟩 |
| general-lints | 8m | 🟩🟩🟩🟩 |
| check-dynamic-deps | 5m | 🟩🟩🟩🟩🟩 |
| rust-doc-tests | 5m | 🟩 |
| rust-doc-tests | 5m | 🟩 |
| rust-doc-tests | 5m | 🟩 |
| rust-move-tests | 3m | |
| semgrep/ci | 2m | 🟩🟩🟩🟩🟩 |
| file_change_determinator | 58s | 🟩🟩🟩🟩🟩 |
| rust-move-tests | 55s | |
| rust-doc-tests | 54s | |
| file_change_determinator | 53s | 🟩🟩🟩🟩🟩 |
| file_change_determinator | 47s | 🟩🟩🟩🟩 |
| permission-check | 21s | 🟩🟩🟩🟩🟩 |
| permission-check | 16s | 🟩🟩🟩🟩🟩 |
| permission-check | 15s | 🟩🟩🟩🟩🟩 |
| permission-check | 13s | 🟩🟩🟩🟩🟩 |
| permission-check | 11s | 🟩🟩🟩🟩 |
| determine-docker-build-metadata | 8s | 🟩🟩🟩🟩 |
| Backport PR | 4s | 🟥 |
| permission-check | 2s | 🟩 |

@vusirikala vusirikala requested review from igor-aptos, bchocho and sitalkedia and removed request for gregnazario and JoshLind August 24, 2024 00:29
@vusirikala vusirikala enabled auto-merge (squash) August 24, 2024 00:32

Contributor

@ibalajiarun ibalajiarun left a comment

Hoping there was internal alignment on these values.

@@ -241,25 +241,25 @@ impl Default for ConsensusConfig {
 // Block enters the pipeline after consensus orders it, and leaves the
 // pipeline once quorum on execution result among validators has been reached
 // (so-(badly)-called "commit certificate"), meaning 2f+1 validators have finished execution.
-back_pressure_pipeline_latency_limit_ms: 800,
+back_pressure_pipeline_latency_limit_ms: 1000,
Contributor

Given our node is hitting 1s latency under expected load, I think this needs to be higher than 1s.

Contributor Author

Increased this limit further.
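
For readers following the thread above, here is a minimal Rust sketch of how a latency limit such as `back_pressure_pipeline_latency_limit_ms` could gate backpressure. Only the field name and the values 800 and 1000 come from the diff; the struct, the `is_back_pressured` method, and the strict `>` comparison are assumptions made for illustration and are not taken from the aptos-core implementation.

```rust
use std::time::Duration;

/// Hypothetical, simplified stand-in for the consensus config.
/// Only the field name mirrors the diff; everything else is illustrative.
struct PipelineBackpressureConfig {
    back_pressure_pipeline_latency_limit_ms: u64,
}

impl PipelineBackpressureConfig {
    /// Assumed rule: backpressure applies once the observed pipeline latency
    /// (ordered -> commit certificate) strictly exceeds the configured limit.
    fn is_back_pressured(&self, pipeline_latency: Duration) -> bool {
        pipeline_latency > Duration::from_millis(self.back_pressure_pipeline_latency_limit_ms)
    }
}

fn main() {
    let old_cfg = PipelineBackpressureConfig { back_pressure_pipeline_latency_limit_ms: 800 };
    let new_cfg = PipelineBackpressureConfig { back_pressure_pipeline_latency_limit_ms: 1000 };

    // Sample latencies chosen for illustration, not measured values.
    for ms in [700u64, 900, 1100] {
        let observed = Duration::from_millis(ms);
        println!(
            "latency {} ms: old limit back-pressured = {}, new limit back-pressured = {}",
            ms,
            old_cfg.is_back_pressured(observed),
            new_cfg.is_back_pressured(observed)
        );
    }
}
```

Under this assumed rule, a node whose pipeline latency sits near 1 s would still cross a 1000 ms limit now and then, which is consistent with the reviewer's request to go higher.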

@vusirikala vusirikala changed the title from "Reduce pipeline and execution backpressure" to "Reduce pipeline backpressure" Aug 24, 2024

Contributor

✅ Forge suite compat success on d1bf834728a0cf166d993f4728dfca54f3086fb0 ==> 8a7130cf1198560ea4b25e01d1cfdf373c526e04

Compatibility test results for d1bf834728a0cf166d993f4728dfca54f3086fb0 ==> 8a7130cf1198560ea4b25e01d1cfdf373c526e04 (PR)
1. Check liveness of validators at old version: d1bf834728a0cf166d993f4728dfca54f3086fb0
compatibility::simple-validator-upgrade::liveness-check : committed: 12631.20 txn/s, latency: 2729.50 ms, (p50: 1900 ms, p90: 3100 ms, p99: 17500 ms), latency samples: 486120
2. Upgrading first Validator to new version: 8a7130cf1198560ea4b25e01d1cfdf373c526e04
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 7747.38 txn/s, latency: 3684.05 ms, (p50: 4200 ms, p90: 4500 ms, p99: 4600 ms), latency samples: 144760
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 6751.04 txn/s, latency: 4741.87 ms, (p50: 4700 ms, p90: 7000 ms, p99: 7300 ms), latency samples: 243460
3. Upgrading rest of first batch to new version: 8a7130cf1198560ea4b25e01d1cfdf373c526e04
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 7001.86 txn/s, latency: 4058.91 ms, (p50: 4400 ms, p90: 5200 ms, p99: 5600 ms), latency samples: 143440
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 7562.93 txn/s, latency: 4234.60 ms, (p50: 4300 ms, p90: 6500 ms, p99: 6700 ms), latency samples: 256140
4. upgrading second batch to new version: 8a7130cf1198560ea4b25e01d1cfdf373c526e04
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 11161.10 txn/s, latency: 2357.20 ms, (p50: 2500 ms, p90: 2900 ms, p99: 3100 ms), latency samples: 202820
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 11259.15 txn/s, latency: 2813.30 ms, (p50: 2700 ms, p90: 3300 ms, p99: 3700 ms), latency samples: 372000
5. check swarm health
Compatibility test for d1bf834728a0cf166d993f4728dfca54f3086fb0 ==> 8a7130cf1198560ea4b25e01d1cfdf373c526e04 passed
Test Ok

Contributor

✅ Forge suite realistic_env_max_load success on 8a7130cf1198560ea4b25e01d1cfdf373c526e04

two traffics test: inner traffic : committed: 11993.19 txn/s, latency: 3319.10 ms, (p50: 3000 ms, p90: 4100 ms, p99: 6300 ms), latency samples: 4560040
two traffics test : committed: 100.02 txn/s, latency: 2752.38 ms, (p50: 2500 ms, p90: 3400 ms, p99: 9500 ms), latency samples: 1760
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.259, avg: 0.224", "QsPosToProposal: max: 0.431, avg: 0.385", "ConsensusProposalToOrdered: max: 0.342, avg: 0.320", "ConsensusOrderedToCommit: max: 0.621, avg: 0.591", "ConsensusProposalToCommit: max: 0.939, avg: 0.911"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.95s no progress at version 2488421 (avg 0.23s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 7.30s no progress at version 2488419 (avg 7.30s) [limit 15].
Test Ok
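
As a rough sanity check on the latency breakdown above, the Rust sketch below copies the reported values and confirms that the average of `ConsensusProposalToOrdered` plus the average of `ConsensusOrderedToCommit` matches the average of `ConsensusProposalToCommit`. It also compares the ordered-to-commit maximum against the old and new limits. Two assumptions are made here and are not stated in the report: the breakdown values are in seconds, and `ConsensusOrderedToCommit` approximates the pipeline latency that the limit applies to. The `Stage` struct and helper functions are hypothetical.

```rust
/// One stage of the Forge latency breakdown; values copied from the report above.
struct Stage {
    name: &'static str,
    max: f64,
    avg: f64,
}

static STAGES: [Stage; 5] = [
    Stage { name: "QsBatchToPos", max: 0.259, avg: 0.224 },
    Stage { name: "QsPosToProposal", max: 0.431, avg: 0.385 },
    Stage { name: "ConsensusProposalToOrdered", max: 0.342, avg: 0.320 },
    Stage { name: "ConsensusOrderedToCommit", max: 0.621, avg: 0.591 },
    Stage { name: "ConsensusProposalToCommit", max: 0.939, avg: 0.911 },
];

/// Looks up a stage by name; panics if the name is not in the table.
fn stage(name: &str) -> &'static Stage {
    STAGES.iter().find(|s| s.name == name).expect("unknown stage")
}

/// True if avg(a) + avg(b) equals avg(total) within `tol`.
fn avgs_add_up(a: &str, b: &str, total: &str, tol: f64) -> bool {
    (stage(a).avg + stage(b).avg - stage(total).avg).abs() < tol
}

fn main() {
    let consistent = avgs_add_up(
        "ConsensusProposalToOrdered",
        "ConsensusOrderedToCommit",
        "ConsensusProposalToCommit",
        1e-6,
    );
    println!("averages add up: {}", consistent);

    // Assumption: breakdown values are seconds, so convert to milliseconds.
    let ordered_to_commit_max_ms = stage("ConsensusOrderedToCommit").max * 1000.0;
    for limit_ms in [800.0, 1000.0] {
        println!(
            "max ordered->commit {:.0} ms is below the {:.0} ms limit: {}",
            ordered_to_commit_max_ms,
            limit_ms,
            ordered_to_commit_max_ms < limit_ms
        );
    }
}
```

The maxima are not expected to add up the same way, since the slowest block in one stage need not be the slowest in another.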

Contributor

✅ Forge suite framework_upgrade success on d1bf834728a0cf166d993f4728dfca54f3086fb0 ==> 8a7130cf1198560ea4b25e01d1cfdf373c526e04

Compatibility test results for d1bf834728a0cf166d993f4728dfca54f3086fb0 ==> 8a7130cf1198560ea4b25e01d1cfdf373c526e04 (PR)
Upgrade the nodes to version: 8a7130cf1198560ea4b25e01d1cfdf373c526e04
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1490.35 txn/s, submitted: 1491.68 txn/s, failed submission: 1.33 txn/s, expired: 1.33 txn/s, latency: 2450.29 ms, (p50: 2100 ms, p90: 4200 ms, p99: 5700 ms), latency samples: 112200
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1254.58 txn/s, submitted: 1256.62 txn/s, failed submission: 2.04 txn/s, expired: 2.04 txn/s, latency: 2593.10 ms, (p50: 2400 ms, p90: 4200 ms, p99: 6000 ms), latency samples: 110800
5. check swarm health
Compatibility test for d1bf834728a0cf166d993f4728dfca54f3086fb0 ==> 8a7130cf1198560ea4b25e01d1cfdf373c526e04 passed
Upgrade the remaining nodes to version: 8a7130cf1198560ea4b25e01d1cfdf373c526e04
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1305.68 txn/s, submitted: 1308.21 txn/s, failed submission: 2.53 txn/s, expired: 2.53 txn/s, latency: 2438.40 ms, (p50: 2100 ms, p90: 3900 ms, p99: 5600 ms), latency samples: 113360
Test Ok

@vusirikala vusirikala merged commit 0eda433 into main Aug 24, 2024
48 checks passed
@vusirikala vusirikala deleted the satya/reduce_backpressure branch August 24, 2024 17:06
3 participants