Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log initial peer connection failures #1763

Merged
merged 1 commit into from
Feb 17, 2021

Conversation

teor2345
Copy link
Contributor

@teor2345 teor2345 commented Feb 17, 2021

Motivation

While testing #1762, I noticed that Zebra appears to hang if peers take a long time to connect.

What is actually happening is that peers time out, but those timeouts aren't logged.

Feb 17 10:44:30.545  INFO {zebrad="d524da7a" net="Main"}: zebrad::commands::start: Starting zebrad
Feb 17 10:44:30.546  INFO {zebrad="d524da7a" net="Main"}: zebrad::commands::start: config=ZebradConfig { consensus: Config { checkpoint_sync: false }, metrics: MetricsSection { endpoint_addr: None }, network: Config { listen_addr: 0.0.0.0:8233, network: Mainnet, initial_mainnet_peers: {"dnsseed.str4d.xyz:8233", "dnsseed.z.cash:8233", "mainnet.seeder.zfnd.org:8233", "mainnet.is.yolo.money:8233"}, initial_testnet_peers: {"testnet.seeder.zfnd.org:18233", "dnsseed.testnet.z.cash:18233", "testnet.is.yolo.money:18233"}, peerset_initial_target_size: 50, new_peer_interval: 60s }, state: Config { cache_dir: "/home/dev/.cache/zebra", ephemeral: false, debug_stop_at_height: None }, tracing: TracingSection { use_color: true, filter: None, endpoint_addr: None, flamegraph: None }, sync: SyncSection { max_concurrent=_block_requests: 50, lookahead_limit: 2000 } }
Feb 17 10:44:30.546  INFO {zebrad="d524da7a" net="Main"}: zebrad::commands::start: initializing node state
Feb 17 10:44:30.546  INFO {zebrad="d524da7a" net="Main"}: zebra_state::config: the open file limit is at or above the specified limit new_limit=1024 current_limit=1024 hard_rlimit=Some(524288)
Feb 17 10:44:30.553  INFO {zebrad="d524da7a" net="Main"}: zebra_state::service::finalized_state: Opened Zebra state cache at /home/dev/.cache/zebra/state/v4/mainnet
Feb 17 10:44:30.554  INFO {zebrad="d524da7a" net="Main"}: zebrad::commands::start: initializing chain verifier
Feb 17 10:44:30.570  INFO {zebrad="d524da7a" net="Main"}:init{config=Config { checkpoint_sync: false } network=Mainnet}: zebra_consensus::chain: initializing chain verifier tip=Some((Height(0), block::Hash("00040fe8ec8471911baa1db1266ea15dd06b4a8a5c453883c000b031973dce08"))) max_checkpoint_height=Height(419581)
Feb 17 10:44:30.570  INFO {zebrad="d524da7a" net="Main"}: zebrad::commands::start: initializing network
Feb 17 10:44:30.570  INFO {zebrad="d524da7a" net="Main"}: zebra_network::peer_set::initialize: Sending initial request for peers
Feb 17 10:44:30.570  INFO {zebrad="d524da7a" net="Main"}:listen{addr=0.0.0.0:8233}: zebra_network::peer_set::initialize: Opened Zcash protocol endpoint at 0.0.0.0:8233
Feb 17 10:44:30.571  INFO {zebrad="d524da7a" net="Main"}:add_initial_peers: zebra_network::peer_set::initialize: Connecting to initial peer set initial_peers={95.217.78.170:8233, 144.217.253.98:8233, 173.249.20.4:8233, 167.172.101.107:8233, 13.209.30.221:8233, 8.131.253.129:8233, [2a01:4f9:c010:a1e9::1]:8233, [2a01:4f8:140:2393::2]:8233, 172.104.180.119:8233, 138.201.252.11:8233, [2a01:4f8:201:8367::2]:8233, 94.23.220.214:8233, 47.254.176.240:8233, 45.79.223.173:8233, 194.135.81.61:8233, 138.201.120.49:8233, 59.110.243.62:8233, 46.4.65.10:8233, 75.152.21.54:8233, 118.24.175.14:8233, 37.59.57.96:8233, 91.206.16.214:8233, [2a01:4f8:120:72a3::2]:8233, 54.38.81.80:8233, [2001:41d0:8:cdd7::]:8233, 136.144.215.219:8233, 172.104.189.59:8233, 107.20.83.62:8233, 34.90.231.182:8233, [2a01:7c8:d002:10f:5054:ff:fe5c:dac7]:8233, [2a01:4f9:4a:1e15::2]:8233, 52.15.111.162:8233, 54.37.81.148:8233, 77.93.223.9:8233, 47.254.179.213:8233, 67.183.29.123:8233, 94.156.174.100:8233, 116.62.229.19:8233, 116.202.132.28:8233, 185.148.146.90:8233, 47.90.248.68:8233, 195.201.110.220:8233, 135.181.20.223:8233, 148.251.22.104:8233, 161.97.68.118:8233, 209.141.47.197:8233, 47.253.0.14:8233, 151.80.96.105:8233, [2a01:4f9:4b:42cb::2]:8233, [2a01:7e01::f03c:92ff:fea1:ec5]:8233, 165.22.76.97:8233, 8.210.14.154:8233, 213.136.78.68:8233, 209.145.48.22:8233, [2001:41d0:203:2f50::]:8233, 172.104.127.76:8233, 198.187.30.208:8233, 8.131.230.189:8233, [2001:41d0:303:d64b::]:8233, 161.53.201.188:8233, 85.15.179.171:8233, 147.135.65.99:8233,3.91.227.122:8233, 3.92.60.219:8233, 139.162.138.168:8233, 95.216.46.146:8233, 135.181.49.198:8233, 39.97.242.143:8233, 47.254.178.251:8233, 3.112.252.179:8233, 120.78.77.73:8233, [2605:6400:20:cd0:e56e:e74:bb7:a80]:8233, 149.56.135.165:8233, [2001:41d0:700:223b::]:8233, [2001:41d0:2:5fd6::1]:8233, 116.202.13.16:8233, 161.35.171.22:8233, [2001:41d0:8:ce33::]:8233, [2a01:4f8:b0:920a::5502]:8233, 116.202.129.35:8233, 93.51.53.177:8233, 8.140.105.213:8233, 195.201.11.54:8233, 47.90.244.88:8233, 35.198.119.137:8233}

Solution

Log initial peer connection failures
And standardise another log message

The logs are a bit verbose, but they only happen at startup, and they're better than just hanging with no output.

Review

This fix should be merged when we get time to review it, because we've actually seen this bug happen in CI.

And standardise another log message
@teor2345 teor2345 added C-bug Category: This is a bug A-rust Area: Updates to Rust code P-Low I-hang A Zebra component stops responding to requests labels Feb 17, 2021
@teor2345 teor2345 self-assigned this Feb 17, 2021
@teor2345 teor2345 added P-Medium and removed P-Low labels Feb 17, 2021
@dconnolly dconnolly added this to the 2021 Sprint 3 milestone Feb 17, 2021
@dconnolly dconnolly merged commit 8d1c498 into ZcashFoundation:main Feb 17, 2021
@teor2345 teor2345 mentioned this pull request Feb 23, 2021
18 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-rust Area: Updates to Rust code C-bug Category: This is a bug I-hang A Zebra component stops responding to requests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants