This repository has been archived by the owner on Oct 28, 2021. It is now read-only.
Avoid attempting to sync with disconnected peers #5644
Merged
There are a few places where we retrieve a session (stored in Host::m_sessions) but don't check whether the session is connected before using the session pointer. We need this check because sessions are kept alive by shared_ptrs (shared_from_this) passed to boost ASIO handlers, so a session shared_ptr can still be valid after the session has been disconnected. For example, we may have requested a Session disconnect (Session::disconnect) while the handler that executes after the socket write (see Session::write) is still sitting in the boost ASIO handler queue.
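The lifetime issue described above can be reproduced in isolation. The following is a minimal sketch, not Aleth code: FakeSession, observeLifetime and the std::queue of std::function objects are hypothetical stand-ins for Session and the boost ASIO handler queue. It shows that weak_ptr::lock() still succeeds while a handler holding shared_from_this() is pending, even though the session already reports itself as disconnected.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <queue>

// Hypothetical stand-in for a session; this is not Aleth's Session class.
class FakeSession : public std::enable_shared_from_this<FakeSession>
{
public:
    bool isConnected() const { return m_connected; }

    // Marks the session as disconnected and queues a completion handler that
    // captures shared_from_this(), mimicking a pending asynchronous write.
    void disconnect(std::queue<std::function<void()>>& _handlers)
    {
        m_connected = false;
        auto self = shared_from_this();
        _handlers.push([self]() { (void)self; /* write completed */ });
    }

private:
    bool m_connected = true;
};

struct LifetimeObservation
{
    bool aliveWhileQueued;      // weak_ptr::lock() succeeded before the handler ran
    bool connectedWhileQueued;  // isConnected() at that point
    bool aliveAfterDrain;       // session still alive after the handler ran
};

LifetimeObservation observeLifetime()
{
    std::queue<std::function<void()>> handlers;  // stand-in for the handler queue
    std::weak_ptr<FakeSession> weak;
    {
        auto session = std::make_shared<FakeSession>();
        weak = session;
        session->disconnect(handlers);
    }  // from here on, the queued handler is the only owner
    assert(handlers.size() == 1);

    LifetimeObservation result{};
    if (auto s = weak.lock())
    {
        result.aliveWhileQueued = true;
        result.connectedWhileQueued = s->isConnected();
    }

    // Running and then destroying the handler releases the last owner.
    while (!handlers.empty())
    {
        handlers.front()();
        handlers.pop();
    }
    result.aliveAfterDrain = !weak.expired();
    return result;
}
```

In this sketch the session is alive but not connected while the handler is queued, which is the state a bare lock() check cannot distinguish from a healthy session.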
Codecov Report
@@ Coverage Diff @@
## master #5644 +/- ##
==========================================
- Coverage 62.74% 62.73% -0.02%
==========================================
Files 348 348
Lines 29682 29689 +7
Branches 3343 3345 +2
==========================================
Hits 18625 18625
- Misses 9845 9848 +3
- Partials 1212 1216 +4
Is this ready?
halfalicious changed the title from "[WIP] Check if session is connected after obtaining shared_ptr" to "Check if session is connected after obtaining shared_ptr" on Jul 2, 2019
halfalicious requested review from chfast and gumb0 and removed request for chfast on July 2, 2019 02:55
@gumb0 Yup! 😄
halfalicious changed the title from "Check if session is connected after obtaining shared_ptr" to "Avoid attempting to sync with disconnected peers" on Jul 2, 2019
chfast reviewed on Jul 2, 2019
libp2p/Host.cpp
Outdated
        if (s && s->isConnected())
            return s;
    }
    return{};
Formatting.
Thanks, I still need to get git clang-format up and running again after a reinstall, so I will reformat manually for the time being.
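For context on the snippet under review, here is a minimal sketch of a lookup that only hands out a session when it is both alive and connected. It is an illustration under assumptions: FakeSession, SessionMap and connectedSession are hypothetical names, the key type is simplified to a string, and the locking that the real code performs around m_sessions is omitted.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a session; this is not Aleth's Session class.
struct FakeSession
{
    bool connected = true;
    bool isConnected() const { return connected; }
};

// Hypothetical simplification of a map from node ID to weak session pointer.
using SessionMap = std::map<std::string, std::weak_ptr<FakeSession>>;

// Returns the session for _id only if it is still alive and connected,
// otherwise an empty shared_ptr.
std::shared_ptr<FakeSession> connectedSession(
    SessionMap const& _sessions, std::string const& _id)
{
    assert(!_id.empty());  // precondition assumed for this sketch only
    auto const it = _sessions.find(_id);
    if (it != _sessions.end())
    {
        auto s = it->second.lock();
        if (s && s->isConnected())
            return s;
    }
    return {};
}
```

Callers then only need a single null check, because a disconnected session is reported the same way as a missing one.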
libp2p/Host.cpp
Outdated
@@ -1155,13 +1174,16 @@ void Host::forEachPeer(
RecursiveGuard l(x_sessions);
vector<shared_ptr<SessionFace>> sessions;
for (auto const& i : m_sessions)
if (shared_ptr<SessionFace> s = i.second.lock())
{
shared_ptr<SessionFace> s = i.second.lock();
Use auto as in other cases?
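The iteration pattern discussed in this thread can be sketched as follows. This is a simplified illustration rather than the code in libp2p/Host.cpp: FakeHost, FakeSession and visitedIds are hypothetical, the callback signature is an assumption, and releasing the lock before invoking the callback is a choice made for this sketch.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a session; this is not Aleth's SessionFace.
struct FakeSession
{
    FakeSession(int _id, bool _connected) : id(_id), connected(_connected) {}
    bool isConnected() const { return connected; }
    int id;
    bool connected;
};

// Hypothetical stand-in for the host's session bookkeeping.
class FakeHost
{
public:
    void add(std::shared_ptr<FakeSession> const& _s)
    {
        assert(_s);
        std::lock_guard<std::recursive_mutex> l(x_sessions);
        m_sessions[_s->id] = _s;
    }

    // Invokes _f for every session that is still alive and connected.
    void forEachPeer(std::function<void(FakeSession&)> const& _f) const
    {
        std::vector<std::shared_ptr<FakeSession>> sessions;
        {
            // Snapshot the usable sessions while holding the lock.
            std::lock_guard<std::recursive_mutex> l(x_sessions);
            for (auto const& i : m_sessions)
            {
                auto s = i.second.lock();
                if (s && s->isConnected())
                    sessions.push_back(std::move(s));
            }
        }
        // Call back outside the lock; the snapshot keeps the sessions alive.
        for (auto const& s : sessions)
            _f(*s);
    }

private:
    mutable std::recursive_mutex x_sessions;
    std::map<int, std::weak_ptr<FakeSession>> m_sessions;
};

// Helper for the sketch: collects the IDs that forEachPeer visits.
std::vector<int> visitedIds(FakeHost const& _host)
{
    std::vector<int> ids;
    _host.forEachPeer([&ids](FakeSession& _s) { ids.push_back(_s.id); });
    return ids;
}
```

Both expired weak_ptrs and disconnected sessions are skipped, so the callback never sees a peer whose socket is already closed.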
Remove redundant search, fix formatting, use auto consistently
gumb0 approved these changes on Jul 2, 2019
chfast approved these changes on Jul 2, 2019
Fix #5643
Fix a bug where Aleth can try to sync with a node after it has failed status validation (e.g. it's on a different chain). This wastes resources and produces confusing logs, e.g. "starting full sync" is displayed but no blocks are imported.
The root cause is that Aleth can initiate syncing with peers via Host::forEachPeer (see BlockChainSync::continueSync), and Host::forEachPeer iterates through its map of Session weak_ptrs without checking if the session is still connected, i.e. the socket isn't closed. This means that the following case can occur:
1. Status packet validation fails (see BlockChainSync::onPeerStatus) so Aleth sends a disconnect packet to the remote node (Host::capabilityHost()::disconnect()). This results in Session::disconnect() being called, which results in a write to the socket being queued and the write handler being placed in the boost ASIO handler queue (see Session::write). The socket is then closed. Note that the Session instance is still active until the write is completed and the handler is executed (Session::shared_from_this() is captured in the handler).
2. Something happens which triggers a call to BlockChainSync::continueSync (e.g. a peer session with another node is closed, see BlockChainSync::onPeerAborting). BlockChainSync::continueSync makes sync calls for each peer via host().capabilityHost().forEachPeer(). Meanwhile the write handler is still sitting in the boost ASIO handler queue so the session is still active.
3. Eventually the write is performed, the write handler is removed from the queue and executed, and the Session instance is destructed.
This means that the attempted syncing with the disconnected node is wasted cycles and also makes the logs confusing.
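The three steps above can be replayed in a small, single-threaded sketch that counts how many sync attempts are wasted with and without the connection check. Everything here is hypothetical scaffolding (FakePeer, RaceOutcome, replayRace, and a std::queue standing in for the boost ASIO handler queue); it models the ordering of events only, not Aleth's networking.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <queue>
#include <vector>

// Hypothetical stand-in for a peer session.
struct FakePeer
{
    bool failedStatus = false;  // status packet validation failed
    bool connected = true;      // socket still open
};

struct RaceOutcome
{
    int wastedSyncAttempts;      // sync attempts made against the rejected peer
    bool rejectedPeerDestroyed;  // session destructed once the handler has run
};

RaceOutcome replayRace(bool _checkConnected)
{
    std::queue<std::function<void()>> handlerQueue;  // stand-in handler queue
    std::vector<std::weak_ptr<FakePeer>> sessions;   // stand-in session map

    auto healthy = std::make_shared<FakePeer>();
    sessions.push_back(healthy);

    std::weak_ptr<FakePeer> rejectedWeak;
    {
        auto rejected = std::make_shared<FakePeer>();
        sessions.push_back(rejected);
        rejectedWeak = rejected;

        // Step 1: status validation fails, a write handler that owns the
        // session is queued, and the socket is closed.
        rejected->failedStatus = true;
        rejected->connected = false;
        handlerQueue.push([rejected]() { (void)rejected; });
    }
    assert(!rejectedWeak.expired());  // kept alive by the queued handler

    // Step 2: a sync pass runs over all sessions before the handler executes.
    int wasted = 0;
    for (auto const& w : sessions)
    {
        auto s = w.lock();
        if (!s)
            continue;
        if (_checkConnected && !s->connected)
            continue;  // the check under discussion: skip disconnected peers
        if (s->failedStatus)
            ++wasted;  // a sync attempt against a peer we already rejected
    }

    // Step 3: the handler runs and is destroyed, releasing the session.
    while (!handlerQueue.empty())
    {
        handlerQueue.front()();
        handlerQueue.pop();
    }

    return {wasted, rejectedWeak.expired()};
}
```

In this model replayRace(false) reports one wasted attempt and replayRace(true) reports none, while the rejected session is destructed in both cases once the queue drains.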