Various network-related logging changes #5605
Codecov Report
@@ Coverage Diff @@
## master #5605 +/- ##
==========================================
- Coverage 62.33% 62.33% -0.01%
==========================================
Files 350 350
Lines 29428 29441 +13
Branches 3322 3321 -1
==========================================
+ Hits 18344 18351 +7
- Misses 9886 9891 +5
- Partials 1198 1199 +1
@gumb0 : I've removed the I don't think we can automatically include this information via the new boost logger The only other option I can think of is to manually include the node ID in each [Edit] Whatever approach we end up using will also have to be used in EthereumCapability / WarpCapability and perhaps other classes in those code paths e.g. EthereumPeerObserver. |
No ideas how to make it more elegant, but still - almost all of the code of |
Also in these higher-level classes ( |
f7abdb1 to 7269bf9
Looks more or less OK.
libp2p/RLPxHandshake.h
Outdated
@@ -90,8 +90,8 @@ class RLPXHandshake: public std::enable_shared_from_this<RLPXHandshake>
  void readAckEIP8();

  /// Closes connection and ends transitions.
- void error();
+ void error(boost::system::error_code _ech = boost::system::error_code());
- void error(boost::system::error_code _ech = boost::system::error_code());
+ void error(boost::system::error_code _ech = {});
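For readers unfamiliar with the `= {}` form, here is a minimal sketch of why the suggestion is behavior-preserving. It assumes `std::error_code` as a stand-in for `boost::system::error_code` so it builds without Boost, and `reportError` and `exampleUsage` are hypothetical names, not part of aleth.

```cpp
#include <cassert>
#include <system_error>

// Hypothetical stand-in for RLPXHandshake::error(). std::error_code replaces
// boost::system::error_code here (an assumption) to keep the sketch Boost-free.
// Returns true when an actual error code was passed, false for the default.
bool reportError(std::error_code _ec = {})
{
    // `= {}` value-initializes the parameter, the same result as writing
    // `= std::error_code()`: a code with value 0 that converts to false.
    return static_cast<bool>(_ec);
}

// Usage: the defaulted call carries no error, an explicit one does.
inline void exampleUsage()
{
    assert(!reportError());
    assert(reportError(std::make_error_code(std::errc::connection_reset)));
}
```

The shorter spelling avoids repeating the type name in the declaration, which is presumably the point of the suggestion.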
Hm, now I'm thinking LOG_SCOPED_CONTEXT or a similar thing might have been useful, at least when there are many logging statements in one function: instead of repeating << peerID we could use LOG_SCOPED_CONTEXT once at the top of the function.
But maybe some messages would become less nice (the ones where you now output << peerID in the middle of the message), so I don't know.
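To make the trade-off in this comment concrete, here is a rough sketch of the scoped-context idea using a plain RAII guard. It is not aleth's LOG_SCOPED_CONTEXT macro or any Boost.Log facility; `ScopedLogContext`, `formatLogLine`, and the peer string are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical RAII guard: installs a thread-local context string on
// construction and restores the previous one on destruction.
class ScopedLogContext
{
public:
    explicit ScopedLogContext(std::string _context) : m_previous(current())
    {
        current() = std::move(_context);
    }
    ~ScopedLogContext() { current() = std::move(m_previous); }
    ScopedLogContext(ScopedLogContext const&) = delete;
    ScopedLogContext& operator=(ScopedLogContext const&) = delete;

    // Context that is active on the calling thread (empty when none is set).
    static std::string& current()
    {
        thread_local std::string context;
        return context;
    }

private:
    std::string m_previous;
};

// Appends the active context to a message, so individual log statements
// do not have to repeat the peer ID themselves.
std::string formatLogLine(std::string const& _message)
{
    std::string const& context = ScopedLogContext::current();
    return context.empty() ? _message : _message + " (" + context + ")";
}

// Usage: one guard at the top of a scope covers every message inside it.
inline void exampleUsage()
{
    ScopedLogContext guard("peer abcd1234");
    assert(formatLogLine("Requesting headers") == "Requesting headers (peer abcd1234)");
}
```

As the comment notes, the cost of this style is that the peer ID always lands in a fixed position rather than in the middle of a sentence.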
libethereum/BlockChainSync.cpp
Outdated
@@ -379,7 +380,10 @@ void BlockChainSync::requestBlocks(NodeID const& _peerID)
  }
  }
  else
- m_host.peer(_peerID).requestBlockHeaders(start, 1, 0, false);
+ {
+ LOG(m_loggerDetail) << "Requesting block headers from " << _peerID;
This is already logged in EthereumPeer, do we need another message?
Ah good point, will remove
@gumb0 : Yea, I thought of that but I'm not exactly sure how I can take a look at using it after I've wrapped up some of the higher priority work on my plate e.g. our peering vs Geth and syncing getting stuck. What do you think? |
I'm working on debugging another issue and I noticed that aleth/libethereum/BlockChainSync.cpp Lines 227 to 245 in b094d0c
@gumb0: Should I convert these logging calls to use clog, or define global thread-safe loggers for the "sync" channel (in Log.h)? I'm leaning towards the latter since clog() calls are a bit verbose, but I'm not sure it makes sense to define global loggers that will only be used in a single file. Or perhaps I could address this by defining them in BlockChainSync.cpp?
Yeah, it's fine to leave it like this for now.
Is it because of this single call? aleth/libethereum/BlockChainSync.cpp Lines 147 to 152 in b094d0c
If so, can we use here Then we'll avoid the mess with loggers, and it might be a good idea for thread-safety in general. |
But if it's ever needed, it's also possible to create non-global but thread-safe loggers. For that you'd need another function like this (Lines 124 to 129 in b094d0c) but with severity_channel_logger_mt (multi-threaded).
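The difference between the two logger kinds discussed here can be sketched without Boost. The class below is a hypothetical stand-in, not Boost.Log's severity_channel_logger_mt: it only illustrates that the multi-threaded variant serializes every write behind a lock, which is the overhead a single-threaded member logger avoids. `ThreadSafeLogger` and its members are assumed names.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical non-global, thread-safe logger. Each instance owns its mutex,
// so it can be a class member and still be called from several threads.
class ThreadSafeLogger
{
public:
    explicit ThreadSafeLogger(std::string _channel) : m_channel(std::move(_channel)) {}

    // Records one line, tagged with the channel, under the lock.
    void log(std::string const& _message)
    {
        std::lock_guard<std::mutex> guard(m_mutex);
        m_lines.push_back("[" + m_channel + "] " + _message);
    }

    // Returns a copy so callers never read the buffer while it is being written.
    std::vector<std::string> lines() const
    {
        std::lock_guard<std::mutex> guard(m_mutex);
        return m_lines;
    }

private:
    std::string const m_channel;
    mutable std::mutex m_mutex;
    std::vector<std::string> m_lines;
};

// Usage: a logger for the "sync" channel that could be shared across threads.
inline void exampleUsage()
{
    ThreadSafeLogger logger("sync");
    logger.log("status requested");
    assert(logger.lines().size() == 1);
}
```

If a function is only ever run on one thread, the lock buys nothing, which is the argument made elsewhere in this thread for plain member loggers.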
@gumb0 : Yes, I think that's the only place where Perhaps I should post the entire |
Yeah, good idea to move it all to network thread, though I think mutex will be still needed. I think other calls to |
@gumb0: Ah, I didn't realize that BlockChainSync::status() might call it from another thread, but you're right, it can be called from the Client thread. Agreed then, keeping the mutex makes sense.
@gumb0 I thought about this a little more and since my changes don't impact the thread safety of |
RLPxHandshake suffix is the remote node information. This enables us to make the RLPX handshake log messages more compact since we aren't manually including remote node information in each log statement.
Also add remote node information as log suffix to new Session member loggers. Note that member loggers aren't thread-safe, but Session functions are only called from the network thread so using member loggers should be ok (and is more performant).
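As an illustration of the suffix idea in the two commit messages above, here is a minimal sketch of a member logger that carries the remote node information once and appends it to every message. It is not the Boost.Log attribute mechanism the PR uses; `SuffixLogger`, `format`, and the node string are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical member logger with a fixed suffix (e.g. remote node ID and
// endpoint). Deliberately not thread-safe: it assumes every call happens on
// one thread, as the commit message states for Session on the network thread.
class SuffixLogger
{
public:
    explicit SuffixLogger(std::string _suffix) : m_suffix(std::move(_suffix)) {}

    // Builds the final log line; callers pass only the message itself.
    std::string format(std::string const& _message) const
    {
        return _message + " (" + m_suffix + ")";
    }

private:
    std::string m_suffix;
};

// Usage: the suffix is set once, at construction, per connection.
inline void exampleUsage()
{
    SuffixLogger logger("node1234@127.0.0.1:30303");
    assert(logger.format("Hello received") == "Hello received (node1234@127.0.0.1:30303)");
}
```

Setting the suffix at construction keeps individual log statements short, at the price of a fixed position for the node information.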
We no longer include the peer client information in each BlockChainSync log message, since it bloats the logs. However, this information can be useful for debugging and the initial peering could have happened long in the past, so we promote the PeerSessionInfos log message from trace to debug verbosity to make it more readily available in the logs.
Logs are now written to new "warpcap" log channel and are logged via class instance loggers rather than via global thread-safe loggers. Instance loggers are more performant and we don't need thread-safety since logs are only written in functions executed on the network thread. Also make some minor modifications to format of existing log messages.
Messages were logged via global thread-safe loggers - we don't need the performance overhead of thread-safe loggers since messages are logged in functions executed on the network thread. As a bonus, these messages are now also logged to the "ethcap" channel rather than the "net" channel, which makes sense since these messages are logged during execution of EthereumCapability functionality.
Logs were written to global thread-safe loggers, which aren't required (and which are less performant than their non-thread-safe counterparts) since the functions in which the logs are written are only executed on the network thread. Also make some minor formatting modifications to some existing log messages.
Use {} initializers and remove unnecessary log message in BlockChainSync::requestBlocks
Move some log messages from error to warning verbosity, make boost log attribute configuration more compact, and use m_id.hex() rather than first writing the value to a stream.
e8a7f36 to e7dc0b0
Rebased
Several more comments. Sorry, but this PR is quite huge, I keep discovering new stuff in it.
libethereum/WarpCapability.cpp
Outdated
@@ -455,12 +457,13 @@ bool WarpCapability::interpretCapabilityPacket(NodeID const& _peerID, unsigned _
  }
  catch (Exception const&)
  {
- cnetlog << "Warp Peer causing an Exception: "
-     << boost::current_exception_diagnostic_information() << " " << _r;
+ LOG(m_loggerError) << "Warp Peer " << _peerID << " causing an exception: "
In EthereumCapability these are WARNING level now
Ah nice catch
libp2p/Session.cpp
Outdated
  if (!checkRead(tlen, ec, length))
      return;
  else if (!m_io->authAndDecryptFrame(bytesRef(m_data.data(), tlen)))
  {
- cnetlog << "frame decrypt failed";
+ LOG(m_netLoggerError) << "Frame decrypt failed";
We shouldn't log every invalid packet to ERROR
@@ -380,7 +379,9 @@ void Session::doRead()
  RLP r(frame.cropped(1));
  bool ok = readPacket(hProtocolId, packetType, r);
  if (!ok)
- cnetlog << "Couldn't interpret packet. " << RLP(r);
+ LOG(m_netLoggerError)
+     << "Couldn't interpret " << p2pPacketTypeToString(packetType)
Here, too, I don't think it's for ERROR
@@ -406,7 +407,9 @@ bool Session::checkRead(std::size_t _expected, boost::system::error_code _ec, st
  {
  // with static m_data-sized buffer this shouldn't happen unless there's a regression
  // sec recommends checking anyways (instead of assert)
- cnetlog << "Error reading - TCP read buffer length differs from expected frame size.";
+ LOG(m_netLoggerError)
+     << "Error reading - TCP read buffer length differs from expected frame size ("
Also seems too low-level for ERROR
@gumb0 The comment above the log statement gives me the impression that this should never happen which is why I chose to log this as error (since if it does happen we want to know about it).
libp2p/Session.cpp
Outdated
@@ -434,8 +437,8 @@ bool Session::canHandle(

  void Session::disableCapability(std::string const& _capabilityName, std::string const& _problem)
  {
- cnetdetails << "DISABLE: Disabling capability '" << _capabilityName
-     << "'. Reason: " << _problem;
+ LOG(m_netLoggerDetail) << "DISABLE: Disabling capability '" << _capabilityName
This could be useful on DEBUG
No worries at all! I know these changes probably aren't very interesting, so I really appreciate the in-depth review 😄
4fbfa0c to 2dcbeb3
A couple of minor comments, ok to merge when addressed
Changed public functions of Session to use thread-safe logger since they can potentially be called from any thread. Also demote some network error messages from error to debug and make minor modifications to some log messages.
2dcbeb3 to fbad1f7
RLPXHandshake, Session, and EthereumPeer log messages which is comprised of either the remote node ID and endpoint or just the remote node ID (to remove the need of having to manually include this information in each log message). This suffix also removes the need for the LOG_SCOPED_CONTEXT macro so I've removed it as well
BlockChainSync, EthereumCapability, and WarpCapability classes (since these classes no longer automatically get peer client information in their log messages due to the removal of LOG_SCOPED_CONTEXT in Session)
RLPXHandshake log messages related to receiving specific messages from a remote node from before the read (which reads the message bytes from the socket) to after the read so identifying issues related to non-responsive nodes is more intuitive
RLPXHandshake
Here's a snippet of the new logs: