
Suppress p2p logs #1875

Merged: 9 commits merged on Aug 1, 2019

Conversation

abitmore (Member):

PR for #978.

This PR reduces the logging level of some common exceptions, greatly reducing the size of the p2p log files.

The current implementation doesn't address the txid mismatch issue mentioned in #978 (comment).

@abitmore abitmore added this to the 3.3.0 - Feature Release milestone Jul 27, 2019
Some bots place limit orders and then try to cancel some of them before a new block arrives. However, the order's id (in pending state) may be different on individual nodes, thus causing the exceptions.

A potential solution to the order_id inconsistency issue:
#556
@abitmore abitmore force-pushed the pr-978-p2p-logging branch from 2fc099d to eea889d on July 27, 2019 18:06
abitmore (Member, Author):

The next large message in the p2p log is:

2019-07-28T21:00:00 p2p:message read_loop on_closing_connectio ] Peer xx.xx.xx.xx:xxxxx is disconnecting us because of an error: You offered us a block that we reject as invalid, exception: { ... "Rejecting block with timestamp in the future" ...

which is followed by a JSON dump of the whole block, due to the capture and re-throw of blk_msg in application_impl::handle_block():

} FC_CAPTURE_AND_RETHROW( (blk_msg)(sync_mode) ) return false; }

The reason this message appears in a normal node's log file: when a normal node has a peer whose system clock is behind, the "stale" node may repeatedly reject the normal node's blocks, disconnect and re-connect, and the exception message containing the whole block is sent to the normal node on each disconnect. One stale node can generate 200+ such messages in a normal node's log files every hour, which is 2-3 MB with recent blocks (depending on block sizes). The more stale nodes are connected, the more such messages, and thus the larger the log files.

On the other hand, the message could be helpful for nodes whose system clock is ahead.

I'm not sure what the best way to deal with this is. My gut says we'd better send a shorter message when disconnecting for this reason, e.g. include only the block header rather than the whole block.

Thoughts?
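For illustration only, here is a rough sketch (hypothetical, not necessarily the exact change adopted later in this PR) of capturing small block identifiers instead of the whole blk_msg when re-throwing; field names other than blk_msg and sync_mode are assumptions:

```cpp
// Hypothetical sketch of the "header only" idea, not the actual diff.
// Capturing only small identifiers keeps the serialized exception, and thus
// the disconnect message stored in the peer's p2p log, small.
bool application_impl::handle_block( const graphene::net::block_message& blk_msg,
                                     bool sync_mode /* , other parameters elided */ )
{ try {
   // ... existing block handling logic unchanged ...
} FC_CAPTURE_AND_RETHROW( (blk_msg.block_id)           // block id instead of the full block
                          (blk_msg.block.block_num())  // block height
                          (sync_mode) )
return false; }
```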

pmconrad (Contributor):

Agreed on sending only the message header and block header in this case. Returning the full block is a waste of bandwidth and disk space.

jmjatlanta (Contributor):

The code looks good. Any suggestions on how to effectively and easily test your changes?

Side note: Do we need an appender that could help us with testing (i.e. appends to a vector)?
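As a side-note illustration, a minimal sketch of such a test appender (assuming fc::appender's single virtual log() method; the class and member names here are invented):

```cpp
// Hypothetical test appender, not part of this PR: collect log messages into a
// vector so a unit test can assert on what was (or was not) logged.
#include <fc/log/appender.hpp>
#include <fc/log/log_message.hpp>
#include <vector>

class vector_appender : public fc::appender
{
public:
   // Called by the logging framework for every message routed to this appender.
   void log( const fc::log_message& m ) override
   {
      messages.push_back( m );
   }
   std::vector<fc::log_message> messages;
};
```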

abitmore (Member, Author):

@jmjatlanta one of my mainnet public API nodes is running the patched code.

Some misconfigured bots were trying to place limit orders in unauthorized markets, or place orders when they don't have sufficient balance (this could also be caused by inconsistent pending state).
case 3050105: // limit_order_create_receiving_asset_unauthorized
case 3050106: // limit_order_create_insufficient_balance
case 3050201: // limit_order_cancel_nonexist_order
case 3050202: // limit_order_cancel_owner_mismatch
abitmore (Member, Author):

I feel this code (a hard-coded list of magic numbers) is a bit ugly. Any ideas for improvement?

Contributor:

Does using e.g. limit_order_cancel_nonexist_order::code_enum::code_value work?
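For illustration, a sketch of what that suggestion could look like (the helper function is hypothetical; the exception type names are taken from the comments in the diff above and assumed to live in graphene::chain):

```cpp
// Hypothetical helper, not the actual diff: classify "common" transaction
// validation failures by the exception types' own code constants instead of
// hard-coded magic numbers.
#include <graphene/chain/exceptions.hpp>
#include <cstdint>

bool is_common_trx_exception( int64_t code )
{
   using namespace graphene::chain;
   switch( code )
   {
   case limit_order_create_receiving_asset_unauthorized::code_enum::code_value: // was 3050105
   case limit_order_create_insufficient_balance::code_enum::code_value:         // was 3050106
   case limit_order_cancel_nonexist_order::code_enum::code_value:               // was 3050201
   case limit_order_cancel_owner_mismatch::code_enum::code_value:               // was 3050202
      return true;   // log these at a reduced level
   default:
      return false;
   }
}
```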

abitmore (Member, Author):

Thanks, it works.

But I still think hard-coding a list here is not ideal, since we may need to add more items later. It would be better if we could make it somewhat configurable, e.g. define the log level when throwing the exception, or use a config file, etc. Perhaps not worth the effort though.
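For what it's worth, a rough sketch of the config-file idea (the option name and wiring here are invented for illustration and are not part of this PR):

```cpp
// Hypothetical sketch, not part of this PR: read the exception codes that
// should be logged at a reduced level from a program option, so the list can
// grow without code changes. The option name is invented for illustration.
#include <boost/program_options.hpp>
#include <cstdint>
#include <set>
#include <vector>

static std::set<int64_t> quiet_exception_codes;

void load_quiet_exception_codes( const boost::program_options::variables_map& options )
{
   if( options.count( "p2p-quiet-exception-codes" ) )
      for( int64_t code : options["p2p-quiet-exception-codes"].as<std::vector<int64_t>>() )
         quiet_exception_codes.insert( code );
}

bool is_quiet_exception( int64_t code )
{
   return quiet_exception_codes.count( code ) > 0;
}
```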

abitmore (Member, Author):

New log messages about "timestamp-in-the-future" blocks look like this:

2019-07-30T20:44:39 p2p:message read_loop on_closing_connectio ] Peer 127.0.0.1:39889 is disconnecting us because of an error: You offered us a block that we reject as invalid, exception: {"code":90007,"name":"block_timestamp_in_future_exception","message":"block timestamp in the future","stack":[{"context":{"level":"warn","file":"node.cpp","line":2589,"method":"send_sync_block_to_node_delegate","hostname":"","thread_name":"p2p","timestamp":"2019-07-30T20:44:39"},"format":"","data":{"block_header":{"previous":"00000014aebc77a146c7e989ea4cc3f4f6fe3b58","timestamp":"2019-07-30T20:44:45","witness":"1.6.3","transaction_merkle_root":"0000000000000000000000000000000000000000","extensions":[]},"block_num":21,"block_id":"00000015a564b87c8fffa7c442d8dee508e1ddc1"}}]} node.cpp:2432

The average size of such messages was reduced from around 12 KB (data of 2019-07-29) to 800 bytes.

pmconrad (Contributor) left a review:

Looks good.
