EVM-675 Implement additional metrics #1564

Merged
merged 9 commits into from
Jun 7, 2023

@goran-ethernal goran-ethernal commented May 29, 2023

Description

Added additional metrics to Edge codebase:

Consensus metrics:

  • consensus block_execution_time - measures the time of block execution.
  • consensus block_building_time - measures the block building time.
  • consensus block_space_used - measures the gas used by block.
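As a rough illustration of how the three consensus metrics could be captured around block building, here is a minimal sketch using only the Go standard library. The `Recorder` type, the `BuildBlock` function, and the dotted metric keys are hypothetical names chosen for this example and are not taken from the PR; the actual code may well use a metrics library instead.

```go
package main

import (
	"sync"
	"time"
)

// Recorder is a hypothetical in-memory sink that keeps the latest sample per metric key.
type Recorder struct {
	mu      sync.Mutex
	samples map[string]float64
}

// NewRecorder returns an empty Recorder.
func NewRecorder() *Recorder {
	return &Recorder{samples: make(map[string]float64)}
}

// Set stores the latest value observed for key.
func (r *Recorder) Set(key string, v float64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.samples[key] = v
}

// Get returns the latest value for key and whether it was ever recorded.
func (r *Recorder) Get(key string) (float64, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	v, ok := r.samples[key]
	return v, ok
}

// MeasureSince records the seconds elapsed since start under key.
func (r *Recorder) MeasureSince(key string, start time.Time) {
	r.Set(key, time.Since(start).Seconds())
}

// BuildBlock is a stand-in for block building: it "executes" each transaction,
// sums the gas, and records building time, execution time, and gas used.
func BuildBlock(r *Recorder, txGas []uint64, execute func(gas uint64)) uint64 {
	buildStart := time.Now()
	// Deferred so the building time also covers everything after execution.
	defer r.MeasureSince("consensus.block_building_time", buildStart)

	execStart := time.Now()
	var gasUsed uint64
	for _, g := range txGas {
		execute(g)
		gasUsed += g
	}
	r.MeasureSince("consensus.block_execution_time", execStart)
	r.Set("consensus.block_space_used", float64(gasUsed))

	return gasUsed
}
```

Calling `BuildBlock(NewRecorder(), []uint64{21000, 50000}, func(uint64) {})` would record a `block_space_used` sample equal to the summed gas; the two timing samples depend on the machine.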

Network metrics:

  • network egress_bytes - measures the number of bytes sent over the P2P network in the gossip protocol.
  • network ingress_bytes - measures the number of bytes received over the P2P network in the gossip protocol.
  • network bad_messages - counter of bad messages received over the P2P network in the gossip protocol.
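The byte and bad-message counters above can be pictured as simple atomic counters updated on the gossip publish and receive paths. The sketch below is an assumption-laden illustration: `GossipStats`, `OnPublish`, `OnReceive`, and `DecodeNonEmpty` are hypothetical names, and "bad message" is modelled here simply as "the payload failed to decode".

```go
package main

import (
	"errors"
	"sync/atomic"
)

// GossipStats is a hypothetical set of counters mirroring the three network metrics.
type GossipStats struct {
	egressBytes  uint64
	ingressBytes uint64
	badMessages  uint64
}

// OnPublish accounts for a payload sent over the gossip protocol.
func (s *GossipStats) OnPublish(payload []byte) {
	atomic.AddUint64(&s.egressBytes, uint64(len(payload)))
}

// OnReceive accounts for a received payload and counts it as bad if decoding fails.
func (s *GossipStats) OnReceive(payload []byte, decode func([]byte) error) {
	atomic.AddUint64(&s.ingressBytes, uint64(len(payload)))
	if err := decode(payload); err != nil {
		atomic.AddUint64(&s.badMessages, 1)
	}
}

// Egress returns the total number of bytes sent so far.
func (s *GossipStats) Egress() uint64 { return atomic.LoadUint64(&s.egressBytes) }

// Ingress returns the total number of bytes received so far.
func (s *GossipStats) Ingress() uint64 { return atomic.LoadUint64(&s.ingressBytes) }

// Bad returns the number of messages that failed to decode.
func (s *GossipStats) Bad() uint64 { return atomic.LoadUint64(&s.badMessages) }

// DecodeNonEmpty is a toy decoder used for illustration: it rejects empty payloads.
func DecodeNonEmpty(p []byte) error {
	if len(p) == 0 {
		return errors.New("empty message")
	}
	return nil
}
```

Atomic counters are used so that concurrent gossip handlers can update the totals without a lock.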

TX Pool Metrics:

  • txpool slots_used - number of slots currently occupied in the pool, measured over time.
  • txpool invalid_tx_type - counter of transactions that had invalid tx type.
  • txpool oversized_data_txs - counter of transactions whose size exceeds the maximum allowed transaction size.
  • txpool negative_value_tx - counter of transactions that had negative Value field.
  • txpool invalid_signature_txs - counter of transactions with invalid signature.
  • txpool invalid_sender_txs - counter of transactions with invalid sender.
  • txpool contract_deploy_too_large_txs - counter of contract deployment transactions where contract size is too large.
  • txpool underpriced_tx - counter of underpriced transactions.
  • txpool fee_cap_too_high_dynamic_tx - counter of dynamic (EIP-1559) transactions whose fee cap is too high.
  • txpool tip_too_high_dynamic_tx - counter of dynamic (EIP-1559) transactions whose gas tip cap is too high.
  • txpool tip_above_fee_cap_dynamic_tx - counter of dynamic (EIP-1559) transactions whose gas tip cap is above gas fee cap.
  • txpool nonce_too_low_tx - counter of transactions whose nonce was too low.
  • txpool invalid_account_state_tx - counter of transactions whose account state was invalid.
  • txpool insufficient_funds_tx - counter of transactions whose sender did not have sufficient funds to execute them.
  • txpool invalid_intrinsic_gas_tx - counter of transactions whose transaction gas cost could not be calculated.
  • txpool intrinsic_gas_low_tx - counter of transactions whose gas is lower than the calculated transaction gas cost.
  • txpool block_gas_limit_exceeded_tx - counter of transactions whose gas cost exceeds the block gas limit.
  • txpool rejected_future_tx - counter of rejected transactions that have a future nonce.
  • txpool already_known_tx - counter of transactions that were not added to the pool because they are already known.
  • txpool added_tx - measures the number of transactions added to the pool over time.
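Most of the txpool counters above correspond to one rejection reason each, so one way to picture them is a lookup from validation error to counter name. The following is a minimal sketch under that assumption; the sentinel errors, the `Counters` type, and `RecordAddResult` are hypothetical and only a few of the listed metrics are included.

```go
package main

import (
	"errors"
	"sync"
)

// Hypothetical sentinel errors standing in for the pool's validation errors.
var (
	ErrNonceTooLow  = errors.New("nonce too low")
	ErrUnderpriced  = errors.New("transaction underpriced")
	ErrAlreadyKnown = errors.New("already known")
)

// rejectionMetrics pairs each sentinel error with a counter name from the list above.
var rejectionMetrics = []struct {
	err  error
	name string
}{
	{ErrNonceTooLow, "txpool.nonce_too_low_tx"},
	{ErrUnderpriced, "txpool.underpriced_tx"},
	{ErrAlreadyKnown, "txpool.already_known_tx"},
}

// Counters is a hypothetical in-memory counter store.
type Counters struct {
	mu     sync.Mutex
	counts map[string]int
}

// NewCounters returns an empty counter store.
func NewCounters() *Counters {
	return &Counters{counts: make(map[string]int)}
}

// Incr adds one to the named counter.
func (c *Counters) Incr(name string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[name]++
}

// Value returns the current value of the named counter.
func (c *Counters) Value(name string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[name]
}

// RecordAddResult bumps added_tx on success, or the counter matching the rejection reason.
// Errors that match no known reason are left uncounted in this sketch.
func RecordAddResult(c *Counters, err error) {
	if err == nil {
		c.Incr("txpool.added_tx")
		return
	}
	for _, m := range rejectionMetrics {
		if errors.Is(err, m.err) {
			c.Incr(m.name)
			return
		}
	}
}
```

Using `errors.Is` means a wrapped validation error would still be attributed to the right counter.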

JSON RPC metrics:

Each JSON RPC endpoint function has a metric that measures its execution time.
Each JSON RPC endpoint function also has a counter metric for the total number of errors that occurred during its execution.

For example:
json_rpc eth_getBlockByNumber_time - measures the execution time of the eth_getBlockByNumber endpoint.
json_rpc eth_getBlockByNumber_errors - counts the errors that occurred in calls to the eth_getBlockByNumber endpoint.
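The per-endpoint naming scheme described above (`<method>_time` and `<method>_errors`) lends itself to a single wrapper around every handler call. Below is a minimal sketch of that idea; `EndpointStats`, `Observe`, and `ErrBlockNotFound` are hypothetical names for illustration and do not describe how the dispatcher in this PR is actually wired.

```go
package main

import (
	"errors"
	"sync"
	"time"
)

// ErrBlockNotFound is a hypothetical handler error used for illustration.
var ErrBlockNotFound = errors.New("block not found")

// EndpointStats is a hypothetical store keyed by "<method>_time" and "<method>_errors".
type EndpointStats struct {
	mu        sync.Mutex
	durations map[string]time.Duration
	errs      map[string]int
}

// NewEndpointStats returns an empty store.
func NewEndpointStats() *EndpointStats {
	return &EndpointStats{
		durations: make(map[string]time.Duration),
		errs:      make(map[string]int),
	}
}

// Observe runs handler for the given method, records its execution time,
// and increments the error counter if the handler returned an error.
func (s *EndpointStats) Observe(method string, handler func() (interface{}, error)) (interface{}, error) {
	start := time.Now()
	res, err := handler()
	elapsed := time.Since(start)

	s.mu.Lock()
	defer s.mu.Unlock()
	s.durations[method+"_time"] = elapsed
	if err != nil {
		s.errs[method+"_errors"]++
	}

	return res, err
}

// LastDuration returns the most recent execution time recorded for method.
func (s *EndpointStats) LastDuration(method string) (time.Duration, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	d, ok := s.durations[method+"_time"]
	return d, ok
}

// Errors returns the number of failed calls recorded for method.
func (s *EndpointStats) Errors(method string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.errs[method+"_errors"]
}
```

Wrapping at the dispatch point keeps the individual endpoint functions free of metrics code, at the cost of measuring dispatch overhead together with the handler itself.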

Changes include

  • Bugfix (non-breaking change that solves an issue)
  • Hotfix (change that solves an urgent issue, and requires immediate attention)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (change that is not backwards-compatible and/or changes current functionality)

Checklist

  • I have assigned this PR to myself
  • I have added at least 1 reviewer
  • I have added the relevant labels
  • I have updated the official documentation
  • I have added sufficient documentation in code

Testing

  • I have tested this code with the official test suite
  • I have tested this code manually

@goran-ethernal goran-ethernal self-assigned this May 29, 2023
@goran-ethernal goran-ethernal added the feature New update to Polygon Edge label May 29, 2023
@goran-ethernal goran-ethernal requested review from a team and praetoriansentry May 29, 2023 13:11
@goran-ethernal goran-ethernal marked this pull request as ready for review May 29, 2023 13:11

@Stefan-Ethernal Stefan-Ethernal left a comment

We haven't implemented these metrics, related to Level DB:

Resource | Type | Metric Name
DB | Response | Trie Read/Write/Commit Times
DB | Utilization | leveldb IOPs
DB | Response | leveldb IO Latency
DB | Response | Compaction time

Also not sure if this can be deduced from one of the provided metrics:

Resource | Type | Metric Name
P2P | Utilization | Messages (TX, Headers, Bodies) per second

@goran-ethernal

> We haven't implemented these metrics, related to Level DB:
>
> Resource | Type | Metric Name
> DB | Response | Trie Read/Write/Commit Times
> DB | Utilization | leveldb IOPs
> DB | Response | leveldb IO Latency
> DB | Response | Compaction time
>
> Also not sure if this can be deduced from one of the provided metrics:
>
> Resource | Type | Metric Name
> P2P | Utilization | Messages (TX, Headers, Bodies) per second

As John said, those are just proposals. Implementing the leveldb metrics is not easy, and I am not sure how they would impact performance if we measured them every second. As for the P2P metrics: yes, they can be seen in DataDog, since the graph plots the measured values over time. You can zoom in to roughly one second on the graph and see how many messages were received in that interval.
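To make the "messages per second" point concrete: a per-second rate can be derived from any two samples of a cumulative counter, which is roughly what a dashboard does when zooming in. A minimal sketch, assuming hypothetical `Sample` values with millisecond timestamps (this is not DataDog's API):

```go
package main

// Sample is a hypothetical reading of a cumulative counter at a point in time.
type Sample struct {
	AtMillis int64  // timestamp in milliseconds
	Total    uint64 // cumulative count at that time
}

// RatePerSecond returns the average per-second increase between two samples.
// It returns 0 if the samples are not in time order or the counter was reset.
func RatePerSecond(prev, cur Sample) float64 {
	dtMillis := cur.AtMillis - prev.AtMillis
	if dtMillis <= 0 || cur.Total < prev.Total {
		return 0
	}
	return float64(cur.Total-prev.Total) / (float64(dtMillis) / 1000.0)
}
```

For example, a counter that goes from 100 to 160 over 2000 ms corresponds to 30 messages per second.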

@Stefan-Ethernal Stefan-Ethernal left a comment

We should probably document all the available metrics somewhere on the wiki.

@vcastellm vcastellm left a comment

LGTM

@Stefan-Ethernal Stefan-Ethernal force-pushed the EVM-675-implement-metrics-proposed-by-john branch from d5f4387 to ba3a2d9 Compare June 6, 2023 07:15
@goran-ethernal goran-ethernal merged commit ef33b05 into develop Jun 7, 2023
7 checks passed