[Feature] feat: more TPS metrics #3147

Merged (2 commits) on Mar 5, 2024
17 changes: 16 additions & 1 deletion node/consensus/src/lib.rs
@@ -212,6 +212,11 @@ impl<N: Network> Consensus<N> {
  impl<N: Network> Consensus<N> {
      /// Adds the given unconfirmed solution to the memory pool.
      pub async fn add_unconfirmed_solution(&self, solution: ProverSolution<N>) -> Result<()> {
+         #[cfg(feature = "metrics")]
Contributor

I think we should probably move this to after the unconfirmed solutions and transmissions are added, since there are a few points where we fail or return early.

Contributor Author

Initially I did that, but @vicsn asked me to move it to the top.

Contributor

Yep, I saw that. If we want it at the top, I think we'd have to add metrics-flagged code to decrement the gauge on failure or early return, which probably isn't optimal.
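
For illustration only: if the increment stayed at the top and the gauge were meant to reflect only accepted items, one way to avoid a manual decrement on every exit path would be a drop guard. The sketch below is a minimal example under stated assumptions. `Gauge`, `GaugeGuard`, and `add_item` are hypothetical names built on std atomics; they are not part of the snarkOS `metrics` crate or of this PR.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

/// Hypothetical stand-in for a metrics gauge (not the snarkOS `metrics` API).
pub struct Gauge(AtomicI64);

impl Gauge {
    pub const fn new() -> Self {
        Self(AtomicI64::new(0))
    }
    pub fn increment(&self, by: i64) {
        self.0.fetch_add(by, Ordering::Relaxed);
    }
    pub fn decrement(&self, by: i64) {
        self.0.fetch_sub(by, Ordering::Relaxed);
    }
    pub fn value(&self) -> i64 {
        self.0.load(Ordering::Relaxed)
    }
}

/// Increments the gauge when created and undoes the increment on drop,
/// unless `commit` was called first.
pub struct GaugeGuard<'a> {
    gauge: &'a Gauge,
    committed: bool,
}

impl<'a> GaugeGuard<'a> {
    pub fn new(gauge: &'a Gauge) -> Self {
        gauge.increment(1);
        Self { gauge, committed: false }
    }

    /// Keeps the increment; call this once the item has actually been queued.
    pub fn commit(mut self) {
        self.committed = true;
    }
}

impl Drop for GaugeGuard<'_> {
    fn drop(&mut self) {
        if !self.committed {
            self.gauge.decrement(1);
        }
    }
}

/// Hypothetical handler with an early return, loosely shaped like
/// `add_unconfirmed_solution`.
pub fn add_item(gauge: &Gauge, is_valid: bool) -> Result<(), String> {
    let guard = GaugeGuard::new(gauge);
    if !is_valid {
        // Early return: `guard` is dropped here and the increment is undone.
        return Err("rejected".to_string());
    }
    guard.commit();
    Ok(())
}
```

With this shape, a rejected item leaves the gauge unchanged and an accepted one raises it by one. Whether that is desirable depends on what the gauge is supposed to measure: arrivals or accepted items.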

Contributor Author

I think the point is to see the difference between how many come in and how many actually make it into a block.
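
As a rough illustration of that comparison, the sketch below derives the gap and the inclusion ratio from two totals. It assumes the totals have already been read from wherever the metrics are exported. `TransmissionTotals` and its field names are hypothetical and do not correspond to any type in the codebase.

```rust
/// Hypothetical snapshot of two metric totals; names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct TransmissionTotals {
    /// Transmissions counted on arrival, before any validation.
    pub received: u64,
    /// Transmissions that ended up in committed blocks.
    pub included: u64,
}

impl TransmissionTotals {
    /// Number of received transmissions that have not made it into a block.
    pub fn not_included(&self) -> u64 {
        self.received.saturating_sub(self.included)
    }

    /// Fraction of received transmissions that made it into a block,
    /// or `None` if nothing has been received yet.
    pub fn inclusion_ratio(&self) -> Option<f64> {
        if self.received == 0 {
            None
        } else {
            Some(self.included as f64 / self.received as f64)
        }
    }
}
```

For example, 200 received and 150 included gives a gap of 50 and a ratio of 0.75.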

Contributor

@miazn, Mar 4, 2024

Fair, although in that case it would probably be better to track it from the delivering side, i.e. whatever we are sending. I think that in this case we are emitting metrics one layer too deep in the code, since what we really want is to emit metrics whenever these POST endpoints are called. If we just want to track how many times those endpoints are hit, it might be cleaner to move it there. @vicsn, what do you think?

Contributor

The rationale for putting it in add_unconfirmed_{transaction, solution} is that we also count transactions received via the router.

I think measuring it on the delivering side (i.e. tx-cannon) would be even better, but that requires more coordination and time to figure out.
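
To make the trade-off concrete, here is a minimal sketch of why counting at the shared entry point captures more than counting at the REST handler. Everything in it is hypothetical: `Counters`, `rest_post`, `router_receive`, and `add_unconfirmed` are stand-ins for the real call paths, which are not shown in this diff.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical counters, one per candidate instrumentation point.
#[derive(Default)]
pub struct Counters {
    pub at_rest_endpoint: AtomicU64,
    pub at_consensus: AtomicU64,
}

/// Stand-in for `add_unconfirmed_transaction`: the entry point shared by all sources.
pub fn add_unconfirmed(counters: &Counters) {
    counters.at_consensus.fetch_add(1, Ordering::Relaxed);
}

/// Stand-in for a REST POST handler: counts the hit, then forwards to consensus.
pub fn rest_post(counters: &Counters) {
    counters.at_rest_endpoint.fetch_add(1, Ordering::Relaxed);
    add_unconfirmed(counters);
}

/// Stand-in for the router path: forwards to consensus without touching the REST handler.
pub fn router_receive(counters: &Counters) {
    add_unconfirmed(counters);
}
```

With two REST submissions and three router deliveries, the endpoint counter reads 2 while the consensus-level counter reads 5, which is the gap the comment above describes.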

+         {
+             metrics::increment_gauge(metrics::consensus::UNCONFIRMED_SOLUTIONS, 1f64);
+             metrics::increment_gauge(metrics::consensus::UNCONFIRMED_TRANSMISSIONS, 1f64);
+         }
          // Process the unconfirmed solution.
          {
              let solution_id = solution.commitment();
@@ -265,6 +270,11 @@ impl<N: Network> Consensus<N> {

      /// Adds the given unconfirmed transaction to the memory pool.
      pub async fn add_unconfirmed_transaction(&self, transaction: Transaction<N>) -> Result<()> {
+         #[cfg(feature = "metrics")]
+         {
+             metrics::increment_gauge(metrics::consensus::UNCONFIRMED_TRANSACTIONS, 1f64);
+             metrics::increment_gauge(metrics::consensus::UNCONFIRMED_TRANSMISSIONS, 1f64);
+         }
          // Process the unconfirmed transaction.
          {
              let transaction_id = transaction.id();
@@ -405,9 +415,14 @@ impl<N: Network> Consensus<N> {
  let elapsed = std::time::Duration::from_secs((snarkos_node_bft::helpers::now() - start) as u64);
  let next_block_timestamp = next_block.header().metadata().timestamp();
  let block_latency = next_block_timestamp - current_block_timestamp;
+ let num_sol = next_block.solutions().len();
+ let num_tx = next_block.transactions().len();
+ let num_transmissions = num_tx + num_sol;

  metrics::gauge(metrics::blocks::HEIGHT, next_block.height() as f64);
- metrics::increment_gauge(metrics::blocks::TRANSACTIONS, next_block.transactions().len() as f64);
+ metrics::increment_gauge(metrics::blocks::SOLUTIONS, num_sol as f64);
+ metrics::increment_gauge(metrics::blocks::TRANSACTIONS, num_tx as f64);
+ metrics::increment_gauge(metrics::blocks::TRANSMISSIONS, num_transmissions as f64);
  metrics::gauge(metrics::consensus::LAST_COMMITTED_ROUND, next_block.round() as f64);
  metrics::gauge(metrics::consensus::COMMITTED_CERTIFICATES, num_committed_certificates as f64);
  metrics::histogram(metrics::consensus::CERTIFICATE_COMMIT_LATENCY, elapsed.as_secs_f64());