Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Add a sub_libp2p_notifications_queues_size Prometheus metric #5503

Merged
merged 4 commits into from
Apr 3, 2020

Conversation

tomaka
Copy link
Contributor

@tomaka tomaka commented Apr 2, 2020

Reports on a Histogram, for each protocol, the queue size.

This means that if we send for example 2 messages, then we will report 2 then 1 then 0, or 2 then 0. Therefore the "sum" is rather meaningless, as when sending 2 messages it could be either 3 or 2.

Instead, the idea is to use the "buckets" feature of Prometheus to see the number of times where we reached a certain value.

One question is: should be really call observe() when we flush the queue? Should we not call it only when we push a message, so that the value represents the size that a queue had when we sent a message?

@tomaka tomaka added A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes labels Apr 2, 2020
@tomaka tomaka requested a review from mxinden April 2, 2020 13:24
Copy link
Contributor

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. Other than the observe on flushing none of my comments should block this pull request.

@@ -56,14 +57,17 @@ const INITIAL_KEEPALIVE_TIME: Duration = Duration::from_secs(5);
pub struct NotifsOutHandlerProto {
/// Name of the protocol to negotiate.
protocol_name: Cow<'static, [u8]>,
/// Optional prometheus histogram to report message queue sizes variations.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Optional prometheus histogram to report message queue sizes variations.
/// Optional Prometheus histogram to report message queue size variations.

.map(|(p, _)| {
let queue_size_report = queue_size_report.as_ref().and_then(|qs| {
if let Ok(utf8) = str::from_utf8(&p) {
Some(qs.with_label_values(&[utf8]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for only passing a single Histogram.

@@ -103,6 +108,9 @@ pub struct NotifsOutHandler {
/// When the connection with the remote has been successfully established.
when_connection_open: Instant,

/// Optional prometheus histogram to report message queue sizes variations.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Optional prometheus histogram to report message queue sizes variations.
/// Optional prometheus histogram to report message queue size variations.

@@ -259,6 +265,7 @@ impl<B: BlockT + 'static, H: ExHashT> NetworkWorker<B, H> {
params.block_announce_validator,
params.metrics_registry.as_ref(),
boot_node_ids.clone(),
metrics.as_ref().map(|m| m.notifications_queues_size.clone()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's say we would like a Protocol to expose more metrics in the future. In that case wouldn't it make more sense to passs a &Registry down to Protocol::new and have it register its own Metrics struct?

This is a preference, not a requirement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My general idea about this code is that is would eventually completely disappear after #5481. I'm trying as much as possible to be fully predictable/deterministic and report everything that is worth notifying to the parent.

Poll::Pending | Poll::Ready(Ok(())) => {},
Poll::Pending | Poll::Ready(Ok(())) =>
if let Some(metric) = &self.queue_size_report {
metric.observe(substream.queue_len() as f64);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question is: should be really call observe() when we flush the queue? Should we not call it only when we push a message, so that the value represents the size that a queue had when we sent a message?

I don't think we should call observe when flushing the queue. From the top of my head I don't see how that would help.

@tomaka tomaka merged commit 3e18235 into paritytech:master Apr 3, 2020
@tomaka tomaka deleted the prometheus-queue-sizes branch April 3, 2020 06:44
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants