initial prometheus metrics #1536

ordian · 2020-08-04T13:34:52Z

Part of #1482.

Number of statements signed
Number of bitfields signed
Number of availability chunks received
Number of candidates seconded
Number of collations generated
Number of Runtime API errors encountered
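Each item in the list above is a monotonically increasing count, which maps naturally onto a Prometheus counter. The following is a minimal std-only sketch of that shape. `Counter` and `NodeCounters` are hypothetical stand-ins built on atomics, not the types from the `prometheus` crate or from this PR, and the field names are illustrative.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for a Prometheus counter: a value that only goes up.
#[derive(Default)]
pub struct Counter(AtomicU64);

impl Counter {
    /// Increments the counter by one.
    pub fn inc(&self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }

    /// Reads the current value.
    pub fn get(&self) -> u64 {
        self.0.load(Ordering::Relaxed)
    }
}

/// One counter per event named in the PR description (field names are assumptions).
#[derive(Default)]
pub struct NodeCounters {
    pub statements_signed: Counter,
    pub bitfields_signed: Counter,
    pub availability_chunks_received: Counter,
    pub candidates_seconded: Counter,
    pub collations_generated: Counter,
    pub runtime_api_errors: Counter,
}
```

A subsystem would call `inc()` at the point where the event happens, for example right after a statement has been signed.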


mxinden

Looks good to me overall. I have a couple smaller comments.

Regarding naming, I've read https://prometheus.io/docs/practices/naming/, but it's not clear if all metrics should have a prefix like "parachains_" (I guess there is namespacing in prometheus?).

I think prefixing each parachain-related metric with parachain is a good idea. The same is done for Substrate networking with sub_libp2p.
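To make the naming discussion concrete, here is a small sketch that builds a counter name from a prefix and checks it against the character set Prometheus accepts for metric names (`[a-zA-Z_:][a-zA-Z0-9_:]*`). Both helper functions are hypothetical and not part of this PR. The `parachain` prefix and the `_total` suffix follow the conventions mentioned in the review.

```rust
/// Builds a counter name of the form `<prefix>_<event>_total` (hypothetical helper).
pub fn counter_name(prefix: &str, event: &str) -> String {
    format!("{}_{}_total", prefix, event)
}

/// Checks a name against the Prometheus metric name pattern
/// `[a-zA-Z_:][a-zA-Z0-9_:]*`.
pub fn is_valid_metric_name(name: &str) -> bool {
    let mut chars = name.chars();
    // The first character may not be a digit.
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' || c == ':' => {}
        _ => return false,
    }
    // Every remaining character must be alphanumeric, `_`, or `:`.
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_' || c == ':')
}
```

Note that hyphens are rejected, so a kebab-case subsystem name cannot be reused verbatim as part of a metric name.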

node/core/av-store/src/lib.rs

node/core/backing/src/lib.rs

node/network/statement-distribution/src/lib.rs

node/overseer/src/lib.rs

mxinden · 2020-08-12T13:03:07Z

node/subsystem/src/lib.rs

@@ -195,6 +197,9 @@ pub trait SubsystemContext: Send + 'static {
 /// [`Overseer`]: struct.Overseer.html
 /// [`Subsystem`]: trait.Subsystem.html
 pub trait Subsystem<C: SubsystemContext> {
+	/// Subsystem-specific prometheus metrics.
+	type Metrics: metrics::Metrics;


👍 for making Metrics a core component of Subsystems. I think this is a very clean approach.

In the future we should probably discuss whether our way of recording metrics should trickle down further than the level of Subsystem, or whether the core logic should stay generic. For example, sc-network handles all metric recording within NetworkWorker, and lower-level components bubble up the data needed to feed these metrics. finality-grandpa, on the other hand, passes a metrics registry all the way down to the bottom components.
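The associated `Metrics` type from the diff above can be pictured roughly as follows. This is a std-only sketch under stated assumptions: `Registry` is a hypothetical stand-in that only tracks names, the `try_register`/`register` signatures and the simplified `Subsystem` trait are assumptions rather than the PR's actual API, and `BackingMetrics` with its metric name is purely illustrative.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a Prometheus registry: it only tracks metric names.
#[derive(Default)]
pub struct Registry {
    names: HashSet<String>,
}

impl Registry {
    /// Registers a metric name and rejects duplicates.
    pub fn register(&mut self, name: &str) -> Result<(), String> {
        if self.names.insert(name.to_string()) {
            Ok(())
        } else {
            Err(format!("duplicate metric: {}", name))
        }
    }
}

/// Sketch of a per-subsystem metrics trait (signatures are assumptions).
pub trait Metrics: Default + Clone {
    /// Tries to register all metrics of one subsystem.
    fn try_register(registry: &mut Registry) -> Result<Self, String>;

    /// Falls back to default (no-op) metrics when no registry is configured.
    fn register(registry: Option<&mut Registry>) -> Result<Self, String> {
        match registry {
            Some(r) => Self::try_register(r),
            None => Ok(Self::default()),
        }
    }
}

/// Simplified subsystem trait that names its metrics type, mirroring the diff.
pub trait Subsystem {
    type Metrics: Metrics;
}

/// Illustrative metrics for one subsystem; `registered` marks live metrics.
#[derive(Default, Clone, Debug, PartialEq)]
pub struct BackingMetrics {
    pub registered: bool,
}

impl Metrics for BackingMetrics {
    fn try_register(registry: &mut Registry) -> Result<Self, String> {
        registry.register("parachain_candidates_seconded_total")?;
        Ok(BackingMetrics { registered: true })
    }
}
```

With this shape the subsystem owns its metrics, which corresponds to the registry being passed down to the subsystem level and no further.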

node/subsystem/src/lib.rs

ordian · 2020-08-12T13:36:21Z

Another thing that was mentioned in #1482 is alerts, this can be done in a separate PR as well.

mxinden · 2020-08-14T07:01:29Z

Another thing that was mentioned in #1482 is alerts, this can be done in a separate PR as well.

Yes, agree that this can happen in another pull request. You can take a look at Substrate alerting-rules.yaml and the corresponding test file as an example.

node/core/candidate-validation/src/lib.rs

node/core/chain-api/src/lib.rs

node/core/provisioner/src/lib.rs

node/core/runtime-api/src/lib.rs

node/subsystem/src/lib.rs

Co-authored-by: Max Inden <mail@max-inden.de>

mxinden

Looks good from a metrics point of view. Thanks for the work @ordian!

I am not too familiar with the subsystem system, so could anyone more familiar with it take a look as well?

* master: Make parachain validation wasm executor functional (#1574) Use async test helper to simplify node testing (#1578) guide: validation data refactoring (#1576) Remove v0 parachains runtime (#1501) [CI] Add github token to generate-release-text (#1581) Allow using any polkadot client instead of enum Client (#1575) service/src/lib: Update authority discovery construction (#1563) Update .editorconfig to what we have in practice (#1545) Companion PR for substrate #6672 (#1560) pre-redenomination tockenSymbol change (#1561)

* master: Companion PR for #6862 (#1564) implement collation generation subsystem (#1557) Add spawn_blocking to SubsystemContext (#1570) Companion PR for #6846 (#1568) overseer: add a test to ensure all subsystem receive msgs (#1590) Implementer's Guide: Flesh out more details for upward messages (#1556)

node/collation-generation/src/lib.rs

* master: overseer: fix build (#1596)

ordian · 2020-08-17T18:13:33Z

node/collation-generation/src/lib.rs

-		let subsystem = CollationGenerationSubsystem { config: None, metrics: self.metrics };
-
-		let future = Box::pin(subsystem.run(ctx));
+		let future = Box::pin(self.run(ctx));


@coriolinus this is safe since there is no way to construct a subsystem with Some(config).
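The safety argument relies on the `config` field being private and on the constructor always setting it to `None`. A minimal sketch of that invariant follows. `Metrics` and `Config` are empty placeholders, and `is_configured` and `metrics` are hypothetical accessors added only for illustration.

```rust
pub mod collation_generation {
    /// Placeholder for the subsystem's metrics handle (no real counters here).
    #[derive(Clone, Default)]
    pub struct Metrics;

    /// Placeholder configuration type.
    pub struct Config;

    pub struct CollationGenerationSubsystem {
        // Private on purpose: code outside this module cannot set `Some(config)`.
        config: Option<Config>,
        metrics: Metrics,
    }

    impl CollationGenerationSubsystem {
        /// The only public constructor; the subsystem always starts unconfigured.
        pub fn new(metrics: Metrics) -> Self {
            Self { config: None, metrics }
        }

        /// Hypothetical accessor used to observe the invariant.
        pub fn is_configured(&self) -> bool {
            self.config.is_some()
        }

        /// Hypothetical accessor for the metrics handle.
        pub fn metrics(&self) -> &Metrics {
            &self.metrics
        }
    }
}
```

Because every value built outside the module goes through `new`, running `self` directly is equivalent to rebuilding the struct with `config: None`.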

montekki · 2020-08-18T07:17:43Z

node/overseer/src/lib.rs

+/// Overseer Prometheus metrics.
+#[derive(Clone)]
+struct MetricsInner {
+	activated_heads_total: prometheus::Counter<prometheus::U64>,
+	deactivated_heads_total: prometheus::Counter<prometheus::U64>,
+}


In the overseer I would probably also account for things such as the number of messages relayed between subsystems.
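A relayed-messages counter could sit next to the two head counters from the diff. The sketch below is std-only and makes several assumptions: `Counter` is an atomic stand-in for `prometheus::Counter`, the `Metrics(Option<MetricsInner>)` wrapper and the `on_*` method names are illustrative, and `messages_relayed_total` is a hypothetical metric that this PR does not add.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Atomic stand-in for a Prometheus counter; clones share the same value.
#[derive(Clone, Default)]
pub struct Counter(Arc<AtomicU64>);

impl Counter {
    pub fn inc(&self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }

    pub fn get(&self) -> u64 {
        self.0.load(Ordering::Relaxed)
    }
}

#[derive(Clone, Default)]
struct MetricsInner {
    activated_heads_total: Counter,
    deactivated_heads_total: Counter,
    // Hypothetical addition suggested in the review, not part of this PR.
    messages_relayed_total: Counter,
}

/// `None` means metrics are disabled and every call is a no-op.
#[derive(Clone, Default)]
pub struct Metrics(Option<MetricsInner>);

impl Metrics {
    /// Creates metrics that actually record values.
    pub fn enabled() -> Self {
        Metrics(Some(MetricsInner::default()))
    }

    pub fn on_head_activated(&self) {
        if let Some(m) = &self.0 {
            m.activated_heads_total.inc();
        }
    }

    pub fn on_head_deactivated(&self) {
        if let Some(m) = &self.0 {
            m.deactivated_heads_total.inc();
        }
    }

    pub fn on_message_relayed(&self) {
        if let Some(m) = &self.0 {
            m.messages_relayed_total.inc();
        }
    }

    /// Returns the relayed-message count, or `None` when metrics are disabled.
    pub fn messages_relayed(&self) -> Option<u64> {
        self.0.as_ref().map(|m| m.messages_relayed_total.get())
    }

    /// Returns the activated-head count, or `None` when metrics are disabled.
    pub fn activated_heads(&self) -> Option<u64> {
        self.0.as_ref().map(|m| m.activated_heads_total.get())
    }
}
```

The overseer would call `on_message_relayed()` at the point where it forwards a message to a subsystem.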

ordian · 2020-08-18T09:18:15Z

I'll merge the PR as is and comment in the issue on the remaining metrics.

bot merge

ordian added 2 commits August 4, 2020 13:56

service-new: cosmetic changes

5f5ee13

overseer: draft of prometheus metrics

b6e221d

ordian added A3-in_progress Pull request is in progress. No review needed at this stage. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. labels Aug 4, 2020

ordian and others added 25 commits August 4, 2020 15:36

metrics: update active_leaves metrics

635b861

Merge branch 'master' into ao-prometheus-metrics

e9b4ee0

* master: Network Bridge Refactoring (#1535) Use DNS hostnames for westend bootnodes (#1528) Companion PR: add weightinfo for democracy (#1522)

metrics: extract into functions

2e72851

metrics: resolve XXX

e9a9e99

metrics: it's ugly, but it works

7fac800

Merge branch 'master' into ao-prometheus-metrics

844783a

* master: [CI] Fix draft release publishing (#1546) Bump Substrate, version (#1541) Update check_labels runtimenoteworthy label (#1540) revert enabling authority discovery by default (#1532)

Bump Substrate

b0695da

Merge branch 'master' into ao-prometheus-metrics

bdc89ca

* master: Companion PR to delaying network startup to after initialization (#1547) Add SyncOracle to network's Service (#1543) Ignore checks for companion PRs (#1455) Add an Origin to parachains v1 (#1542)

metrics: move a bunch of code around

7d0f6b4

Bumb substrate again

f066dd6

Merge remote-tracking branch 'origin/gav-upsub' into ao-prometheus-me…

aa1f323

…trics * origin/gav-upsub: Bumb substrate again Bump Substrate

metrics: fix a warning

b255fe4

fix a warning in runtime

f6edf36

Merge branch 'master' into ao-prometheus-metrics

8303724

* master: Bump Substrate (#1548)

metrics: statements signed

22d3a06

metrics: statements impl RegisterMetrics

984068b

metrics: refactor Metrics trait

3983563

metrics: add Metrics assoc type to JobTrait

688ade3

Merge branch 'master' into ao-prometheus-metrics

b1680b6

* master: implement provisioner (#1473) Implementer's guide: downward messages and HRMP, take 2 (#1503) Revert "Ignore checks for companion PRs (#1455)" (#1549)

metrics: move Metrics trait to util

62f4605

metrics: fix overseer

7baf6f5

metrics: fix backing

94008e4

metrics: fix candidate validation

92230d5

metrics: derive Default

2c56906

utils: add a comment for job metrics

3cb35db

mxinden reviewed Aug 12, 2020

ordian added 3 commits August 12, 2020 15:26

metrics: address review comments

94ece8b

metrics: oops

4dc91a4

metrics: make sure to save files before commit 😅

f660101

mxinden reviewed Aug 14, 2020

use _total suffix for requests metrics

f046511

Co-authored-by: Max Inden <mail@max-inden.de>

mxinden approved these changes Aug 14, 2020

ordian added 2 commits August 14, 2020 12:45

metrics: add tests for overseer

1f78b33

rphmeier requested a review from montekki August 17, 2020 11:11

ordian added 4 commits August 17, 2020 14:37

update Cargo.lock

531c40c

overseer: add a test for CollationGeneration

483d8ab

collation-generation: impl metrics

62b5066

ordian commented Aug 17, 2020

node/collation-generation/src/lib.rs

ordian commented Aug 17, 2020

node/collation-generation/src/lib.rs

ordian added 3 commits August 17, 2020 19:54

Merge branch 'master' into ao-prometheus-metrics

35713db

* master: overseer: fix build (#1596)

collation-generation: use kebab-case for name

87d5979

collation-generation: add a constructor

0206b79

ordian commented Aug 17, 2020

montekki approved these changes Aug 18, 2020

ordian merged commit 804958a into master Aug 18, 2020

ordian deleted the ao-prometheus-metrics branch August 18, 2020 09:18

ordian mentioned this pull request Sep 14, 2021

add dispute metrics, some chores #3842

Merged

ordian mentioned this pull request Sep 27, 2022

Member Request polkadot-fellows/seeding#18

Merged
