Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CAD-2371 Metrics for blocks lost due to switching #2321

Conversation

newhoggy
Copy link
Contributor

No description provided.

@newhoggy newhoggy changed the title Metrics for blocks gained and lost due to switching [CAD-2371] Metrics for blocks gained and lost due to switching Feb 1, 2021
@newhoggy newhoggy changed the title [CAD-2371] Metrics for blocks gained and lost due to switching CAD-2371 Metrics for blocks gained and lost due to switching Feb 1, 2021
@newhoggy newhoggy force-pushed the CAD-2371-metrics-for-blocks-gained-and-lost-due-to-switching branch from b1f6803 to b5c5938 Compare February 1, 2021 20:10
@newhoggy newhoggy force-pushed the CAD-2371-metrics-for-blocks-gained-and-lost-due-to-switching branch 4 times, most recently from 9fc85dc to 5a15096 Compare February 15, 2021 04:11
@newhoggy newhoggy force-pushed the CAD-2371-metrics-for-blocks-gained-and-lost-due-to-switching branch from 5a15096 to ae05e7c Compare February 15, 2021 04:37
@newhoggy newhoggy marked this pull request as ready for review February 15, 2021 04:37
@newhoggy newhoggy force-pushed the CAD-2371-metrics-for-blocks-gained-and-lost-due-to-switching branch from ae05e7c to 2955cfd Compare February 15, 2021 05:15
@newhoggy newhoggy changed the title CAD-2371 Metrics for blocks gained and lost due to switching CAD-2371 Metrics for blocks lost due to switching Feb 15, 2021
@@ -969,19 +997,23 @@ data ChainInformation = ChainInformation
, slotInEpoch :: Word64
-- ^ Relative slot number of the tip of the current chain within the
-- epoch.
, blocksLost :: Int64
-- ^ Blocks lost due to switch at fork
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you do end up following my overall proposal, then this comment is wrong and also the name is misleading.

I'd call it myBlocksSelected, an the comment would say something along the lines of: it's the difference between the numbers of my blocks on the latest selection minus that same number as of the last restart. Thus comparing myBlocksSelected to the number of blocks I've forged since the last restart shows how many blocks I've forged for naught.

Copy link
Contributor

@nfrisby nfrisby Feb 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like that ^^^ plan would require another new metric that simply increments with each TraceForgedBlock event.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the code to what I think you mean.

Copy link
Contributor

@nfrisby nfrisby Feb 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, after a little more thinking, I actually think one metric will suffice, and it can be "blocks forged since last restart but not currently selected". (Edit: that's a bit of an overstatement, but it's a reasonable interpretation of this metric as a heuristic.)

TraceForgedBlock should increment the metric. SwitchedToAFork and AddedToCurrentChain should both increment it for each block of ours on the old chain and also decrement it for each block of ours on the new chain.

If we restart and then immediately switch to a fork with more of ours blocks, then this metric might dip negative. But over time, we should generally see this metric gradually increase.

Does that make sense? I'm headed to bed now -- talk to you tomorrow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like TraceForgedBlock shouldn't increment the metric because when we forge a block it is appended to the current chain and is therefore not one of the unselected ones?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TraceForgedBlock just means we created the block and are about to add it to the ChainDB. Once it's added, then the ChainDB might select it, which would incur a separate AddedToCurrentChain event. So decrement for the forge, then increment for the subsequent selection, thus giving us the desired delta of 0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I've used the term "uncoupled" to mean blocks forged not on the current chain, which is the thing being counted.

@newhoggy newhoggy force-pushed the CAD-2371-metrics-for-blocks-gained-and-lost-due-to-switching branch 4 times, most recently from 4c48111 to 483c688 Compare February 15, 2021 08:14
traceCounter "blocksForgedNum" tr
=<< mapForgingCurrentThreadStats fStats
(\fts -> (fts { ftsBlocksForgedNum = ftsBlocksForgedNum fts + 1 },
ftsBlocksForgedNum fts + 1))
meta <- mkLOMeta Critical Public
let tBlocksUncoupled = fsBlocksUncoupled fStats
traceI tr meta "mySelectedBlocks" =<< STM.modifyReadTVarIO tBlocksUncoupled (+ 1)
Copy link
Contributor

@nfrisby nfrisby Feb 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, it's a +1 to the TVar because blocksUncoupledDelta counts blocks not on the current chain. And even though we just forged it, it's not on the chain until we later (though very soon) get a corresponding AddedToCurrentChain, which is handled higher up in the diff. 👍

@nfrisby
Copy link
Contributor

nfrisby commented Feb 24, 2021

@newhoggy Have you discussed this concern with @disassembler ? I haven't.

#2321 (comment)

@newhoggy
Copy link
Contributor Author

Reproducing this comment from IntersectMBO/ouroboros-network#2930 (comment)

Thanks for the update. You mentioned we were using the cold keys, but @disassembler says we should be using the vrf keys.

This makes sense because other nodes use the vrf keys to verify that blocks a legit and it isn't cycled as I understand.

@newhoggy newhoggy force-pushed the CAD-2371-metrics-for-blocks-gained-and-lost-due-to-switching branch 2 times, most recently from 0ea8343 to 5839b30 Compare February 25, 2021 01:03
@nfrisby
Copy link
Contributor

nfrisby commented Feb 26, 2021

After a group convo on Slack, I merged IntersectMBO/ouroboros-network#2930 as-is. So I think this PR will also be ready once it bumps the ouroboros-network dep to something including the current tip of master there:5IntersectMBO/ouroboros-network@7cb1380

My summary of the slack thread: we only use the verification key of the cold key (and we're already using that for Shelley block selection), and using the cold key as "self" for this metric is appropriate.

@newhoggy newhoggy force-pushed the CAD-2371-metrics-for-blocks-gained-and-lost-due-to-switching branch 16 times, most recently from 7454b25 to d192ace Compare March 5, 2021 07:03
…id chaing parser implementations causing tests to fail.
@newhoggy newhoggy force-pushed the CAD-2371-metrics-for-blocks-gained-and-lost-due-to-switching branch from d192ace to e918a65 Compare March 5, 2021 07:30
@newhoggy
Copy link
Contributor Author

newhoggy commented Mar 5, 2021

bors merge

@iohk-bors
Copy link
Contributor

iohk-bors bot commented Mar 5, 2021

@iohk-bors iohk-bors bot merged commit cc3c3cc into master Mar 5, 2021
@iohk-bors iohk-bors bot deleted the CAD-2371-metrics-for-blocks-gained-and-lost-due-to-switching branch March 5, 2021 10:42
@nfrisby
Copy link
Contributor

nfrisby commented Mar 8, 2021

I re-reviewed the new metric itself: still LGTM. I did not review the other eg testing changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants