Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add block production and consensus cycle metrics #1516

Merged
merged 17 commits into from
Apr 21, 2021

Conversation

trianglesphere
Copy link
Contributor

@trianglesphere trianglesphere commented Apr 15, 2021

Description

Adds better metrics for load testing.
Also adds a command line option --metrics.loadtestcsvfile to specify a file to write out information about the block production and consensus cycle in a CSV format. This was previously on a separate branch that was cherry-picked onto the branch that we wanted to loadtest.

Tested

Local load test.

Related issues

Joshua Gutow added 4 commits April 15, 2021 11:21
Adds the following metrics
sleepGauge			consensus/istanbul/backend/sleep
consensusPrepareTimeGuage	consensus/istanbul/core/consensus_prepare
consensusCommitTimeGuage	consensus/istanbul/core/consensus_commit
verifyGauge			consensus/istanbul/core/verify
handlePrePrepareTimer		consensus/istanbul/core/handle_preprepare
handlePrepareTimer		consensus/istanbul/core/handle_prepare
handleCommitTimer		consensus/istanbul/core/handle_commit
blockConstructGuage		miner/worker/block_construct

All metrics are in nanoseconds. Times that occur once per block are recorded
using gauges rather than timers because timers are prometheus summaries, i.e.
they calculate running quantiles from the times and are hard to combine across
multiple nodes. For the more frequent actions, it is still hard to combine
the metrics across multiple nodes, but the quantiles for message handling
are still useful from a single node.
@trianglesphere trianglesphere requested review from gastonponti and a team as code owners April 15, 2021 20:52
@trianglesphere trianglesphere requested review from oneeman and removed request for a team April 15, 2021 20:52
miner/worker.go Outdated Show resolved Hide resolved
@oneeman
Copy link
Contributor

oneeman commented Apr 20, 2021

Unless there's a strong reason against it, I would prefer having to specify a filename for the csv (i.e. instead of a boolean flag to enable csv, a flag that has the filename) and outputting it there rather to stdout. If nothing else, it would mean outputting a clean csv without any unrelated output from geth (e.g. the starting and stopping messages)

@trianglesphere
Copy link
Contributor Author

Unless there's a strong reason against it, I would prefer having to specify a filename for the csv (i.e. instead of a boolean flag to enable csv, a flag that has the filename) and outputting it there rather to stdout. If nothing else, it would mean outputting a clean csv without any unrelated output from geth (e.g. the starting and stopping messages)

Yea, using the filename definitely makes more sense. This was copied what was already done (which at various points filtered out extraneous output). It's better to switch that than emulate the weird behavior here.

Uses it based on a config option instead.

// A CSVRecorder enables easy writing of CSV data a specified writer.
// The header is written on creation. Writing is thread safe.
type CSVRecorder struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say to use https://golang.org/pkg/encoding/csv/ rather than do all the formatting yourself. Generally speaking it's better to use the library rather than DIY, since CSV has many edge cases, like if one of the values has a comma in it (though that shouldn't be the case here). Though it does require the row to consist of strings, whereas here we're getting interface{}, but that seems manageable.

Given the limited nature and relatively simplicity of the CSV we're generating, I don't feel super-strongly about this, but I think it's a best-practice that's always a good idea to follow (especially when the language has csv support built-in and you don't need an external dependency).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, using the the library definitely makes sense.

What do you think about switching the interface to be []string (or varags string) and then calling out the csv library. I think I still want to wrap the library (specifically to be able to eventually close the file).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've switched to using the csv library. It makes things a little harder b/c there's no way easy way to convert primitive types (like int64) to string values outside the fmt package, but it works fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is strconv.FormatInt() for int64, so we could use that. But, yeah, we definitely need to convert to strings one way or another. I don't have a strong preference on whether it should be inside or outside CSVRecorder.

Joshua Gutow added 5 commits April 20, 2021 14:19
Still wrap to be able to close the file and provide locking.
Follow the upstream interface which requires a conversion to strings
in the backend.
metrics/csv_metrics.go Outdated Show resolved Hide resolved
Copy link
Contributor

@mcortesi mcortesi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something i wonder here is that we add a couple of metrics that seems to be designed for the CSV Recorder. if that's the case, we can create & report them ONLY if the csv file was given.

A way to do that is that instead of having a backend.csvRecorder have a backend.performanceWatcher that then exposes function do each type of reporting. And so if not present, no reporting happens.

If it helps you can also have a nilWatcher does is a noop, to avoid the if x != nil on every step

consensus/istanbul/core/commit.go Outdated Show resolved Hide resolved
consensus/istanbul/core/prepare.go Outdated Show resolved Hide resolved
consensus/istanbul/core/preprepare.go Outdated Show resolved Hide resolved
metrics/csv_metrics.go Outdated Show resolved Hide resolved
}

// NewStdoutCSVRecorder creates a CSV recorder that writes to the supplied writer.
// The writer is aborbed into the recorder and the user is responsible for calling
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

absorbed has a typo, but also I'm not sure if absorbed is the best word here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

switch to "retained" and making it clear that it current only accepts a WriterCloser

Copy link
Contributor

@oneeman oneeman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few nits, but otherwise this looks ready.

@trianglesphere
Copy link
Contributor Author

Something i wonder here is that we add a couple of metrics that seems to be designed for the CSV Recorder. if that's the case, we can create & report them ONLY if the csv file was given.

All of the metrics are actually used when load testing/re-purposed from existing metrics. There's actually a few different use cases pulled together in this PR: loadtesting w/ metrics, loadtesting with csvs, better mainnet metrics. I wouldn't really classify any of these metrics as expensive metrics either.

@mergify mergify bot merged commit ede6fa2 into master Apr 21, 2021
@mergify mergify bot deleted the trianglesphere/mainline_metrics branch April 21, 2021 13:34
oneeman pushed a commit that referenced this pull request May 4, 2021
### Description

Adds better metrics for load testing.
Also adds a command line option `--metrics.loadtestcsvfile` to specify a file to write out information about the block production and consensus cycle in a CSV format. This was previously on a separate branch that was cherry-picked onto the branch that we wanted to loadtest.

### Tested

Local load test.

### Related issues

- Fixes #1511
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add loadtest reporting flag to geth
3 participants