workload: extend the command-line interface #37929

Merged
merged 1 commit into from
Aug 14, 2019

knz
Contributor

@knz knz commented May 30, 2019

Release note (cli change): The cockroach workload command now
supports additional command-line parameters to customize the output,
in order to facilitate the integration with 3rd party testing tools:

  • for tools that wish to observe the metrics more frequently than
    every second, a new flag --display-every is now supported, which
    can be used to specify the period between metric reports.
    This applies to both the JSON and textual output.

  • for tools that require a different output format than the default,
    a new --display-format argument is supported. For now
    only the formats "simple" (original output format) and
    "incremental-json" (RFC3339 timestamps, no summary row)
    are supported.

@knz knz requested a review from danhhz May 30, 2019 09:46
@knz knz changed the title workload: extand the command-line interface workload: extend the command-line interface May 30, 2019
@knz knz force-pushed the 20190530-workload branch 2 times, most recently from 5f94bc2 to c21cce9 Compare May 30, 2019 11:43
Contributor

@danhhz danhhz left a comment

Can you talk a little more about the motivation behind this? The --skip-final-report makes me wonder if you're planning on parsing these output lines. If that's the case, I'd much prefer that we introduce a --json option

Also add a release note please. workload run is part of our public api now.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @danhhz and @knz)


pkg/workload/cli/run.go, line 61 at r1 (raw file):

var pprofport = initFlags.Int("pprofport", 33333, "Port for pprof endpoint.")

var disp = runFlags.Duration("display-every", time.Second, "How much time between every one-line activity reports.")

nit: Avoid abbreviations please: s/disp/displayEvery/

Contributor Author

@knz knz left a comment

Can you talk a little more about the motivation behind this?

I added more details in the (new) release note.

The --skip-final-report makes me wonder if you're planning on parsing these output lines. If that's the case, I'd much prefer that we introduce a --json option

The change applies both to the textual output and the JSON output, so I am not sure what you are asking?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/workload/cli/run.go, line 61 at r1 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

nit: Avoid abbreviations please: s/disp/displayEvery/

Done.

Contributor

@danhhz danhhz left a comment

I'm still confused about the tooling that you're building which needs these changes. Why does the textual output need to be absolute time? Why can't the tooling ignore the summary line? Flags add complexity to the API and, especially since this is part of the public surface area of the cockroach CLI, I'd like to be sure the complexity is warranted.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@tbg
Member

tbg commented May 30, 2019

My understanding is that you need this for your MVP for scenario-based testing, but I thought that MVP was not going to be checked in, which means that we also shouldn't check in aux changes required by it. I share Dan's concern that these flags are public complexity that I'm not sure is justified (at this moment).

Contributor Author

@knz knz left a comment

Then I'll just leave this PR open until further clarity is delivered.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@knz knz force-pushed the 20190530-workload branch 2 times, most recently from 1f263a4 to d35e070 Compare July 29, 2019 18:45
@knz
Contributor Author

knz commented Jul 29, 2019

@danhhz as discussed I have reworked the PR to introduce the notion of "multiple output formats". PTAL.

Contributor

@danhhz danhhz left a comment

I left some format bikeshedding, but everything else looks great!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz)


pkg/workload/cli/format.go, line 22 at r2 (raw file):

// outputFormat is the interface used to output results incrementally
// during a workload run.
type outputFormat interface {

👍


pkg/workload/cli/format.go, line 121 at r2 (raw file):

func (f *rawFormatter) outputTick(startElapsed time.Duration, t histogram.Tick) {
	fmt.Printf("%s %d %.2f %.2f %.2f %.2f %.2f %.2f %s\n",

I understand from our conversation offline that you object to JSON because it's not easily parsed by regex. I am uncomfortable with anything that does not have an easy migration story if we ever need to change what's included here, for example adding p75 or removing a field. For this output, we could append new columns at the end (and deprecating means leaving a placeholder indefinitely), but I wonder if there is a format that meets both our criteria...

How about something like csv with key=value entries? That solves any future delimiter escaping problems we might have, has a straightforward-ish migration story, and is parseable by regex (modulo byzantine examples). I'm also open to other ideas, this one is pretty off the cuff

time=2019-01-01-<...>,errs=0,...p50=1.2,p75=3.4,...,name=foobar
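
To illustrate how a consumer might read a row in this proposed shape, here is a minimal Go sketch. The function `parseKV` is hypothetical, and it assumes that values never contain a comma, which is exactly the kind of "byzantine example" the proposal sets aside. Unknown keys are simply carried along in the map, so adding a field such as p75 later would not break an existing consumer.

```go
package main

import (
	"fmt"
	"strings"
)

// parseKV splits a row such as "errs=0,p50=1.2" into a key/value map.
// Assumption: values contain no commas, so no escaping is handled here.
func parseKV(line string) (map[string]string, error) {
	fields := make(map[string]string)
	for _, entry := range strings.Split(line, ",") {
		// Cut at the first "=" only, so a value may itself contain "=".
		key, value, ok := strings.Cut(entry, "=")
		if !ok || key == "" {
			return nil, fmt.Errorf("malformed entry %q", entry)
		}
		fields[key] = value
	}
	return fields, nil
}

func main() {
	fields, err := parseKV("errs=0,p50=1.2,p75=3.4,name=foobar")
	if err != nil {
		panic(err)
	}
	fmt.Println(fields["p75"], fields["name"])
}
```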


pkg/workload/cli/format_test.go, line 19 at r2 (raw file):

)

func Example_text_formatter() {

nice use of example tests!!!

Contributor Author

@knz knz left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @danhhz)


pkg/workload/cli/format.go, line 121 at r2 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

I understand from our conversation offline that you object to JSON because it's not easily parsed by regex. I am uncomfortable with anything that does not have an easy migration story if we ever need to change what's included here, for example adding p75 or removing a field. For this output, we could append new columns at the end (and deprecating means leaving a placeholder indefinitely), but I wonder if there is a format that meets both our criteria...

How about something like csv with key=value entries? That solves any future delimiter escaping problems we might have, has a straightforward-ish migration story, and is parseable by regex (modulo byzantine examples). I'm also open to other ideas, this one is pretty off the cuff

time=2019-01-01-<...>,errs=0,...p50=1.2,p75=3.4,...,name=foobar

I don't object. I'll try it out.

Contributor Author

@knz knz left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @danhhz)


pkg/workload/cli/format.go, line 121 at r2 (raw file):

Previously, knz (kena) wrote…

I don't object. I'll try it out.

I don't know if you realized this, but your proposal is also valid JSON (modulo surrounding {} and replacing "=" by ":").

@knz
Contributor Author

knz commented Jul 29, 2019

ok, PTAL

Contributor

@danhhz danhhz left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @danhhz and @knz)


pkg/workload/cli/format.go, line 121 at r3 (raw file):

func (f *jsonFormatter) outputTick(startElapsed time.Duration, t histogram.Tick) {
	// Note: we use fmt.Printf here instead of json.Marshal to ensure

Hmm, I don't think what you have below is valid json. Don't the field names have to be quoted? (t.Name also will not be properly escaped, but I care about that less). Why do the extra decimals matter in something designed for consumption by machines?

Contributor Author

@knz knz left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @danhhz)


pkg/workload/cli/format.go, line 121 at r3 (raw file):

Previously, danhhz (Daniel Harrison) wrote…

Hmm, I don't think what you have below is valid json. Don't the field names have to be quoted? (t.Name also will not be properly escaped, but I care about that less). Why do the extra decimals matter in something designed for consumption by machines?

No, the field names don't have to be quoted, IIRC.
The extra decimals matter because they add a lot of noise (60% more data to parse, in fact) and they kill the reproducibility of unit tests.

@knz
Contributor Author

knz commented Jul 29, 2019

hmm it appears that json does want to quote the property names. It's Javascript and python dicts that don't need that. I decidedly don't like json.
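
The quoting rule is easy to check with Go's standard `encoding/json` package. The sketch below is only a demonstration of the point under discussion; the helper `isValidJSON` and the sample field names are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// isValidJSON reports whether s is a syntactically valid JSON document.
func isValidJSON(s string) bool {
	return json.Valid([]byte(s))
}

func main() {
	unquoted := `{errs: 0, p50: 1.2}`     // bare property names
	quoted := `{"errs": 0, "p50": 1.2}` // property names as JSON strings

	fmt.Println(isValidJSON(unquoted)) // false
	fmt.Println(isValidJSON(quoted))   // true
}
```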

@knz
Contributor Author

knz commented Jul 29, 2019

ok I dropped the idea to use JSON altogether, coming back to some simpler format with explicit column names on each row. RFAL

Contributor

@danhhz danhhz left a comment

I think we should bite the bullet and use JSON here. It's not perfect, but it's a very common standard and the custom format isn't getting us enough to be worth the overhead of not using something standard (if any other tooling gets built on top of this, they'll have to duplicate your parser, whereas they'd very likely get JSON for free).

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @danhhz)

@knz
Contributor Author

knz commented Jul 30, 2019

Dan I'm going to push back on this

  1. the regexps for json are really unreadable
  2. the unit testing code for the new format is hard-ish to both write and properly understand/maintain if we let it do JSON with full precision floats

So unless you're volunteering to write the code, I'd like to stick to what I have and say "if we ever need JSON, we can add a new formatter later"

@danhhz
Contributor

danhhz commented Jul 30, 2019

I don't think adding a new format for each tool that wants to consume this is worthwhile complexity. I think you're overstating the complication of both the parsing regexes and the unit testing. Can you elaborate?

So unless you're volunteering to write the code

This is not an acceptable way to conduct code reviews. I am in good faith pushing back against what I feel is unnecessary complexity, both in parsing (json is easy, custom formats have lots of gotchas we'll have to work through one by one that json has already solved) and the idea of having a custom output format specifically for your tool. I am open to the idea that I'm missing something. I respect your ability as an engineer and if you're pushing back against what seems obvious to me, there's likely a good reason. But so far, the arguments you've made on this PR and in person haven't really held up

@knz
Contributor Author

knz commented Jul 30, 2019

Show me how to do the unit testing for the format when the float values are all over the place. The Example_ thing that you praised initially simply breaks down miserably with JSON.

@danhhz
Contributor

danhhz commented Jul 30, 2019

I haven't tried it, so I don't know where the non-determinism is coming from, but if it's truly unworkable then we can move away from the example tests and do whatever assertions we want in a normal unit test
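
One way such assertions could look is sketched below: instead of comparing output text byte for byte, decode each JSON row and compare floats within a tolerance. The `tick` struct, its field names, and the helper `checkTick` are hypothetical and are not taken from the PR. In a real test the same check would sit inside a `testing` function.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// tick is a hypothetical row layout; the field names are assumptions.
type tick struct {
	Errs int     `json:"errs"`
	P50  float64 `json:"p50"`
}

// checkTick decodes one JSON row and compares it against want, allowing
// the float field to differ by at most tol.
func checkTick(line string, want tick, tol float64) error {
	var got tick
	if err := json.Unmarshal([]byte(line), &got); err != nil {
		return fmt.Errorf("decoding %q: %w", line, err)
	}
	if got.Errs != want.Errs {
		return fmt.Errorf("errs: got %d, want %d", got.Errs, want.Errs)
	}
	if math.Abs(got.P50-want.P50) > tol {
		return fmt.Errorf("p50: got %v, want %v (tolerance %v)", got.P50, want.P50, tol)
	}
	return nil
}

func main() {
	// A full-precision float that a string comparison would trip over.
	line := `{"errs": 0, "p50": 1.2000000000000002}`
	if err := checkTick(line, tick{Errs: 0, P50: 1.2}, 1e-9); err != nil {
		fmt.Println("mismatch:", err)
		return
	}
	fmt.Println("row matches within tolerance")
}
```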

@knz
Contributor Author

knz commented Aug 14, 2019

Rebased and modified to use JSON as discussed. RFAL

@knz knz force-pushed the 20190530-workload branch 2 times, most recently from 34e5b37 to d24d8dd Compare August 14, 2019 09:16
Contributor

@danhhz danhhz left a comment

:lgtm: your commit message seems to have staled (still references "incremental-kv")

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @danhhz and @knz)


pkg/workload/cli/run.go, line 317 at r4 (raw file):

	case "simple":
		formatter = &textFormatter{}
	case "incremental-json":

what's incremental about it?

@knz
Contributor Author

knz commented Aug 14, 2019

what's incremental about it?

It does not print the final summary

@knz
Contributor Author

knz commented Aug 14, 2019

your commit message seems to have staled (still references "incremental-kv")

fixed, thanks

@knz
Contributor Author

knz commented Aug 14, 2019

TFYR!

bors r+

@knz
Contributor Author

knz commented Aug 14, 2019

typo

bors r-

@craig
Contributor

craig bot commented Aug 14, 2019

Canceled

Release note (cli change): The `cockroach workload` command now
supports additional command-line parameters to customize the output,
in order to facilitate the integration with 3rd party testing tools:

- for tools that wish to observe the metrics more frequently than
  every second, a new flag `--display-every` is now supported, which
  can be used to specify the period between metric reports.
  This applies to both the JSON and textual output.

- for tools that require a different output format than the default,
  a new `--display-format` argument is supported. For now
  only the formats "simple" (original output format) and
  "incremental-json" (RFC3339 timestamps, no summary row)
  are supported.
@knz
Contributor Author

knz commented Aug 14, 2019

bors r+

craig bot pushed a commit that referenced this pull request Aug 14, 2019
37929: workload: extend the command-line interface r=knz a=knz

Release note (cli change): The `cockroach workload` command now
supports additional command-line parameters to customize the output,
in order to facilitate the integration with 3rd party testing tools:

- for tools that wish to observe the metrics more frequently than
  every second, a new flag `--display-every` is now supported, which
  can be used to specify the period between metric reports.
  This applies to both the JSON and textual output.

- for tools that require a different output format than the default,
  a new `--display-format` argument is supported. For now
  only the formats "simple" (original output format) and
  "incremental-json" (RFC3339 timestamps, no summary row)
  are supported.

Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
@craig
Contributor

craig bot commented Aug 14, 2019

Build succeeded
