Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to display intermediate result while the test is running #1849

Conversation

andriisoldatenko
Copy link
Contributor

@andriisoldatenko andriisoldatenko commented Feb 9, 2021

Closes #1315

https://terminalizer.com/view/fc96dd924634

Example of usage.

$ go run main.go run scripts.js

          /\      |‾‾| /‾‾/   /‾‾/
     /\  /  \     |  |/  /   /  /
    /  \/    \    |     (   /   ‾‾\
   /          \   |  |\  \ |  (‾)  |
  / __________ \  |__| \__\ \_____/ .io

  execution: local
     script: scripts.js
     output: -

  scenarios: (100.00%) 1 scenario, 1 max VUs, 10m30s max duration (incl. graceful stop):
           * default: 1 iterations for each of 1 VUs (maxDuration: 10m0s, gracefulStop: 30s)

Press R to get summary or esc to run Ctrl-C/Ctrl-Z

     http_req_blocked...........: avg=171.47ms min=171.47ms med=171.47ms max=171.47ms p(90)=171.47ms p(95)=171.47ms
     http_req_connecting........: avg=113.95ms min=113.95ms med=113.95ms max=113.95ms p(90)=113.95ms p(95)=113.95ms
     http_req_duration..........: avg=114.69ms min=114.69ms med=114.69ms max=114.69ms p(90)=114.69ms p(95)=114.69ms
     http_req_receiving.........: avg=1.97ms   min=1.97ms   med=1.97ms   max=1.97ms   p(90)=1.97ms   p(95)=1.97ms
     http_req_sending...........: avg=103µs    min=103µs    med=103µs    max=103µs    p(90)=103µs    p(95)=103µs
     http_req_tls_handshaking...: avg=0s       min=0s       med=0s       max=0s       p(90)=0s       p(95)=0s
     http_req_waiting...........: avg=112.62ms min=112.62ms med=112.62ms max=112.62ms p(90)=112.62ms p(95)=112.62ms
     http_reqs..................: 1 0.730317/s
     vus........................: 1 min=1 max=1
     vus_max....................: 1 min=1 max=1


     http_req_blocked...........: avg=171.47ms min=171.47ms med=171.47ms max=171.47ms p(90)=171.47ms p(95)=171.47ms
     http_req_connecting........: avg=113.95ms min=113.95ms med=113.95ms max=113.95ms p(90)=113.95ms p(95)=113.95ms
     http_req_duration..........: avg=114.69ms min=114.69ms med=114.69ms max=114.69ms p(90)=114.69ms p(95)=114.69ms
     http_req_receiving.........: avg=1.97ms   min=1.97ms   med=1.97ms   max=1.97ms   p(90)=1.97ms   p(95)=1.97ms
     http_req_sending...........: avg=103µs    min=103µs    med=103µs    max=103µs    p(90)=103µs    p(95)=103µs
     http_req_tls_handshaking...: avg=0s       min=0s       med=0s       max=0s       p(90)=0s       p(95)=0s
     http_req_waiting...........: avg=112.62ms min=112.62ms med=112.62ms max=112.62ms p(90)=112.62ms p(95)=112.62ms
     http_reqs..................: 1 0.461023/s
     vus........................: 1 min=1 max=1
     vus_max....................: 1 min=1 max=1

^C
running (00m04.2s), 0/1 VUs, 0 complete and 1 interrupted iterations
default ✗ [--------------------------------------] 1 VUs  00m04.2s/10m0s  0/1 iters, 1 per VU
WARN[0004] No script iterations finished, consider making the test duration longer

     data_received..............: 11 kB 2.6 kB/s
     data_sent..................: 76 B  18 B/s
     http_req_blocked...........: avg=171.47ms min=171.47ms med=171.47ms max=171.47ms p(90)=171.47ms p(95)=171.47ms
     http_req_connecting........: avg=113.95ms min=113.95ms med=113.95ms max=113.95ms p(90)=113.95ms p(95)=113.95ms
     http_req_duration..........: avg=114.69ms min=114.69ms med=114.69ms max=114.69ms p(90)=114.69ms p(95)=114.69ms
     http_req_receiving.........: avg=1.97ms   min=1.97ms   med=1.97ms   max=1.97ms   p(90)=1.97ms   p(95)=1.97ms
     http_req_sending...........: avg=103µs    min=103µs    med=103µs    max=103µs    p(90)=103µs    p(95)=103µs
     http_req_tls_handshaking...: avg=0s       min=0s       med=0s       max=0s       p(90)=0s       p(95)=0s
     http_req_waiting...........: avg=112.62ms min=112.62ms med=112.62ms max=112.62ms p(90)=112.62ms p(95)=112.62ms
     http_reqs..................: 1     0.239332/s
     vus........................: 1     min=1 max=1
     vus_max....................: 1     min=1 max=1

@na-- na-- requested review from na--, mstoykov and imiric February 9, 2021 12:18
@na--
Copy link
Member

na-- commented Feb 9, 2021

Thanks, we'll take a look soon! Before that, can you forego the update of the golang.org/x/sys dependency, since it's just adding noise to this PR? We can update it in a separate PR later, together with other dependencies.

@andriisoldatenko andriisoldatenko force-pushed the implement-get-summary-when-press-r branch from 9fd84f2 to 81bed3b Compare February 9, 2021 13:05
@andriisoldatenko
Copy link
Contributor Author

Thanks, we'll take a look soon! Before that, can you forego the update of the golang.org/x/sys dependency, since it's just adding noise to this PR? We can update it in a separate PR later, together with other dependencies.

done.

@andriisoldatenko andriisoldatenko force-pushed the implement-get-summary-when-press-r branch 2 times, most recently from 35d979a to f6e81ca Compare February 9, 2021 13:51
@codecov-io
Copy link

codecov-io commented Feb 9, 2021

Codecov Report

Merging #1849 (68f91c3) into master (1181117) will decrease coverage by 0.10%.
The diff coverage is 3.57%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1849      +/-   ##
==========================================
- Coverage   71.57%   71.46%   -0.11%     
==========================================
  Files         179      179              
  Lines       13930    13957      +27     
==========================================
+ Hits         9971     9975       +4     
- Misses       3327     3350      +23     
  Partials      632      632              
Flag Coverage Δ
ubuntu 71.44% <3.57%> (-0.11%) ⬇️
windows 71.09% <3.57%> (-0.09%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
cmd/run.go 10.76% <3.57%> (-1.25%) ⬇️
js/runner.go 81.44% <0.00%> (+0.57%) ⬆️
lib/testutils/minirunner/minirunner.go 80.43% <0.00%> (+4.34%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1181117...e1a6e67. Read the comment docs.

Copy link
Member

@na-- na-- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making the PR! While it technically satisfies most of the requirements of the issue, there are some bugs, as well as several things that aren't done optimally - both from an UX and from an implementation perspective. I've noted some of them as inline comments and I'll explain the rest below.

First, there seem to be some data races when we run k6 with go run -race:

Write at 0x00c000d94000 by goroutine 26:
  github.com/loadimpact/k6/cmd.showProgress.func2()
      github.com/loadimpact/k6/cmd/ui.go:296 +0x1ef
  github.com/loadimpact/k6/cmd.showProgress()
      github.com/loadimpact/k6/cmd/ui.go:363 +0x3c5
  github.com/loadimpact/k6/cmd.getRunCmd.func1.1()
      github.com/loadimpact/k6/cmd/run.go:191 +0x2cd
      
Previous read at 0x00c000d94000 by goroutine 32:
  github.com/loadimpact/k6/cmd.showProgress.func1()
      github.com/loadimpact/k6/cmd/ui.go:288 +0x90
  github.com/loadimpact/k6/cmd.(*consoleWriter).Write()
      github.com/loadimpact/k6/cmd/ui.go:73 +0x24a
  bytes.(*Buffer).WriteTo()
      bytes/buffer.go:239 +0x1e7
  io.copyBuffer()
      io/io.go:391 +0x482
  io.Copy()
      io/io.go:368 +0x464
  github.com/loadimpact/k6/cmd.handleSummaryResult()
      github.com/loadimpact/k6/cmd/run.go:453 +0x474
  github.com/loadimpact/k6/cmd.printSummaryResults()
      github.com/loadimpact/k6/cmd/run.go:473 +0x2e4
  github.com/loadimpact/k6/cmd.readInput()
      github.com/loadimpact/k6/cmd/run.go:502 +0x225
  github.com/loadimpact/k6/cmd.getRunCmd.func1.4()
      github.com/loadimpact/k6/cmd/run.go:258 +0x8a

and

Read at 0x00c00270a608 by goroutine 32:
  github.com/loadimpact/k6/stats.(*RateSink).Calc()
      <autogenerated>:1 +0x4f
  github.com/loadimpact/k6/js.metricValueGetter.func1()
      github.com/loadimpact/k6/js/summary.go:129 +0x72
  github.com/loadimpact/k6/js.summarizeMetricsToObject()
      github.com/loadimpact/k6/js/summary.go:176 +0x638
  github.com/loadimpact/k6/js.(*Runner).HandleSummary()
      github.com/loadimpact/k6/js/runner.go:312 +0x117
  github.com/loadimpact/k6/cmd.printSummaryResults()
      github.com/loadimpact/k6/cmd/run.go:467 +0x1ab
  github.com/loadimpact/k6/cmd.readInput()
      github.com/loadimpact/k6/cmd/run.go:502 +0x225
  github.com/loadimpact/k6/cmd.getRunCmd.func1.4()
      github.com/loadimpact/k6/cmd/run.go:258 +0x8a
      
Previous write at 0x00c00270a608 by goroutine 48:
  github.com/loadimpact/k6/stats.(*RateSink).Add()
      github.com/loadimpact/k6/stats/sink.go:168 +0x64
  github.com/loadimpact/k6/core.(*Engine).processSamplesForMetrics()
      github.com/loadimpact/k6/core/engine.go:413 +0xc1b
  github.com/loadimpact/k6/core.(*Engine).processSamples()
      github.com/loadimpact/k6/core/engine.go:443 +0x10b
  github.com/loadimpact/k6/core.(*Engine).processMetrics.func2()
      github.com/loadimpact/k6/core/engine.go:275 +0x38e
  github.com/loadimpact/k6/core.(*Engine).processMetrics()
      github.com/loadimpact/k6/core/engine.go:285 +0x333

And this is when the handleSummary() call is synchronous with detecting the key presses, which it probably shouldn't be (see inline comment on the topic). I also realize that testing UI input is very difficult, but some tests of the actual logic here (without any keyboard) would have probably caught at least the second data race in CI. So please add some... 😅

Finally, I am not sure about the https://github.com/eiannone/keyboard dependency... It has some scary open issues, and me and @imiric actually managed to hit a strange bug we think was caused by the dependency. We haven't been able to consistently reproduce it, but we somehow managed to enter a state where hitting Esc would sometimes disable special input entirely, ignoring even Ctrl-C / Ctrl-Z! This messed up the terminal even after k6 exits - Ctrl-C/Ctrl-Z were ignored and are printed instead, and the only fix was running reset in the terminal... 😕 We independently hit the issue while testing your code, him twice and me once, but we can't reproduce it consistently... 🤷‍♂️

Also, in general, I am not sure such a dependency is needed. See this SO answer for example. You can use golang.org/x/term.MakeRaw() to switch the terminal to raw mode (and back to cooked mode at the end of k6) and just read a single character from stdin. There are other ways to check what stdin actually is as well: https://stackoverflow.com/a/26567513/9629802

We already use a precursor to this stdlib-adjacent library, golang.org/x/crypto/ssh/terminal for k6 login forms and progressbars. I think using the same golang.org/x dependency for reading characters from stdin would have a much lower possibility of issues long-term. And if for some reason that library doesn't work on some OS, there seem to be a lot of drop-in replacements that enable setting terminals to raw mode (e.g. https://github.com/containerd/console, https://github.com/pkg/term), whereas this keyboard seems to be its own special thing that doesn't work very well...

cmd/run.go Outdated
@@ -457,3 +457,55 @@ func handleSummaryResult(fs afero.Fs, stdOut, stdErr io.Writer, result map[strin

return consolidateErrorMessage(errs, "Could not save some summary information:")
}

func printSummaryResults(globalCtx context.Context,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a potentially a wrong name - handleSummary() might not print anything to stdout or stderr, it might just write files.

Also, I realize that stdout and stderr are unfortunately global variables. We want to clean that up eventually, and the way to do that is to use dependency injection even with them, i.e. pass them as parameters. See how handleSummaryResult() expects them as parameters, despite also having access to the same global variables directly. As a side-benefit, this also makes it much more testable...

cmd/run.go Outdated
defer func() {
_ = keyboard.Close()
}()
fmt.Println("Press R to get summary or esc to run Ctrl-C/Ctrl-Z")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is suboptimal in two different ways. First, instead of directly printing to stdout with fmt.Println(), it would be much better to use fmt.Fprintf(stdout, "Press R..."), ideally with stdout being injected as parameter.

Second, and more importantly, we wouldn't want to enter a separate state to handle R specifically and ignore Ctrl+C. If you must capture all keyboard input exclusively here, you can simply capture Ctrl+C as well and send an event over sigC, so the Ctrl+C handling occurs normally. And if we don't have a separate mode for the summary, we don't need this message at all.

cmd/run.go Outdated
return err
}
defer func() {
_ = keyboard.Close()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An error here probably deserves at least logging it, you already have the logger anyway.

cmd/run.go Outdated
Comment on lines 495 to 497
forLoop:
for {
event := <-keysEvents
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this code and not for event := range keysEvents {}?

@@ -251,6 +252,15 @@ a commandline interface for interacting with it.`,
os.Exit(externalAbortErrorCode)
}()

// start reading user input
if !runtimeOptions.NoSummary.Bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking if stdin is a terminal here is probably a good idea. I don't know how the github.com/eiannone/keyboard dependency would react if it isn't, but at best, we're going to get a useless error in the logs from logger.Error(riErr) below, when it fails on most CI systems to listen to key presses.

cmd/run.go Outdated
return event.Err
}
if event.Rune == 'R' {
printSummaryResults(globalCtx, runner, engine, executionState, log)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can potentially take quite a long time, e.g. if you have a ton of metrics data or you make HTTP requests in handleSummary(). With the current code, k6 would be stuck waiting for it to complete before it processes any other input, including Esc or Ctrl+C... So, this needs to happen asynchronously.

To easily replicate the issue, try putting sleep(5) in handleSummary() and see how this code behaves.

@andriisoldatenko
Copy link
Contributor Author

@na-- @mstoykov @imiric

few notes:

still there are few small bugs, i'm trying to fix right now. in OSX it's still break terminal size... so
img

Please let me know wdyt regarding new approach?

@andriisoldatenko
Copy link
Contributor Author

on separate note:
assuming that we want unbuffered input (without having to hit enter)?

@na--
Copy link
Member

na-- commented Feb 15, 2021

on separate note:
assuming that we want unbuffered input (without having to hit enter)?

I haven't looked at the new code, but yes, we'd like this feature to not require hitting enter after whatever trigger we've decided is written (R in this case). It's looking like it might prove to be too difficult to implement reasonably, so we could re-evaluate, we'll try to take a look at the code and play with it ourselves tomorrow.

@andriisoldatenko
Copy link
Contributor Author

on separate note:
assuming that we want unbuffered input (without having to hit enter)?

I haven't looked at the new code, but yes, we'd like this feature to not require hitting enter after whatever trigger we've decided is written (R in this case). It's looking like it might prove to be too difficult to implement reasonably, so we could re-evaluate, we'll try to take a look at the code and play with it ourselves tomorrow.

Few links might help when you try on your side:

@imiric
Copy link
Contributor

imiric commented Feb 16, 2021

Wow, so this turned out to be a can of worms... 😅 Sorry Andrii!

There's a couple of issues with the current implementation:

  • Since it's switching the terminal to raw mode on start any output with newlines will be messed up, as you've found out. For me on xterm the running progress bar looks like:

    running (01.7s), 1/5 VUs, 9 complete and 0 interrupted iterations
                                                                     default   [==>-----------------------------------] 1/5 VUs  01.7s/20.0s
    
  • The output after pressing R also looks like the bottom half of your screenshot to me. Pressing Ctrl+C runs the defer in readInput which restores it to cooked mode and the final summary looks fine in that case, but if I let the test exit normally then the final summary is borked, which is likely a race condition and the restore should be done before the summary is printed normally.

    We could address this R issue by switching to cooked mode before printing the summary, and then back again to raw mode to interpret the keystroke, but I don't see how the progress bar can be fixed.

Given that k6 constantly writes to stdout, we can't really switch to raw mode for this feature.

So how about this: interpret Enter in the default cooked mode and avoid switching to raw mode entirely. From my tests this gets interpreted as LF / ASCII 10 and works fine, with no output issues, and we don't need any external library or even bufio for this. I'm not sure how cross-platform this is and whether it's the same LF value on e.g. Windows terminals, but we can test it.

The obvious drawback of this is that we can't listen for any other key presses, but I'd argue that we really shouldn't anyway. Enter is not normally useful anyway, and is a friendlier binding than Shift+r to me.

WDYT?

@mstoykov
Copy link
Collaborator

mstoykov commented Feb 16, 2021

Thanks for the PR @AndriiChuzhynov 🎉

quick thoughts before actually looking at the code:

  • This not hitting enter turns out to too big of a pain IMO and given the problems we usually have with windows, IMO we will need to use something like termbox or tcell to get it to work. This is ... way too complicated, but also gives us more stuff like graphs, but this is definitely WAY outside the scope of this issue. So my vote is to need to hit enter at least for now and fix it later.
  • if you run the code with -race you will see some races that need to be fixed.
  • the races are printed ... weirdly even with 227eaa9
  • running with -v is botching the output as well, probably for the same reason as above
  • even without races and with 227eaa9 or f376629 my terminal sometimes(most actually) still gets botched even if I just leave it without doing anything and I am on Linux with alacritty and tmux.
  • the progressbar is of center and is "moving" leaving 'd'-s from the 'default'
    image

I have the feeling all of those visual artifacts are due to one problem, and fixing them IMO is more important than whether we will be hitting enter or not. I at least don't get my terminal botched if I ^c so 🤷

additionally (although this is likely unfixable for now) if the scripts are read from the stdin as in cat script.js | k6 run - the feature doesn't work, which is expected, just something to be kept in mind. I am not certain something like termbox will help that either way 🤷

edit: I am fine with just hitting enter as well as @imiric proposes.

@andriisoldatenko
Copy link
Contributor Author

@imiric @mstoykov thanks for review and testing! So i'll try to address issues and return with new commit.

@andriisoldatenko
Copy link
Contributor Author

andriisoldatenko commented Feb 16, 2021

@na-- @imiric @mstoykov
if fixed 1 of 2 RC by adding locks around HandleSummary` type:

engine.MetricsLock.Lock()
	summaryResult, err := runner.HandleSummary(globalCtx, &lib.Summary{
		Metrics:         engine.Metrics,
		RootGroup:       engine.ExecutionScheduler.GetRunner().GetDefaultGroup(),
		TestRunDuration: executionState.GetCurrentTestRunDuration(),
	})
	engine.MetricsLock.Unlock()

but than... i get stucked on [io.Copy](https://github.com/loadimpact/k6/blob/a0f18e713ddbb6a201d8628ea98ec07d00243a7c/cmd/run.go#L469) and handleSummaryResult

any ideas where to check to fix this RC:

but concerned how to fix showProgress one:

Write at 0x00c0015ec000 by goroutine 45:
  github.com/loadimpact/k6/cmd.showProgress.func2()
      /Users/andrii/workspace/src/github.com/loadimpact/k6/cmd/ui.go:296 +0x1ef
  github.com/loadimpact/k6/cmd.showProgress()
      /Users/andrii/workspace/src/github.com/loadimpact/k6/cmd/ui.go:363 +0x3c5
  github.com/loadimpact/k6/cmd.getRunCmd.func1.1()
      /Users/andrii/workspace/src/github.com/loadimpact/k6/cmd/run.go:191 +0x2cd

Previous read at 0x00c0015ec000 by goroutine 46:
  github.com/loadimpact/k6/cmd.showProgress.func1()
      /Users/andrii/workspace/src/github.com/loadimpact/k6/cmd/ui.go:288 +0x90
  github.com/loadimpact/k6/cmd.(*consoleWriter).Write()
      /Users/andrii/workspace/src/github.com/loadimpact/k6/cmd/ui.go:73 +0x24a
  github.com/sirupsen/logrus.(*Entry).write()
      /Users/andrii/workspace/src/github.com/loadimpact/k6/vendor/github.com/sirupsen/logrus/entry.go:280 +0x2d4
  github.com/sirupsen/logrus.Entry.log()
      /Users/andrii/workspace/src/github.com/loadimpact/k6/vendor/github.com/sirupsen/logrus/entry.go:251 +0x404
  github.com/sirupsen/logrus.(*Entry).Log()
      /Users/andrii/workspace/src/github.com/loadimpact/k6/vendor/github.com/sirupsen/logrus/entry.go:287 +0x167
  github.com/sirupsen/logrus.(*Entry).Warn()
      /Users/andrii/workspace/src/github.com/loadimpact/k6/vendor/github.com/sirupsen/logrus/entry.go:308 +0x344
  github.com/loadimpact/k6/cmd.getRunCmd.func1.2()
      /Users/andrii/workspace/src/github.com/loadimpact/k6/cmd/run.go:228 +0x2cb

Goroutine 45 (running) created at:
  github.com/loadimpact/k6/cmd.getRunCmd.func1()
      /Users/andrii/workspace/src/github.com/loadimpact/k6/cmd/run.go:186 +0x10aa
  github.com/dop251/goja.(*Runtime).functionproto_call()
      /Users/andrii/workspace/src/github.com/loadimpact/k6/vendor/github.com/dop251/goja/builtin_function.go:109 +0x1a5
  github.com/dop251/goja.(*Runtime).functionproto_call-fm()
      /Users/andrii/workspace/src/github.com/loadimpact/k6/vendor/github.com/dop251/goja/builtin_function.go:102 +0x8d
  github.com/dop251/goja.(*vm)._nativeCall()
      /Users/andrii/workspace/src/github.com/loadimpact/k6/vendor/github.com/dop251/goja/vm.go:1817 +0x7fe
  github.com/dop251/goja.call.exec()
      /Users/andrii/workspace/src/github.com/loadimpact/k6/vendor/github.com/dop251/goja/vm.go:1789 +0x201a
  github.com/dop251/goja.(*call).exec()
      <autogenerated>:1 +0x64
  github.com/dop251/goja.(*vm).run()
      /Users/andrii/workspace/src/github.com/loadimpact/k6/vendor/github.com/dop251/goja/vm.go:307 +0x15e
 

looks like we have global variable to lock outMutex 🤔

also i found i can reproduce this bug from time to time in master branch ^^ using 0dd04a3

issue #1860 has been created.

Copy link
Member

@na-- na-- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing the data race and reporting the one we have on master! I've tested your latest code and it works, though it's apparent we have to go back to the drawing board with the feature itself... 😞 The UX of the current approach is not great, but the complexity and fragility of the alternative is also bad, even if we manage to get it working nicely without a full-blown ncurses-like CLI UI... I've added a note in #1315 to that effect and tag the issue as evaluation needed.

The locking you've employed to get rid of the data race also deserves some further consideration and improvements. I'll try to explain, even though there's no point in doing the work for now, given that it doesn't seem likely we'll merge the feature soon because of the other problems with the whole feature.

You are currently locking the whole Engine from processing new metrics and thresholds for the whole duration of handleSummary(), however handleSummary() can potentially take quite a long time to execute if it makes HTTP requests or something like that. If it takes a long time, various VUs can continue generating new metric samples and sending them to the buffered metric Samples channel in the Engine. I think it has a size of 1000 by default, and if that ever overflows, VUs will be blocked, which would affect the load test execution negatively.

A better locking approach instead would be to only lock the Engine.MetricsLock only while we export the data we pass to handleSummary, but not during the actual execution of the callback. So, basically, wrap this line: https://github.com/loadimpact/k6/blob/6bebdd632eceadf6f8b50d4c47814b3a1688e976/js/runner.go#L312

Or do it inside of the actual summarizeMetricsToObject() function: https://github.com/loadimpact/k6/blob/6bebdd632eceadf6f8b50d4c47814b3a1688e976/js/summary.go#L160

But as I said, no need to fix this or spend any more time on improvements until we're sure what we want to do here. And despite the fact we're not going to merge the PR for now, the whole effort has been very useful in illuminating all of the tricky issues here, so thank you very much for that and sorry again for the unexpected complexity!

@na-- na-- added the evaluation needed proposal needs to be validated or tested before fully implementing it in k6 label Feb 17, 2021
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@na--
Copy link
Member

na-- commented Mar 31, 2022

I'll close this so it doesn't clutter the list with open PRs. We are not going to merge it as it is, for the reasons explained in #1849 (review), and any lessons learned can still be gleaned from a closed PR, given that it's linked in #1315 (which I'll keep open).

@na-- na-- closed this Mar 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
evaluation needed proposal needs to be validated or tested before fully implementing it in k6
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ability to display intermediate result while the test is running
6 participants