
ramping-arrival-rate: Avoid spawning a new goroutine for every iteration #1957

Merged

merged 2 commits into grafana:master on May 21, 2021

Conversation

codebien
Collaborator

Switched the ramping-arrival-rate executor from a goroutine per iteration to a goroutine per VU.
Refactoring the Run method is not part of this PR; it is scheduled for the original issue, so I tried to keep the impact on the code minimal.
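For context, here is a minimal sketch of the two patterns (illustrative names only, not the actual k6 code): before, the executor spawned a fresh goroutine for every single iteration; after, one long-lived goroutine per VU drains a shared channel of iteration requests.

package sketch

import "sync"

// vu stands in for an activated VU; runIteration is a placeholder for
// executing one full iteration on it (not the k6 API).
type vu struct{ id int }

func runIteration(v *vu) {}

// Before (sketch): every tick of the arrival-rate timer spawns a fresh
// goroutine, which borrows a VU, runs one iteration and returns the VU.
func goroutinePerIteration(ticks <-chan struct{}, vus chan *vu, wg *sync.WaitGroup) {
    for range ticks {
        wg.Add(1)
        go func() { // a new goroutine (and stack) for every iteration
            defer wg.Done()
            v := <-vus
            runIteration(v)
            vus <- v
        }()
    }
}

// After (sketch): one long-lived goroutine per activated VU, all draining the
// same channel of iteration requests.
func goroutinePerVU(ticks <-chan struct{}, vus []*vu, wg *sync.WaitGroup) {
    for _, v := range vus {
        wg.Add(1)
        go func(v *vu) {
            defer wg.Done()
            for range ticks {
                runIteration(v)
            }
        }(v)
    }
}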

However, the main performance issue was already resolved by #1955, as the following tests should show:

using this config:

exports.options = {
    scenarios: {
        test1: {
            executor: 'ramping-arrival-rate',
            preAllocatedVUs: 1000,
            stages: [
                { target: 100000, duration: '0s' },
                { target: 100000, duration: '15s' },
            ],
        },
    },
};

exports.default = function () {};

The result with v0.31.1, which is still affected by the problem:

$ docker run -v $PWD:/home/k6 -i loadimpact/k6:0.31.1 run - <script.js

iterations...........: 200591 13368.444337/s

and the result with master:

$ docker run -v $PWD:/home/k6 -i loadimpact/k6:master run - <script.js

iterations...........: 1499214 99925.899592/s

Closes #1944

@mstoykov
Collaborator

Hi @codebien, thanks for the PR 🎉!

I will try to review it later today or tomorrow, but could you post some benchstat comparisons between the pre- and post-change code?

@codecov-io

Codecov Report

Merging #1957 (55c8f78) into master (b84e0af) will increase coverage by 0.03%.
The diff coverage is 100.00%.

❗ Current head 55c8f78 differs from pull request most recent head 147b884. Consider uploading reports for the commit 147b884 to get more accurate results

@@            Coverage Diff             @@
##           master    #1957      +/-   ##
==========================================
+ Coverage   71.43%   71.47%   +0.03%     
==========================================
  Files         183      183              
  Lines       14247    14251       +4     
==========================================
+ Hits        10178    10186       +8     
+ Misses       3438     3434       -4     
  Partials      631      631              
Flag Coverage Δ
ubuntu 71.40% <100.00%> (+0.02%) ⬆️
windows 71.09% <100.00%> (+0.03%) ⬆️


Impacted Files Coverage Δ
lib/executor/ramping_arrival_rate.go 96.48% <100.00%> (+0.10%) ⬆️
js/modules/modules.go 80.00% <0.00%> (-1.82%) ⬇️
js/runner.go 81.44% <0.00%> (+0.57%) ⬆️
core/engine.go 86.03% <0.00%> (+0.90%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@codebien
Collaborator Author

> I will try to review it later today or tomorrow, but could you post some benchstat comparisons between the pre- and post-change code?

No improvements 😟




name                          master.bench     1944.bench       delta

iterations/s
RampingArrivalRate1000VUs-8   1.06M ± 4%       0.92M ± 2%       −13.61%  (p=0.029 n=4+4)

alloc/op
RampingArrivalRate1000VUs-8   3.99GB ± 4%      3.45GB ± 2%      −13.60%  (p=0.029 n=4+4)

allocs/op
RampingArrivalRate1000VUs-8   10.6M ± 4%       9.2M ± 2%        −13.57%  (p=0.029 n=4+4)
go test -bench=BenchmarkRampingArrivalRate -run=^$ -benchmem -v ./lib/executor
BenchmarkRampingArrivalRateRun/VUs10
BenchmarkRampingArrivalRateRun/VUs10-8         	       1	    952432 iterations/s	3779520344 B/op	13655539 allocs/op
BenchmarkRampingArrivalRateRun/VUs100
BenchmarkRampingArrivalRateRun/VUs100-8        	       1	   1173044 iterations/s	4412040832 B/op	11744001 allocs/op
BenchmarkRampingArrivalRateRun/VUs1000
BenchmarkRampingArrivalRateRun/VUs1000-8       	       1	    854069 iterations/s	3213358384 B/op	 8555731 allocs/op
BenchmarkRampingArrivalRateRun/VUs10000
BenchmarkRampingArrivalRateRun/VUs10000-8      	       1	    619091 iterations/s	2343777264 B/op	 6294554 allocs/op

The current code adds 2x MaxVUs goroutines; a future refactor could consider re-using the goroutines spawned from Activate.

github.com/loadimpact/k6/lib/executor.RampingArrivalRate.Run.func3.1 N=1000
github.com/loadimpact/k6/lib/testutils/minirunner.(*VU).Activate.func1 N=1000 

@codebien codebien marked this pull request as ready for review April 14, 2021 22:09
Member

@na-- na-- left a comment

I noted the problem in an inline comment, but to reiterate, this change breaks gracefulStop and the assurance that VUs are done executing JS when they are returned to the common pool, which may cause a data race if we try to run another scenario on the same VU (JS runtimes are single-threaded).

@codebien
Collaborator Author

Added a test to cover the graceful case, and polished the solution a bit to fix the graceful issue.

na--
na-- previously approved these changes Apr 26, 2021
Member

@na-- na-- left a comment

LGTM besides some minor naming nitpicks (for which I don't really have any better ideas 😅)

Comment on lines 471 to 474
// runningVUs controls the activeVUs
// executing the received requests for iterations.
type runningVUs struct {
Iterations chan struct{}
Member

I ❤️ the approach of splitting this into a separate small component, though I'm not 100% sure about the names... My ideas aren't necessarily any better... 😅 VUPool? ActiveVUPool?

In any case, Iterations probably should not be exported, and the comment looks a bit wonky.

Collaborator

I would prefer vuPool, as runningVUs is IMO "wrong": they are not all running. Maybe runnableVUs, but that seems more like activeVUs, and neither name communicates that the VUs might already be running or might be waiting to run something, so I think vuPool is better.

Contributor

activeVUPool sounds slightly better to me.

Also TryRunIteration or RunIterationMaybe instead of Run. Not sure about the Try/Maybe, but Iteration should be part of it.

@na-- na-- requested review from mstoykov and imiric April 26, 2021 08:38
Collaborator

@mstoykov mstoykov left a comment

LGTM in general, good job!

I have some nitpicks, one of which grew while I was writing it out 😅, but nothing blocking IMO... though we are in code freeze, so ;)


type runningVUs struct {
Iterations chan struct{}

listening sync.WaitGroup
Collaborator

Nitpick: I prefer sync.WaitGroups to be called wg unless there are many; in general the name should say (as in this case) why the wait group exists. Also, to me listening suggests something to do with networking rather than listening on a channel. I would generally call them waiting or blocked on a channel, but again (as with the pool above) they might not actually be waiting/listening/blocked on a channel at the moment but running instead, which, in combination with the running counter below, makes it seem like listening is a wait group of only the currently "listening" VUs, while running is the number of actually running VUs. And the latter is actually true: running is the number of currently running VUs.

Comment on lines 319 to 325
iterations := make(chan struct{})
iterators := runningVUs{
Iterations: iterations,
}

defer func() {
close(iterations)
Collaborator

I would prefer that you encapsulate the iterations channel as well: have a newRunningVUs (or, IMO, newVUPool) constructor that makes the iterations channel, and a Stop method that closes iterations and effectively stops the "running" of VUs.
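Putting the suggestions in this thread together (the names activeVUPool, AddVU, TryRunIteration and Close come from this discussion and the race report further below; the body is only a sketch, not the merged k6 code), the encapsulated component could look roughly like this:

package sketch

import "sync"

// activeVUPool wraps a set of worker goroutines, one per activated VU, that
// drain an unexported iterations channel.
type activeVUPool struct {
    iterations chan struct{}
    wg         sync.WaitGroup
}

// newActiveVUPool encapsulates the creation of the iterations channel, as
// suggested above.
func newActiveVUPool() *activeVUPool {
    return &activeVUPool{iterations: make(chan struct{})}
}

// TryRunIteration requests one iteration without blocking and reports whether
// an idle VU picked it up.
func (p *activeVUPool) TryRunIteration() bool {
    select {
    case p.iterations <- struct{}{}:
        return true
    default:
        return false
    }
}

// AddVU starts a long-lived worker goroutine for one activated VU.
// runIteration stands in for running a full iteration on that VU.
func (p *activeVUPool) AddVU(runIteration func()) {
    p.wg.Add(1)
    go func() {
        defer p.wg.Done()
        for range p.iterations {
            runIteration()
        }
    }()
}

// Close stops feeding iterations and waits until every worker has finished
// its current one.
func (p *activeVUPool) Close() {
    close(p.iterations)
    p.wg.Wait()
}

In this sketch the non-blocking send in TryRunIteration is what would let the caller notice that no idle VU was available for the current tick.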

Contributor

@imiric imiric left a comment

LGTM, I don't have much to add besides the naming issues already mentioned, and that none of the runningVUs fields/methods need to be exported.

Even though the main performance issue was addressed in #1955, from a quick profile run this does get rid of the excessive stack creation mentioned in #1386 (comment), so definitely an improvement. 👍


lib/executor/ramping_arrival_rate_test.go (outdated comment, resolved)
Create a dedicated goroutine to process iterations
when a new ActiveVU is created for the Ramping arrival rate.

This changes the previous behaviour, where
a new goroutine was created for every new iteration.

The previous solution was hurting the expected rate (iterations/s)
when the number of PreAllocatedVUs was around a thousand.
Added a benchmark to check the rate for a set of PreAllocatedVUs cases.

Closes #1944
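For illustration, a minimal sketch of what such a table benchmark could look like (assumed shape with a stand-in workload, not the actual k6 benchmark): one sub-benchmark per VU count, reporting iterations/s via b.ReportMetric so the output matches the format shown earlier.

package executor_test

import (
    "fmt"
    "sync"
    "sync/atomic"
    "testing"
    "time"
)

func BenchmarkRampingArrivalRateRun(b *testing.B) {
    for _, vus := range []int{10, 100, 1000} {
        vus := vus // capture for the sub-benchmark closure
        b.Run(fmt.Sprintf("VUs%d", vus), func(b *testing.B) {
            var iters int64
            jobs := make(chan struct{})
            var wg sync.WaitGroup
            // One worker goroutine per "VU", mirroring the new executor shape.
            for i := 0; i < vus; i++ {
                wg.Add(1)
                go func() {
                    defer wg.Done()
                    for range jobs {
                        atomic.AddInt64(&iters, 1) // a no-op "iteration"
                    }
                }()
            }
            start := time.Now()
            for i := 0; i < b.N; i++ {
                jobs <- struct{}{}
            }
            close(jobs)
            wg.Wait()
            b.ReportMetric(float64(iters)/time.Since(start).Seconds(), "iterations/s")
        })
    }
}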
@codecov-commenter

codecov-commenter commented Apr 27, 2021

Codecov Report

Merging #1957 (1747c8f) into master (f9a737e) will decrease coverage by 0.43%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master    #1957      +/-   ##
==========================================
- Coverage   71.84%   71.41%   -0.44%     
==========================================
  Files         182      177       -5     
  Lines       14237    14095     -142     
==========================================
- Hits        10229    10066     -163     
- Misses       3367     3387      +20     
- Partials      641      642       +1     
Flag Coverage Δ
ubuntu ?
windows 71.41% <100.00%> (-0.18%) ⬇️


Impacted Files Coverage Δ
lib/executor/ramping_arrival_rate.go 96.09% <100.00%> (-0.28%) ⬇️
lib/netext/tls.go 18.75% <0.00%> (-31.25%) ⬇️
loader/readsource.go 65.38% <0.00%> (-19.24%) ⬇️
log/loki.go 33.58% <0.00%> (-5.29%) ⬇️
lib/execution.go 89.32% <0.00%> (-2.92%) ⬇️
core/local/local.go 72.85% <0.00%> (-2.86%) ⬇️
cmd/ui.go 20.61% <0.00%> (-1.99%) ⬇️
js/compiler/compiler.go 66.07% <0.00%> (-1.17%) ⬇️
stats/stats.go 81.40% <0.00%> (-1.03%) ⬇️
js/summary.go 89.04% <0.00%> (-0.96%) ⬇️
... and 35 more


@codebien
Collaborator Author

Done and rebased.

Renamed the struct and the run method, unexported the iterations channel, and reduced the time for the graceful option in the related test.

@mstoykov mstoykov added this to the v0.33.0 milestone May 15, 2021
imiric
imiric previously approved these changes May 19, 2021
Contributor

@imiric imiric left a comment

LGTM, just noticed some search/replace typos :)

lib/executor/ramping_arrival_rate.go (two outdated comments, resolved)
@imiric imiric requested review from mstoykov and na-- May 19, 2021 15:18
Co-authored-by: Ivan Mirić <ivan@imiric.com>
@mstoykov mstoykov requested a review from imiric May 20, 2021 15:54
Comment on lines +322 to 328
vusPool.Close()
// Make sure all VUs aren't executing iterations anymore, for the cancel()
// below to deactivate them.
<-returnedVUs
cancel()
activeVUsWg.Wait()
}()
Collaborator

This should prevent the race from https://github.com/k6io/k6/pull/1957/checks?check_run_id=2631773502

==================
WARNING: DATA RACE
Read at 0x00c000062078 by goroutine 128:
  internal/race.Read()
      /opt/hostedtoolcache/go/1.16.4/x64/src/internal/race/race.go:37 +0x206
  sync.(*WaitGroup).Add()
      /opt/hostedtoolcache/go/1.16.4/x64/src/sync/waitgroup.go:71 +0x219
  go.k6.io/k6/lib/executor.(*activeVUPool).AddVU()
      /home/runner/work/k6/k6/lib/executor/ramping_arrival_rate.go:519 +0x64
  go.k6.io/k6/lib/executor.RampingArrivalRate.Run.func3()
      /home/runner/work/k6/k6/lib/executor/ramping_arrival_rate.go:361 +0x290
  go.k6.io/k6/lib/executor.RampingArrivalRate.Run.func4()
      /home/runner/work/k6/k6/lib/executor/ramping_arrival_rate.go:379 +0x571

Previous write at 0x00c000062078 by goroutine 34:
  internal/race.Write()
      /opt/hostedtoolcache/go/1.16.4/x64/src/internal/race/race.go:41 +0x125
  sync.(*WaitGroup).Wait()
      /opt/hostedtoolcache/go/1.16.4/x64/src/sync/waitgroup.go:128 +0x126
  go.k6.io/k6/lib/executor.(*activeVUPool).Close()
      /home/runner/work/k6/k6/lib/executor/ramping_arrival_rate.go:535 +0x77
  go.k6.io/k6/lib/executor.RampingArrivalRate.Run.func1()
      /home/runner/work/k6/k6/lib/executor/ramping_arrival_rate.go:339 +0x54
  go.k6.io/k6/lib/executor.RampingArrivalRate.Run()
      /home/runner/work/k6/k6/lib/executor/ramping_arrival_rate.go:481 +0x20d0
  go.k6.io/k6/lib/executor.(*RampingArrivalRate).Run()
      <autogenerated>:1 +0x127
  go.k6.io/k6/lib/executor.TestRampingArrivalRateRunUnplannedVUs()
      /home/runner/work/k6/k6/lib/executor/ramping_arrival_rate_test.go:189 +0x7ee
  testing.tRunner()
      /opt/hostedtoolcache/go/1.16.4/x64/src/testing/testing.go:1193 +0x202

As far as I can see, the waitgroup was empty at the time: a VU wasn't executing yet but was just starting to, and it raced between the Add and the Close. This change makes certain that we no longer try to Add while we Close.

Suggested change
vusPool.Close()
// Make sure all VUs aren't executing iterations anymore, for the cancel()
// below to deactivate them.
<-returnedVUs
cancel()
activeVUsWg.Wait()
}()
// Make sure all VUs aren't executing iterations anymore, for the cancel()
// below to deactivate them.
<-returnedVUs
cancel()
vusPool.Close()
activeVUsWg.Wait()
}()
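For reference, the sync.WaitGroup rule behind this race is that an Add with a positive delta must happen before Wait whenever the counter may be at zero; the suggestion above restores that ordering by guaranteeing no AddVU can start once Close (which calls Wait) has begun. A minimal standalone sketch of the safe shape, with hypothetical names:

package sketch

import "sync"

// runAll starts one goroutine per task and waits for all of them. Every
// wg.Add happens on this goroutine, strictly before wg.Wait, which is the
// ordering sync.WaitGroup requires; letting another goroutine call Add while
// Wait may already be running (with the counter at zero) is exactly the race
// reported above.
func runAll(tasks []func()) {
    var wg sync.WaitGroup
    for _, task := range tasks {
        wg.Add(1) // Add before the worker starts and before Wait
        go func(task func()) {
            defer wg.Done()
            task()
        }(task)
    }
    wg.Wait()
}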

@mstoykov mstoykov merged commit 5fe5e5a into grafana:master May 21, 2021
@codebien codebien deleted the 1944 branch May 21, 2021 12:31
mstoykov added a commit that referenced this pull request May 21, 2021
originally discussed in this comment
#1957 (comment)
mstoykov added a commit that referenced this pull request May 21, 2021
harrytwigg pushed a commit to APITeamLimited/globe-test that referenced this pull request Jan 11, 2023
originally discussed in this comment
grafana/k6#1957 (comment)

Successfully merging this pull request may close these issues.

Optimize the ramping-arrival-rate executor
6 participants