Arrival rate fixes #2038
Merged
I am confused by this 😕 I read your explanation in 5df0e6b, but if this extra bit of waiting helps all that much, then we probably have a bigger conceptual problem, since this is not very different from `sleep(time.Millisecond)`. Can you point me to what exactly failed when this wasn't present?
This is basically waiting for the goroutine to have started; only then does it tell you it has added a VU.

What fails are all the usual TestConstantArrivalRateRunCorrectTiming tests, as shown in https://github.com/k6io/k6/runs/2640090549?check_suite_focus=true, which, at least in my testing, happens when we hit the `for range p.iterations` while the goroutine probably hadn't even been initialized yet. This is particularly easy to reproduce with db89d213 and `go test -count 1 -cpu 1 -race -run TestConstantArrivalRateRunCorrectTiming`, specifically the `-cpu 1`, as with `-cpu 8` I in practice can't reproduce it - probably because there is always a free slot for that goroutine to start immediately, or at least fast enough.

This is because we now no longer have a "semaphore" (using the channel `activeVUs`) but instead have workers. The semaphore was initialized synchronously, while the workers are asynchronous for the most part - we just say "start this goroutine". With this change we at least wait for the "start the goroutine" part instead of hoping Go will schedule it fast enough, which is totally different from what `time.Sleep(time....)` would do.
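A minimal sketch of that pattern, for illustration only (this is not the k6 code; `addVU`, `jobs`, and `started` are made-up names): the spawning side blocks until the new goroutine signals that it is running, and only then reports the VU as added.

```go
package main

import "fmt"

// addVU is a stand-in for the real method: it starts a worker goroutine
// and blocks until that goroutine signals it is running, so the caller
// only logs "Added a VU" once the goroutine actually exists.
func addVU(jobs <-chan func()) {
	started := make(chan struct{})
	go func() {
		close(started) // signal: "the goroutine has started"
		// Note: this only proves the goroutine exists; it may not have
		// reached the range loop below yet.
		for job := range jobs {
			job()
		}
	}()
	<-started // wait for the "start the goroutine" part - not a sleep
	fmt.Println("Added a VU")
}

func main() {
	jobs := make(chan func())
	addVU(jobs)

	done := make(chan struct{})
	jobs <- func() { close(done) } // dispatch one iteration
	<-done
}
```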
My point was that if the CPU contention was so heavy, it's possible that a goroutine wouldn't be allowed to run for a long while, even after it was started, so it might not get to run the `for` loop after it closes the channel 🤷‍♂️

I get how this is different from a `Sleep`, but as far as I understand, it's not actually locking anything that raced before... So if it failed before, this doesn't guarantee it will work afterwards? I guess it would just greatly reduce the chances of a test failing, which might be good enough, though a better solution would be to improve the test, with #1986 or something else... So yeah, I don't mind this code, I just didn't understand what it solved.
The `go` statement explicitly doesn't wait for the goroutine to start. So with `-cpu 1` (which is effectively what the Windows tests do) there is one CPU that can execute one goroutine, and it is already executing the code that initializes the VUs and calls `addVU` (which starts a goroutine with the `go` statement). It then goes and says "begin an iteration", at which point there is nothing guaranteeing that the executing goroutine has yielded between calling `addVU` and this moment, so none of the `addVU` goroutines may have had any CPU to run on at all, and the `select` with `default` just goes to the `default` case. This is also not guaranteed if we sleep for an hour or something like that (although I would expect it to happen ;) ).
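A hypothetical minimal reproduction of that race (not the actual k6 code; the channel and messages here are invented): on a single CPU, the main goroutine never yields between the `go` statement and the non-blocking send, so the `select` falls through to the `default` case.

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	runtime.GOMAXPROCS(1) // the -cpu 1 case

	ch := make(chan struct{})

	// "addVU": start a worker goroutine, but do not wait for it to run.
	go func() {
		for range ch {
			fmt.Println("iteration executed")
		}
	}()

	// "begin an iteration": the main goroutine has not yielded since the
	// go statement, so the worker is almost certainly not receiving on ch
	// yet, and the non-blocking send takes the default case.
	select {
	case ch <- struct{}{}:
		fmt.Println("iteration dispatched")
	default:
		fmt.Println("no worker ready, iteration missed") // the observed failure
	}
}
```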
In the same way, nothing currently guarantees that after calling `close(ch)` we actually get to the `range p.iterations` without yielding; this just makes it a lot more likely. (To be honest, as far as I know `close(ch)` should not yield, so we should always get to `range p.iterations`, though I have no proof, so 🤷; but IMO it makes more sense to execute until you need to block instead of yielding whenever.) So much more likely that I haven't had a failure of this kind at all after these changes, and even parallelizing the tests has been more stable (having run them 5 times).

On the other hand, the change to not start a goroutine on each iteration makes it less likely that any of those goroutine starts will be a bit slow (because of GC or just CPU contention) and fail to execute an iteration within 15ms of its scheduled time, which was the original issue in #1715, as opposed to the one currently being fixed, where we just miss the first iteration.
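To illustrate that contrast, a sketch with invented names (not the executor code): spawning a goroutine per iteration pays the start-up cost on every tick, while persistent workers pay it once and each dispatch is just a channel send to a goroutine already blocked on receive.

```go
package main

import (
	"fmt"
	"sync"
)

// Old approach: a fresh goroutine per iteration. Every tick pays the
// goroutine start-up cost, so a slow start (GC, CPU contention) can push
// the iteration past its scheduled time.
func perIterationGoroutines(wg *sync.WaitGroup, n int) {
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			fmt.Println("per-goroutine iteration", i)
		}(i)
	}
}

// New approach: persistent workers started once; dispatching an iteration
// is only a channel send to a goroutine that is already waiting for work.
func startWorkers(wg *sync.WaitGroup, jobs <-chan int, workers int) {
	for w := 0; w < workers; w++ {
		go func() {
			for i := range jobs {
				fmt.Println("worker iteration", i)
				wg.Done()
			}
		}()
	}
}

func main() {
	var wg sync.WaitGroup

	perIterationGoroutines(&wg, 3)
	wg.Wait()

	jobs := make(chan int)
	startWorkers(&wg, jobs, 2)
	for i := 0; i < 3; i++ {
		wg.Add(1)
		jobs <- i
	}
	close(jobs)
	wg.Wait()
}
```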
While I think #1986 will make this even less likely to fail in the test, it will not fix the underlying problem that this patch tries to fix in a production call, as there will still be nothing to guarantee that the goroutine is waiting on the channel to receive the "start an iteration" event. But it will mask it ;). Also, to be honest, given the contention on `time.` calls that I see on every profile I make, I expect that adding an interface call in the middle will have... detrimental outcomes for performance, but that will need to be checked whenever someone finds the time to rewrite the code to use it.