Fixing issue #1961 with incorrect "All users spawned" log messages wh… #1977
Conversation
locust/runners.py (Outdated)
@@ -747,32 +747,33 @@ def start(self, user_count: int, spawn_rate: float, wait=False) -> None:
            # when the user count is really at the desired value.
            timeout = gevent.Timeout(self._wait_for_workers_report_after_ramp_up())
            timeout.start()
            msg_prefix = "All users spawned"
            try:
                while self.user_count != self.target_user_count:
                    gevent.sleep()
We might want to change this to gevent.sleep(0.01) or something, to prevent busy waiting. What do you think @mboutet ?
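A minimal sketch of the concern being discussed, using `time.sleep` as a stand-in for `gevent.sleep` (the `wait_until` helper and its parameters are illustrative, not code from the PR): with no argument, `gevent.sleep()` only yields to other greenlets, so the loop spins at full CPU; a short interval avoids that while keeping latency low.

```python
import time


def wait_until(condition, interval=0.01, timeout=1.0):
    """Poll `condition` every `interval` seconds until it is true or
    `timeout` elapses; return whether it became true.

    `time.sleep` here plays the role of `gevent.sleep` in the PR's loop:
    sleeping for ~10 ms per iteration prevents busy-waiting without
    delaying detection noticeably.
    """
    deadline = time.monotonic() + timeout
    while not condition():
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True
```

In the PR's context the condition would be something like `lambda: self.user_count == self.target_user_count`, with the overall deadline enforced by `gevent.Timeout` instead of the `timeout` parameter shown here.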
Makes sense
Would this delay the logging (as well as spawning_complete.fire) by a whole second? That is a bit unfortunate, but maybe unavoidable...
The one-second delay only happens if a worker does not report in time. In most cases 100 ms is enough. But unfortunately, when a delay does occur, it is not known how long it will take, or whether the worker will report at all.
Do you want to sleep(0.01) to reduce the CPU load while we wait?
Cool, then it is a minor issue.
Yes please. If you do that, and resolve the conflicts, I'd be happy to merge.
Of course, I'll add the edits. I think the message about report troubles will not be superfluous anyway. However, if the bug appears again, a deeper analysis will be needed to find out why the last report is sometimes missed.
Force-pushed from db48725 to 7bfe320
Force-pushed from 88ab6f9 to aa256cf
@cyberw @mboutet Line 840 in 87c3dd1
So the test fails because the master misses all workers (for ~300 ms instead of the 3-second HEARTBEAT_LIVENESS) in heartbeat_worker while it waits for reports for 1 second (it was 0.1 second before). locust/locust/test/test_runners.py Line 1897 in 87c3dd1
Hmm... I am pretty sure there is no way for a sleep to take a shorter amount of time than specified...
Yes, that's certainly not possible. It could take longer if the event loop is busy, but not less.
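The claim that a sleep never returns early can be checked directly (plain `time.sleep` shown here; since Python 3.5 it is documented to suspend for at least the requested duration):

```python
import time

start = time.monotonic()
time.sleep(0.05)  # request a 50 ms pause
elapsed = time.monotonic() - start

# A sleep may overshoot when the scheduler or event loop is busy,
# but it never returns before the requested duration has passed.
assert elapsed >= 0.05
```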
It is possible if a patch a few lines above mocks it. :) I didn't think to check that yesterday. locust/locust/test/test_runners.py Line 1896 in 87c3dd1
Yup!
Force-pushed from f53c3b2 to 8e1e461
…sages when running distributed, by increasing the wait time. Now, if the timeout for worker reports expires, log a message about it. Also mock LOCUST_WAIT_FOR_WORKERS_REPORT_AFTER_RAMP_UP with the old timeout in a test; otherwise the sleep timers in the test would have to be increased.
Force-pushed from 8e1e461 to 9fb91d4
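As a sketch of the test-side mock mentioned in the commit message: one way to pin the timeout in a test is to patch the environment variable with `unittest.mock.patch.dict`. The reader function below is hypothetical and only illustrates the pattern; it is not locust's actual parsing of this setting.

```python
import os
from unittest import mock


def wait_for_workers_report_after_ramp_up(default: float = 1.0) -> float:
    # Hypothetical reader for the setting: the environment variable,
    # when present, overrides the default wait time (in seconds).
    value = os.environ.get("LOCUST_WAIT_FOR_WORKERS_REPORT_AFTER_RAMP_UP")
    return float(value) if value is not None else default


# In a test, pin the old, shorter timeout so the test's own sleep
# timers do not have to grow with the new default; patch.dict
# restores the original environment on exit.
with mock.patch.dict(
    os.environ, {"LOCUST_WAIT_FOR_WORKERS_REPORT_AFTER_RAMP_UP": "0.1"}
):
    assert wait_for_workers_report_after_ramp_up() == 0.1
```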
Awesome. Thanks!
Issue #1961. I saw this bug on a remote machine. I can't debug there, but I think the reason is that 100 ms is sometimes not enough for workers (I guess, for the last report from the last worker) to send their reports (or maybe for the master to receive and handle them). The PR needs polish, but you can see the essence.