After starting test from UI, state shows as STOPPED even though it's running #1535

Closed
max-rocket-internet opened this issue Aug 20, 2020 · 17 comments

Comments

@max-rocket-internet
Contributor

Describe the bug

After starting test from UI, state shows as STOPPED even though it's running.

If I click start again, the status is correctly updated to SPAWNING.

[Screenshot: Screen Shot 2020-08-20 at 15 32 25]

Expected behavior

Status should show as "SPAWNING" I believe.

Steps to reproduce

Just start test from UI.

Environment

  • OS: k8s
  • Locust version: 1.2.1
@max-rocket-internet
Contributor Author

I can't really reproduce this reliably though 🤔

@max-rocket-internet
Contributor Author

max-rocket-internet commented Aug 20, 2020

bash-5.0$ curl http://locust-crs:8089/swarm --data-raw 'user_count=20000&spawn_rate=20'
{
  "host": "https://01.ldt.xxxx-yyyy.com",
  "message": "Swarming started",
  "success": true
}

# Wait 10 seconds

bash-5.0$ curl -s http://locust-crs:8089/stats/requests | grep -m 1 state
  "state": "stopped",

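For anyone who wants to script the check above instead of running curl by hand, here is a minimal sketch using only the Python standard library. The endpoints (/swarm, /stats/requests), the form fields, and the "state" key are taken from the curl session above; the function names, the polling helper, and its defaults are hypothetical and only for illustration.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "http://locust-crs:8089"  # host copied from the curl commands above


def start_swarm(user_count, spawn_rate, base_url=BASE_URL):
    """POST the same form fields as the curl call to /swarm and return the JSON reply."""
    data = urllib.parse.urlencode(
        {"user_count": user_count, "spawn_rate": spawn_rate}
    ).encode()
    with urllib.request.urlopen(f"{base_url}/swarm", data=data, timeout=10) as resp:
        return json.load(resp)


def runner_state(stats_json):
    """Extract the "state" field from a /stats/requests response body."""
    return json.loads(stats_json)["state"]


def fetch_state(base_url=BASE_URL):
    """GET /stats/requests and return the reported runner state."""
    with urllib.request.urlopen(f"{base_url}/stats/requests", timeout=10) as resp:
        return runner_state(resp.read().decode())


def wait_until_not_stopped(fetch=fetch_state, attempts=10, delay=1.0):
    """Poll until the state is something other than "stopped" (hypothetical helper)."""
    for _ in range(attempts):
        state = fetch()
        if state != "stopped":
            return state
        time.sleep(delay)
    return "stopped"
```

Calling start_swarm(20000, 20) followed by wait_until_not_stopped() would mirror the manual steps; if the helper still returns "stopped" after the swarm was accepted, that matches the symptom reported here.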
@cyberw
Collaborator

cyberw commented Aug 21, 2020

Are you using load shapes? (Just guessing here. I don't know this part of the code base, so you're on your own :)

@max-rocket-internet
Contributor Author

Are you using load shapes?

Nope

@max-rocket-internet
Contributor Author

I'm in a good position to debug it myself but thought I'd post an issue in case others had the same problem.

@mboutet
Contributor

mboutet commented Aug 24, 2020

I also experienced this a few times and came to the conclusion that it was happening when the workers were overloaded (high CPU usage). In my case, the workers were performing some blocking tasks on start (random text generation). To prevent this, I limited the number of users per worker and also sprinkled some gevent.sleep(0) calls into the blocking code so that the event loop is not completely blocked.
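A rough illustration of the gevent.sleep(0) approach described above, not the actual code from that setup: the function name, its parameters, and the yield interval are made up for the example. The import falls back to time.sleep only so the sketch runs where gevent is not installed; under Locust, gevent is available and gevent.sleep(0) is what hands control back to the event loop.

```python
import random
import string

try:
    # Under Locust this is the real cooperative yield point.
    from gevent import sleep as cooperative_sleep
except ImportError:
    # Fallback so the sketch still runs without gevent installed.
    from time import sleep as cooperative_sleep


def generate_texts(count, length=200, yield_every=100):
    """Build `count` random strings, yielding to other greenlets periodically."""
    texts = []
    for i in range(count):
        texts.append("".join(random.choices(string.ascii_letters, k=length)))
        if (i + 1) % yield_every == 0:
            # sleep(0) does not wait; it only lets other greenlets run
            # before this CPU-bound loop continues.
            cooperative_sleep(0)
    return texts
```

A smaller yield_every gives other greenlets more chances to run at the cost of a little more switching overhead; the right value depends on how heavy each iteration is.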

@cyberw
Collaborator

cyberw commented Sep 5, 2020

@max-rocket-internet did you manage to figure it out?

@max-rocket-internet
Contributor Author

did you manage to figure it out?

Not yet! But we are seeing this every day. We might need to roll back to a previous version. I'm still debugging it.

@cyberw
Collaborator

cyberw commented Sep 21, 2020

Perhaps it only happens with a large number of workers? Were you using an on_stop method? (In that case, try running the latest master with the above-mentioned fix.)

Without more details I think I'll have to close this.

@max-rocket-internet
Contributor Author

Were you using an on_stop method?

I just checked all our tests, none of them use on_stop.

I'm still looking into it. I can reproduce it only once in a while 😐

@cyberw
Collaborator

cyberw commented Sep 24, 2020

I'm not sure if there is any debug logging surrounding this logic, but I'd recommend running locust with -L DEBUG and checking the log output when the issue occurs (possibly adding some logging if the existing output is not enough).

@max-rocket-internet
Contributor Author

Yeah I added --loglevel DEBUG but it's not enough. I then added my own debug logging but still can't reproduce it reliably.

@cyberw
Collaborator

cyberw commented Sep 25, 2020

sneaky...

@max-rocket-internet
Contributor Author

Hmm, got debug logs from today, but it doesn't look like they help much:

[2020-10-14 11:58:10,144] locust-xxx-master-547544d45d-cq2gb/INFO/locust.main: Starting web interface at http://0.0.0.0:8089 (accepting connections from all network interfaces)
[2020-10-14 11:58:10,152] locust-xxx-master-547544d45d-cq2gb/INFO/locust.main: Starting Locust 1.2.3
[2020-10-14 11:58:13,443] locust-xxx-master-547544d45d-cq2gb/INFO/locust.runners: Client 'locust-xxx-worker-5b949b4c64-s2ngr_d1b34715c5054995a007b260f1b7db52' reported as ready. Currently 1 clients ready to swarm.
...
[2020-10-14 11:58:14,470] locust-xxx-master-547544d45d-cq2gb/INFO/locust.runners: Client 'locust-xxx-worker-5b949b4c64-n66h7_54b1cb9170134ab4a7282763ccaa782c' reported as ready. Currently 75 clients ready to swarm.
[2020-10-14 11:58:47,101] locust-xxx-master-547544d45d-cq2gb/INFO/locust.runners: Shape test starting. User count and spawn rate are ignored for this type of load test
[2020-10-14 11:58:47,101] locust-xxx-master-547544d45d-cq2gb/DEBUG/locust.runners: Updating state to 'ready', old state was 'ready'
[2020-10-14 11:58:47,102] locust-xxx-master-547544d45d-cq2gb/INFO/locust.runners: Shape worker starting
[2020-10-14 11:58:47,102] locust-xxx-master-547544d45d-cq2gb/INFO/locust.runners: Shape test updating to 25000 users at 13.00 spawn rate
[2020-10-14 11:58:47,102] locust-xxx-master-547544d45d-cq2gb/INFO/locust.runners: Sending spawn jobs of 333 users and 0.17 spawn rate to 75 ready clients
[2020-10-14 11:58:47,102] locust-xxx-master-547544d45d-cq2gb/DEBUG/locust.runners: Sending spawn message to client locust-xxx-worker-5b949b4c64-s2ngr_d1b34715c5054995a007b260f1b7db52
...
[2020-10-14 11:58:47,109] locust-xxx-master-547544d45d-cq2gb/DEBUG/locust.runners: Sending spawn message to client locust-xxx-worker-5b949b4c64-n66h7_54b1cb9170134ab4a7282763ccaa782c
[2020-10-14 11:58:47,110] locust-xxx-master-547544d45d-cq2gb/DEBUG/locust.runners: Updating state to 'spawning', old state was 'ready'
[2020-10-14 11:58:47,110] locust-xxx-master-547544d45d-cq2gb/DEBUG/locust.runners: Updating state to 'stopped', old state was 'spawning'
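As a side note on the "Sending spawn jobs" line above, the per-worker figures are consistent with an even split of the target across the 75 ready workers. The sketch below only reproduces that arithmetic; the function name is hypothetical, and the use of integer division for users is an assumption inferred from the logged numbers, not a statement about Locust's implementation.

```python
def split_load(user_count, spawn_rate, num_workers):
    """Divide a target load evenly across workers.

    Assumption: users are split with integer division and the remainder
    is tracked separately; the spawn rate is split as a plain float.
    """
    users_per_worker = user_count // num_workers
    leftover_users = user_count % num_workers
    rate_per_worker = spawn_rate / num_workers
    return users_per_worker, leftover_users, rate_per_worker


# Values from the log: 25000 users at 13.00 spawn rate, 75 ready clients.
users, leftover, rate = split_load(25000, 13.0, 75)
print(f"{users} users and {rate:.2f} spawn rate per worker, {leftover} users left over")
```

With the logged inputs this gives 333 users and a 0.17 spawn rate per worker, which matches the log line, with 25 users not covered by the even split.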

🤔
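Since the issue is hard to reproduce, one option is to scan collected master logs for the pattern visible above, where the state goes from 'spawning' to 'stopped' almost immediately. This is a minimal sketch: the regular expression is based only on the "Updating state to ..." lines shown in the excerpt, and the helper names and the 100 ms threshold are arbitrary choices for illustration.

```python
import re
from datetime import datetime

# Matches lines such as:
# [2020-10-14 11:58:47,110] host/DEBUG/locust.runners: Updating state to 'stopped', old state was 'spawning'
STATE_RE = re.compile(
    r"\[(?P<ts>[\d-]+ [\d:,]+)\].*"
    r"Updating state to '(?P<new>\w+)', old state was '(?P<old>\w+)'"
)

SAMPLE = [
    "[2020-10-14 11:58:47,101] locust-xxx-master-547544d45d-cq2gb/DEBUG/locust.runners: Updating state to 'ready', old state was 'ready'",
    "[2020-10-14 11:58:47,102] locust-xxx-master-547544d45d-cq2gb/INFO/locust.runners: Shape worker starting",
    "[2020-10-14 11:58:47,110] locust-xxx-master-547544d45d-cq2gb/DEBUG/locust.runners: Updating state to 'spawning', old state was 'ready'",
    "[2020-10-14 11:58:47,110] locust-xxx-master-547544d45d-cq2gb/DEBUG/locust.runners: Updating state to 'stopped', old state was 'spawning'",
]


def state_transitions(lines):
    """Return (timestamp, old_state, new_state) for each state-change log line."""
    found = []
    for line in lines:
        match = STATE_RE.search(line)
        if match:
            ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S,%f")
            found.append((ts, match["old"], match["new"]))
    return found


def suspicious_stops(transitions, max_gap_ms=100):
    """Flag changes to 'stopped' that follow a change to 'spawning' within max_gap_ms."""
    flagged = []
    for prev, cur in zip(transitions, transitions[1:]):
        gap_ms = (cur[0] - prev[0]).total_seconds() * 1000
        if prev[2] == "spawning" and cur[2] == "stopped" and gap_ms <= max_gap_ms:
            flagged.append(cur)
    return flagged


for ts, old, new in suspicious_stops(state_transitions(SAMPLE)):
    print(f"{ts}: {old} -> {new} right after spawning started")
```

Running this over the full master log from a bad day would at least show how often the quick 'spawning' to 'stopped' flip happens and at what times, which may help correlate it with worker load.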

@max-rocket-internet
Contributor Author

Maybe #1726 fixes this also 🤔

@cyberw
Collaborator

cyberw commented Mar 16, 2021

Hmm... maybe. Ok to close as invalid? We can reopen if it was unrelated. And if it is the same, then the info in the other ticket is more useful anyway.

@max-rocket-internet
Contributor Author

Ok to close as invalid? We can reopen if it was unrelated

Yeah sure. I still see it some days. I tried many times to reproduce it, no luck 😐

cyberw added the invalid label Mar 16, 2021
cyberw closed this as completed Mar 16, 2021