Tests sometimes hang #61

Marenz · 2024-09-30T13:32:11Z

What happened?

Sometimes the tests hang. Here is a log of the problem: https://gist.github.com/Marenz/1ace8c7c0ccf01db70ceee8f767bb6f9#file-different-eventloop-py-L188

Notes:

Thread is always the same when the problem appears
Possibly related to channels / Timer?
Works when using the old event_loop replacement method

What did you expect instead?

.

Affected version(s)

No response

Affected part(s)

Build script, CI, dependencies, etc. (part:tooling)

Extra information

.

llucax · 2024-10-01T07:41:10Z

This started after merging pull request #54, which introduced a timing issue with the tests. It makes them flaky on amd64 but probably consistently failing on arm64, because the CI runs on qemu, which is extremely sloooooowwww.

> Possibly related to channels / Timer?
The channels dependency was bumped in the mentioned PR via the bump of client-dispatch.

The issue seems to be that some condition variable is used in a different event loop than the one it was created in:

    | RuntimeError: <asyncio.locks.Condition object at 0x7f3d2bac0e50 [unlocked]> is bound to a different event loop

The error (at least the one about using the wrong loop) seems to always happen inside the clean-up code of select(). It might help to add some logging there, such as printing a stack trace when the select() is created and when it is being cleaned up, to see whether the two actions happen in different tests (and different loops).
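The logging idea could look roughly like the sketch below. LoopTracker is a hypothetical helper and not part of the channels library. The assumption is that one instance would be created where select() is created and check_cleanup() would be called from its clean-up code.

```python
import asyncio
import logging
import traceback

_logger = logging.getLogger(__name__)


class LoopTracker:
    """Hypothetical helper: remember where and in which loop it was created."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Keep a reference to the loop itself so it can be compared by identity.
        self.loop = asyncio.get_running_loop()
        self.created_at = "".join(traceback.format_stack(limit=8))

    def check_cleanup(self) -> bool:
        """Return True if clean-up runs in the loop the object was created in."""
        current = asyncio.get_running_loop()
        if current is self.loop:
            return True
        # Log both stack traces so the two tests involved can be identified.
        _logger.error(
            "%s created in loop %r but cleaned up in loop %r\n"
            "Created at:\n%s\nCleaned up at:\n%s",
            self.name,
            self.loop,
            current,
            self.created_at,
            "".join(traceback.format_stack(limit=8)),
        )
        return False
```

Comparing the loop objects by identity avoids relying on id() values, which can be reused after a loop is garbage collected.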

llucax · 2024-10-01T07:41:32Z

There seems to be a --setup-show flag that might help check whether we somehow have multiple loops overlapping.

Reading a bit more about pytest-asyncio, it seems there have been a few big changes in how this library is supposed to work between 0.21, 0.23 and 0.24. Probably all our code was written against the 0.21 API or older, so the issues might be connected to the upgrade to 0.24, although that was done a month ago, so I'm not sure it's that likely. But maybe this issue is just surfacing some misconfiguration or misuse of 0.24.

Here are the official migration guides. It might be worth checking them to make sure we are using pytest-asyncio properly:

Marenz added the priority:❓ and type:bug labels Sep 30, 2024

keywordlabeler bot added the part:tooling label Sep 30, 2024

llucax added the part:tests label Oct 1, 2024

llucax assigned Marenz and llucax Oct 1, 2024

llucax added the priority:high label and removed the priority:❓ label Oct 1, 2024


llucax closed this as not planned Oct 1, 2024

llucax reopened this Oct 1, 2024

This was referenced Oct 1, 2024

Flaky hanging tests after merging #54 #66

Closed

Bump types-setuptools from 74.0.0.20240831 to 75.1.0.20240917 frequenz-floss/frequenz-sdk-python#1084

Open
