Potential asyncio.create_task issue #7299

Closed
drew2a opened this issue Feb 21, 2023 · 3 comments · Fixed by #7300

@drew2a
Contributor

drew2a commented Feb 21, 2023

The Tribler codebase could be affected by this bug:

If you have ever used asyncio.create_task you may have created a bug for yourself that is challenging (read almost impossible) to reproduce. If it occurs, your code will likely fail in unpredictable ways.

The root cause of this Heisenbug is that if you don't hold a reference to the task object returned by create_task then the task may disappear without warning when Python runs garbage collection. In other words, the code in your task will stop running with no obvious indication why.

https://textual.textualize.io/blog/2023/02/11/the-heisenbug-lurking-in-your-async-code/
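
For reference, here is a minimal sketch of the pattern the Python documentation for asyncio.create_task suggests: keep a strong reference to each task in a collection and drop it once the task is done. The names `background_tasks`, `spawn`, `work`, and `main` are hypothetical and are not taken from the Tribler codebase.

```python
import asyncio

# Strong references to in-flight tasks (hypothetical module-level registry).
background_tasks = set()


def spawn(coro):
    """Schedule `coro` and keep a reference to its task until it is done."""
    task = asyncio.create_task(coro)
    background_tasks.add(task)
    # Drop the reference when the task finishes so the set does not grow forever.
    task.add_done_callback(background_tasks.discard)
    return task


async def work(results, value):
    # Stand-in for real work that suspends at least once.
    await asyncio.sleep(0.01)
    results.append(value)


async def main():
    results = []
    for i in range(3):
        spawn(work(results, i))
    # Wait for everything that is still registered.
    await asyncio.gather(*background_tasks)
    return sorted(results)
```

Calling `asyncio.run(main())` runs the three tasks to completion. The registry only holds tasks while they are pending, so it does not need separate cleanup.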

@drew2a drew2a added this to the 7.13.0 milestone Feb 21, 2023
@drew2a drew2a self-assigned this Feb 21, 2023
@qstokkink
Contributor

We have run into this in the past. However, if you schedule your calls using IPv8's TaskManager this is automatically taken care of for you.

@drew2a
Contributor Author

drew2a commented Feb 21, 2023

@qstokkink yep, thank you. For most asyncio scheduling use cases in Tribler, we can use the lightweight AsyncGroup: https://github.com/Tribler/tribler/blob/main/src/tribler/core/utilities/async_group.py

It also solves the described problem.
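
To illustrate the general idea of a group that owns its tasks, here is a rough sketch. It is not the actual AsyncGroup implementation linked above; the class name `TaskOwnerSketch` and its methods are assumptions made for this example.

```python
import asyncio


class TaskOwnerSketch:
    """Hypothetical helper: owns tasks until they finish or are cancelled."""

    def __init__(self):
        self._tasks = set()  # strong references keep pending tasks alive

    def add_task(self, coro):
        task = asyncio.ensure_future(coro)
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        return task

    async def wait(self):
        # Block until every task currently in the group has finished.
        if self._tasks:
            await asyncio.wait(self._tasks)

    async def cancel(self):
        pending = [t for t in self._tasks if not t.done()]
        for t in pending:
            t.cancel()
        # Let cancelled tasks unwind; CancelledError is returned, not raised.
        await asyncio.gather(*pending, return_exceptions=True)
        return pending


async def demo():
    group = TaskOwnerSketch()
    done = []

    async def quick(i):
        await asyncio.sleep(0)
        done.append(i)

    async def forever():
        await asyncio.Event().wait()  # never set, so this waits until cancelled

    for i in range(3):
        group.add_task(quick(i))
    await group.wait()
    stuck = group.add_task(forever())
    cancelled = await group.cancel()
    return sorted(done), stuck.cancelled(), len(cancelled)
```

The caller never has to store the task objects returned by `add_task`; the group keeps them referenced for as long as they are pending, and `cancel` gives a single place to stop them on shutdown.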

@kozlovsky
Contributor

I want to add a note for the future, just in case:

I think asyncio's use of weak references here is a bug (one that will probably not be fixed). The correct behavior would be for the loop to own the task until the task is done.

I found this interesting description of the current asyncio behavior:

In the source code, when a task is created, it is immediately registered as ready in an internal running loop structure - that is a hard reference - when the loop cycles through on iteration and fails to call any task there, this hard reference is dropped. That this "dropping" is not deterministic maybe is a bug that could be fixed.

The example that reproduces the bug comes with the following comment:

For anecdote, I experimented around, and the results are really nasty with random task-drops starting at around 2500 concurrent tasks with this code. Including an await asyncio.sleep(0) in the loop where the tasks are created makes it all run flawlessly up to millions of tasks.

If our current fix with AsyncGroup turns out to be insufficient, we could also add a generic workaround, such as an await asyncio.sleep(0) in the right place.
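
As a rough sketch of what that workaround could look like, the loop below yields to the event loop after each create_task call. The function name `spawn_many` and the inner `bump` coroutine are hypothetical. The sketch still keeps references to the tasks, since that is what the Python documentation recommends; the `asyncio.sleep(0)` is only the extra yield described in the quoted anecdote, not a guarantee against dropped tasks.

```python
import asyncio


async def spawn_many(n):
    """Create n small tasks, yielding to the event loop after each one."""
    counter = {"done": 0}

    async def bump():
        counter["done"] += 1

    tasks = []  # references are kept in addition to the yield below
    for _ in range(n):
        tasks.append(asyncio.create_task(bump()))
        # Give the loop a chance to run the task that was just scheduled.
        await asyncio.sleep(0)
    await asyncio.gather(*tasks)
    return counter["done"]
```

For example, `asyncio.run(spawn_many(1000))` schedules and completes 1000 tasks, each of which gets a chance to run shortly after it is created.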
