Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed deadlock and crash when syncing with full dataset on Windows #8810

Merged
merged 1 commit into from
Apr 3, 2023

Conversation

SChernykh
Copy link
Contributor

@SChernykh SChernykh commented Mar 29, 2023

It's not allowed to use WaitForSingleObject with _beginthread, because the thread closes its own handle before exiting.

So the wait function will either wait on an invalid handle, or on a different handle used by something else.

Or, if it starts waiting before the thread exits, the behavior is undefined according to MS: "If this handle is closed while the wait is still pending, the function's behavior is undefined."

In my test sync I observed threads getting stuck infinitely on WaitForSingleObject, and then rx_set_main_seedhash spamming new threads when RandomX seed changes again. Eventually the system ran out of resources, and monerod aborted with "Couldn't start RandomX seed thread" message. See the attached screenshot of how it looked like before it crashed (note number of threads and handles).
image

This PR fixes it by using _beginthreadex instead and explicitly closing the handle when it's safe.

It's not allowed to use WaitForSingleObject with _beginthread, because the thread closes its own handle before exiting.

So the wait function will either wait on an invalid handle, or on a different handle used by something else.

Or, if it starts waiting before the thread exits, the behavior is undefined according to MS: "If this handle is closed while the wait is still pending, the function's behavior is undefined."

In my test sync I observed threads getting stuck infinitely on WaitForSingleObject, and then rx_set_main_seedhash spamming new threads when RandomX seed changes again. Eventually the system ran out of resources, and monerod aborted with "Couldn't start RandomX seed thread" message.

This PR fixes it by using `_beginthreadex` instead and explicitly closing the handle when it's safe.
@SChernykh
Copy link
Contributor Author

@hyc you should review this.

@hyc
Copy link
Collaborator

hyc commented Mar 29, 2023

Looks good.

Copy link

@krypt0x krypt0x left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks alright!

@luigi1111 luigi1111 merged commit 44ac52f into monero-project:master Apr 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants