[mono] Fix deadlock in static constructor initialization #93875

lambdageek · 2023-10-23T15:51:55Z

If two threads (A and B) need to call the static constructors for 3 classes X, Y, Z in this order:

Thread A: X { Z, Y }
Thread B: Y { Z, X }

where the cctor for X in thread A invokes the cctors for Z and Y, and the cctor for Y in thread B invokes the cctors for Z and X, then we can get into a situation where A and B both start the cctors for X and Y (so they will be in the type_initialization_hash for those two classes) and then both try to init Z. In that case it could be that A becomes responsible for initializing Z and B blocks. Then A could finish initializing Z while B has not woken up yet (and so B is still in blocked_thread_hash waiting for Z). At that point A (which by now needs to init Y) may think that it can wait for B to finish initializing Y. That is, we get to `mono_coop_cond_timedwait (&lock->cond, &lock->mutex, timeout_ms)` with "lock" being the lock for Y. But in fact thread B will not be able to complete initializing Y because it will attempt to init X next, and meanwhile we did not indicate that thread A is blocked. As a result, the timed wait in thread A will eventually time out. In that case we need to go back to the top, where we now correctly detect that A is waiting for Y and B is waiting for X. (At that point there is a cctor deadlock, and ECMA rules allow one of the threads to return without calling the cctor.)
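To make the interleaving easier to follow, here is a small single-threaded model of the bookkeeping described above. It is only a sketch, not the code in object.c: `CctorModel`, `try_wait`, `WaitDecision` and the array layout are hypothetical names invented for illustration, with `owner`/`done` standing in for type_initialization_hash and `blocked_on` standing in for blocked_thread_hash.

```c
#include <assert.h>
#include <stdbool.h>

enum { NO_THREAD = -1, NO_CLASS = -1, MAX_CLASSES = 8, MAX_THREADS = 8 };

typedef enum {
    WAIT_DEADLOCK,   /* cycle found: the caller may return without running the cctor */
    WAIT_RECORDED,   /* caller waits and is recorded as blocked */
    WAIT_UNRECORDED  /* caller waits but is NOT recorded (stale entry seen) */
} WaitDecision;

/* Hypothetical model of the shared state (assumption, not mono's layout). */
typedef struct {
    int  owner[MAX_CLASSES];      /* class -> thread running its cctor */
    bool done[MAX_CLASSES];       /* class -> cctor finished */
    int  blocked_on[MAX_THREADS]; /* thread -> class it is waiting for */
} CctorModel;

static void model_init(CctorModel *m)
{
    for (int c = 0; c < MAX_CLASSES; c++) {
        m->owner[c] = NO_THREAD;
        m->done[c] = false;
    }
    for (int t = 0; t < MAX_THREADS; t++)
        m->blocked_on[t] = NO_CLASS;
}

/* Thread `self` wants class `wanted`, whose cctor another thread is running.
 * Follow the chain of blocked threads starting at the owner of `wanted`. */
static WaitDecision try_wait(CctorModel *m, int self, int wanted)
{
    assert(wanted >= 0 && wanted < MAX_CLASSES);
    bool record = true;
    int t = m->owner[wanted];
    for (int hops = 0; hops < MAX_THREADS && t != NO_THREAD; hops++) {
        int cls = m->blocked_on[t];
        if (cls == NO_CLASS)
            break;                  /* t is running, not blocked */
        if (m->owner[cls] == self) {
            if (!m->done[cls])
                return WAIT_DEADLOCK;
            record = false;         /* t waits on a finished cctor; it just has not woken up */
            break;
        }
        t = m->owner[cls];
    }
    if (record)
        m->blocked_on[self] = wanted;
    return record ? WAIT_RECORDED : WAIT_UNRECORDED;
}
```

In this model, A's first attempt to wait for Y comes back as `WAIT_UNRECORDED`, so when B later asks for X it sees nothing that says A is blocked and starts waiting too. Only a second pass by A, after B has recorded itself, returns `WAIT_DEADLOCK`, which is why A has to get back to the top instead of waiting forever.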

The old code here used to do an infinite wait:

  while (!lock->done)
    mono_coop_cond_wait (&lock->cond, &lock->mutex);

This cannot succeed because "lock" (in thread A it's the lock for Y) will not be signaled since B (who is supposed to init Y) will instead block on the cctor for X.
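For contrast with the infinite wait, a bounded wait can be sketched roughly as below. This uses plain POSIX pthreads rather than mono's coop wrappers, and `InitLock`, `init_lock_setup`, `init_lock_mark_done` and `init_lock_wait_once` are hypothetical names for illustration only; the real change lives in object.c and may differ in detail.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for a per-class initialization lock. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    bool            done; /* set once the cctor has finished */
} InitLock;

static void init_lock_setup(InitLock *lock)
{
    pthread_mutex_init(&lock->mutex, NULL);
    pthread_cond_init(&lock->cond, NULL);
    lock->done = false;
}

/* Called by the initializing thread when the cctor completes. */
static void init_lock_mark_done(InitLock *lock)
{
    pthread_mutex_lock(&lock->mutex);
    lock->done = true;
    pthread_cond_broadcast(&lock->cond);
    pthread_mutex_unlock(&lock->mutex);
}

/* Wait at most timeout_ms for the cctor to finish and report whether it did.
 * The caller is expected to loop and redo its deadlock check on `false`. */
static bool init_lock_wait_once(InitLock *lock, int timeout_ms)
{
    struct timespec deadline;
    assert(timeout_ms >= 0);

    /* pthread_cond_timedwait takes an absolute deadline, not a duration. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (long)(timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&lock->mutex);
    if (!lock->done) {
        /* Return value ignored on purpose: timeout, signal and spurious
         * wakeup are all handled by re-reading `done` below. */
        pthread_cond_timedwait(&lock->cond, &lock->mutex, &deadline);
    }
    bool done = lock->done;
    pthread_mutex_unlock(&lock->mutex);
    return done;
}
```

A caller would keep calling `init_lock_wait_once` until it returns true, re-running the blocked-thread check between attempts. A slow but healthy initializer is still waited for across several timeouts, while a genuine cycle gets noticed on the next pass. The timeout value itself is arbitrary in this sketch.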

Fixes #93778

lambdageek · 2023-10-24T20:23:36Z

/backport to release/8.0-staging

github-actions · 2023-10-24T20:23:50Z

Started backporting to release/8.0-staging: https://github.com/dotnet/runtime/actions/runs/6632161553

lambdageek · 2023-10-24T20:37:51Z

I wonder if I need a test where a static cctor dependency takes a long time to unblock (longer than the timeout that I added here). Just to verify that I didn't break the ability for one thread to wait for another to complete.

src/mono/mono/metadata/object.c

lateralusX

LGTM! Implementation details look reasonable, and since the PR just switches from an infinite wait to a timed wait with a retry loop, following the current patterns, it shouldn't introduce any new side effects.

lambdageek · 2023-10-25T12:52:15Z

/backport to release/8.0-staging

github-actions · 2023-10-25T12:52:30Z

Started backporting to release/8.0-staging: https://github.com/dotnet/runtime/actions/runs/6640645768

* [mono] Fix deadlock in static constructor initialization
* Add test case
* remove prototyping log messages
* disable mt test on wasm
* code review feedback

…alization (#93943)

Backport of #93875 to release/8.0-staging

* [mono] Fix deadlock in static constructor initialization
* Add test case
* remove prototyping log messages
* disable mt test on wasm
* better issues.target exclusion
* code review feedback

Co-authored-by: Aleksey Kliger <alklig@microsoft.com>

dotnet-issue-labeler bot added the area-VM-meta-mono label Oct 23, 2023

ghost assigned lambdageek Oct 23, 2023

lambdageek added area-VM-threading-mono and removed area-VM-meta-mono labels Oct 23, 2023

build-analysis bot mentioned this pull request Oct 23, 2023

Test failure: Wasm.Build.Tests.NativeBuildTests.MonoAOTCross_WorksWithNoTrimming #93522

Closed

Add test case

2194668

lambdageek marked this pull request as ready for review October 23, 2023 19:05

lambdageek requested review from vargaz and thaystg as code owners October 23, 2023 19:05

remove prototyping log messages

ddebe93

lambdageek requested review from BrzVlad and lateralusX October 23, 2023 20:30

disable mt test on wasm

5e4207d

lambdageek force-pushed the fix-gh-93778 branch from ae29ce2 to 5e4207d Compare October 24, 2023 15:37

better issues.target exclusion

1f55c2b

build-analysis bot mentioned this pull request Oct 24, 2023

Checkout failure: "Git fetch failed with exit code 128" dotnet/arcade#9009

Open

2 tasks

github-actions bot mentioned this pull request Oct 24, 2023

[release/8.0-staging] [mono] Fix deadlock in static constructor initialization #93943

Merged

vargaz reviewed Oct 25, 2023

src/mono/mono/metadata/object.c Outdated

vargaz approved these changes Oct 25, 2023

lateralusX approved these changes Oct 25, 2023

code review feedback

56a260d

build-analysis bot mentioned this pull request Oct 25, 2023

MSBuild crashing in the build #92290

Open

lambdageek merged commit 3cd6455 into dotnet:main Oct 25, 2023
160 checks passed

ghost locked as resolved and limited conversation to collaborators Nov 24, 2023
