Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release/8.0-staging] [mono] Fix deadlock in static constructor initialization #93943

Merged

Conversation

github-actions[bot]
Copy link
Contributor

@github-actions github-actions bot commented Oct 24, 2023

Backport of #93875 to release/8.0-staging

/cc @lambdageek

Fixes #93778

Customer Impact

Certain cyclic uses of static constructors (where a static constructor for one class may invoke a chain of static constructor calls that eventually require the static constructor of the original class to be called) between multiple threads that proceed in different order may cause the runtime to deadlock and for the app to become non-responsive. This affects mobile and WebAssembly apps.

Testing

Local testing and new CI tests

Risk

Medium. This PR replaces a concurrent wait without a time limit by a time-limited wait that followed by restarting the search for a cycle between the static constructors. In principle, this should only affect code that was previously deadlocked (and that will now not deadlock). However it could also have some unforseen effect on static constructors that previously took a long time (longer than the newly-introduced timeout) - it is possible that the new mechanism may now allow concurrent execution that was not previously allowed. This may have lead to unexpected behavior in user apps or first- and third-party libraries and frameworks.

lambdageek and others added 6 commits October 25, 2023 12:54
If two threads (A and B) need to call the static constructors for 3
classes X, Y, Z in this order:

Thread A: X, Z, Y
Thread B: Y, Z, X

where the cctor for X in thread A invokes the cctor for Z and for Y, and the
cctor for Y in thread B invokes the cctor for Z and X, then we can get
into a situation where A and B both start the cctors for X and Y (so
they will be in the type_initialization_hash for those two classes)
and then both will try to init Z.  In that case it could be that A
will be responsible for initializing Z and B will block.  Then A could
finish initializing Z but B may not have woken up yet (and so it will
be in blocked_thread_hash waiting for Z).  At that point A (who is at
this point already need to init Y) may think that it can wait for B to
finish initializing Y.  That is we get to
`mono_coop_cond_timedwait (&lock->cond, &lock->mutex, timeout_ms)`
with "lock" being the lock for `Y`.  But in fact thread B will not be
able to complete initializing Y because it will attempt to init X next
- but meanwhile we did not indicate that thread A is blocked.
As a result in thread A the timed wait will eventually timeout.  And
in this case we need to go back to the top and now correctly detect
that A is waiting for Y and B is waiting for X.  (At that point there
is a cctor deadlock and ECMA rules allow one of the threads to return
without calling the cctor)

The old code here used to do an infinite wait:
  while (!lock->done)
    mono_coop_cond_wait (&lock->cond, &lock->mutex)

This cannot succeed because "lock" (in thread A it's the lock for Y)
will not be signaled since B (who is supposed to init Y) will instead
block on the cctor for X.

Fixes #93778
@carlossanlop
Copy link
Member

/azp run

@azure-pipelines
Copy link

You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list.

@carlossanlop
Copy link
Member

Closed and reopened to re-run the CI because the base branch had a lot of wasm infra failures that have since been fixed.

@lambdageek
Copy link
Member

@vargaz can I get a review please

@simonrozsival simonrozsival added Servicing-consider Issue for next servicing release review and removed Servicing-consider Issue for next servicing release review labels Nov 8, 2023
@carlossanlop
Copy link
Member

Friendly reminder: If you'd like this to be included in the December release, please merge it before Tuesday November 14th EOD (Code Complete).

@lambdageek lambdageek added Servicing-approved Approved for servicing release and removed Servicing-consider Issue for next servicing release review labels Nov 14, 2023
@lambdageek
Copy link
Member

Approved by tactics via email

@lambdageek lambdageek merged commit dc944d5 into release/8.0-staging Nov 14, 2023
161 of 164 checks passed
@akoeplinger akoeplinger deleted the backport/pr-93875-to-release/8.0-staging branch November 14, 2023 13:52
@carlossanlop carlossanlop modified the milestones: 8.0.x, 8.0.1 Nov 16, 2023
@github-actions github-actions bot locked and limited conversation to collaborators Dec 16, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-VM-meta-mono Servicing-approved Approved for servicing release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants