Fix crash due to concurrent writes to frameCallbacks vector #3859

Merged 2 commits on Dec 15, 2022


kmagiera
Member

Summary

This PR fixes crashes caused by concurrent writes to the non-thread-safe frameCallbacks vector. The primary role of the frameCallbacks vector is to facilitate requestAnimationFrame calls, so it was not designed to let threads other than the main thread append callbacks. However, in the recent rewrite (#3722) we introduced a new method, "scheduleOnUI", that was originally meant to interface with a thread-safe scheduler API but was later updated (see c8a77da) to use frameCallbacks so that errors would be handled through the same code path that rAF uses. That change introduced a bug in which the vector is accessed and modified from different threads without synchronization, which is a data race and leads to the crashes. This PR reverts that change; since #3846 provides a better way of handling JS errors, scheduleOnUI callbacks no longer need to go through the requestAnimationFrame code path.
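To make the role of such a callback list concrete, here is a minimal single-threaded sketch of a requestAnimationFrame-style registry. It is not the Reanimated source: `FrameCallbackRegistry`, `requestFrame`, and the array-based storage are hypothetical stand-ins, and only the names `frameCallbacks` and `onRender` echo identifiers that appear in this PR and its crash traces. TypeScript cannot reproduce the race itself; the point is that nothing in this structure protects it if a second thread were to append while `onRender` is taking the pending callbacks.

```typescript
// Hypothetical names throughout; this is an illustration, not the library code.
type FrameCallback = (timestampMs: number) => void;

class FrameCallbackRegistry {
  // Plain array with no locking: only safe if a single thread ever touches it.
  private frameCallbacks: FrameCallback[] = [];

  // Analogue of requestAnimationFrame: queue a callback for the next frame.
  requestFrame(callback: FrameCallback): void {
    this.frameCallbacks.push(callback);
  }

  // Analogue of a per-frame render hook: take the pending callbacks, then run them.
  onRender(timestampMs: number): void {
    const pending = this.frameCallbacks;
    // Callbacks queued while this frame is running are deferred to the next frame.
    this.frameCallbacks = [];
    for (const callback of pending) {
      callback(timestampMs);
    }
  }
}

const registry = new FrameCallbackRegistry();
const calls: string[] = [];
registry.requestFrame((t) => {
  calls.push(`first@${t}`);
  // Re-registering from inside a callback is the usual rAF animation loop.
  registry.requestFrame((t2) => calls.push(`nested@${t2}`));
});
registry.onRender(16);
registry.onRender(32);
```

In native code the equivalent of `requestFrame` and `onRender` would both have to run on the same thread, or the list would need a lock or a separate thread-safe queue for cross-thread scheduling.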

Test plan

The easiest way we found to reproduce the crash was to run BokehExample.tsx on Android, which would normally crash after a couple of seconds. With an increased number of circles in that example (e.g. 400), the crash was almost immediate. This change was tested on that example with 400 circles, and the crash did not appear even a few minutes after launch.

@Willham12

Is this the same issue?

Fatal Exception: java.lang.RuntimeException: vector
       at com.swmansion.reanimated.Scheduler.triggerUI(Scheduler.java)
       at com.swmansion.reanimated.Scheduler$1.run(Scheduler.java:14)
       at com.swmansion.reanimated.Scheduler$2.runGuarded(Scheduler.java:6)
       at com.facebook.react.bridge.GuardedRunnable.run(GuardedRunnable.java)
       at android.os.Handler.handleCallback(Handler.java:942)
       at android.os.Handler.dispatchMessage(Handler.java:99)
       at android.os.Looper.loopOnce(Looper.java:226)
       at android.os.Looper.loop(Looper.java:313)
       at android.app.ActivityThread.main(ActivityThread.java:8741)
       at java.lang.reflect.Method.invoke(Method.java)
       at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:571)
       at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1067)

@kmagiera
Member Author

@Willham12 could be. any way of reproducing it? what version do you run?

@Willham12

> @Willham12 could be. any way of reproducing it? what version do you run?

https://github.com/software-mansion/react-native-reanimated/releases/tag/3.0.0-rc.8

IDK, I don't know exactly where it happened. I just have these weird logs in Crashlytics.

@Willham12

At the same time I also had this error:

Crashed: com.apple.main-thread
0  ???                            0x1029753b0 (Missing)
1  MyApp                          0x1d044 std::__1::function<double (objc_object*)>::function(std::__1::function<double (objc_object*)> const&) + 440 (function.h:440)
2  MyApp                          0x7c5ce4 std::__1::vector<std::__1::function<void (double)>, std::__1::allocator<std::__1::function<void (double)> > >::vector(std::__1::vector<std::__1::function<void (double)>, std::__1::allocator<std::__1::function<void (double)> > > const&) + 749 (memory:749)
3  MyApp                          0x7c16e0 reanimated::NativeReanimatedModule::onRender(double) + 796 (vector:796)
4  MyApp                          0x7bb7cc invocation function for block in reanimated::createReanimatedModule(RCTBridge*, std::__1::shared_ptr<facebook::react::CallInvoker>)::$_5::operator()(std::__1::function<void (double)>, facebook::jsi::Runtime&) const + 220 (NativeProxy.mm:220)
5  MyApp                          0x7cf0c4 -[REANodesManager onAnimationFrame:] + 232 (REANodesManager.mm:232)
6  QuartzCore                     0x28f9c CA::Display::DisplayLink::dispatch_items(unsigned long long, unsigned long long, unsigned long long) + 756
7  QuartzCore                     0x155edc CA::Display::DisplayLink::dispatch_deferred_display_links(unsigned int) + 380
8  UIKitCore                      0x652740 _UIUpdateSequenceRun + 84
9  UIKitCore                      0xc99fd0 schedulerStepScheduledMainSection + 172
10 UIKitCore                      0xc9919c runloopSourceCallback + 92
11 CoreFoundation                 0xd5f54 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 28
12 CoreFoundation                 0xe232c __CFRunLoopDoSource0 + 176
13 CoreFoundation                 0x66210 __CFRunLoopDoSources0 + 244
14 CoreFoundation                 0x7bba8 __CFRunLoopRun + 836
15 CoreFoundation                 0x80ed4 CFRunLoopRunSpecific + 612
16 GraphicsServices               0x1368 GSEventRunModal + 164
17 UIKitCore                      0x3a23d0 -[UIApplication _run] + 888
18 UIKitCore                      0x3a2034 UIApplicationMain + 340
19 MyApp                          0x75d0 main + 8 (main.m:8)
20 ???                            0x1c4af4960 (Missing)

@kmagiera
Member Author

Yeah, this looks very much like something caused by the bug this PR fixes

@Willham12

Another crash maybe related to this issue:

Crashed: com.apple.main-thread
0  MyApp                          0x1d03c std::__1::function<double (objc_object*)>::function(std::__1::function<double (objc_object*)> const&) + 440 (function.h:440)
1  MyApp                          0x7c5ce4 std::__1::vector<std::__1::function<void (double)>, std::__1::allocator<std::__1::function<void (double)> > >::vector(std::__1::vector<std::__1::function<void (double)>, std::__1::allocator<std::__1::function<void (double)> > > const&) + 749 (memory:749)
2  MyApp                          0x7c16e0 reanimated::NativeReanimatedModule::onRender(double) + 796 (vector:796)
3  MyApp                          0x7bb7cc invocation function for block in reanimated::createReanimatedModule(RCTBridge*, std::__1::shared_ptr<facebook::react::CallInvoker>)::$_5::operator()(std::__1::function<void (double)>, facebook::jsi::Runtime&) const + 220 (NativeProxy.mm:220)
4  MyApp                          0x7cf0c4 -[REANodesManager onAnimationFrame:] + 232 (REANodesManager.mm:232)
5  QuartzCore                     0x28f9c CA::Display::DisplayLink::dispatch_items(unsigned long long, unsigned long long, unsigned long long) + 756
6  QuartzCore                     0x3a8a4 display_timer_callback(__CFMachPort*, void*, long, void*) + 372
7  CoreFoundation                 0x7b820 __CFMachPortPerform + 176
8  CoreFoundation                 0x98d00 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE1_PERFORM_FUNCTION__ + 60
9  CoreFoundation                 0x9a908 __CFRunLoopDoSource1 + 520
10 CoreFoundation                 0x7c13c __CFRunLoopRun + 2264
11 CoreFoundation                 0x80ed4 CFRunLoopRunSpecific + 612
12 GraphicsServices               0x1368 GSEventRunModal + 164
13 UIKitCore                      0x3a23d0 -[UIApplication _run] + 888
14 UIKitCore                      0x3a2034 UIApplicationMain + 340
15 MyApp                          0x75d0 main + 8 (main.m:8)
16 ???                            0x1d1b9c960 (Missing)

kmagiera and others added 2 commits December 15, 2022 12:41
This PR changes the way we report errors in development. Previously we'd
use RCTLog native module which would result in a relatively ugly red
screen displaying an error. In addition the stack trace wouldn't be
symbolicated so it was difficult to reason about the root cause of the
problem when the crash happened on the UI runtime.

With this change we provide a symbolicated version of the trace to the
ErrorUtils module, which results in the crash being displayed the same
way as if it had occurred on the regular RN runtime.

The main motivation is to provide better guidance for developers about
the crashes on the UI JS runtime as well as to use a familiar UI for
displaying those.

In a nutshell, the method we use relies on an extended version of "eval"
for processing the JavaScript code when loading it on the UI runtime. We
now expose `evalWithSourceUrl`, which besides the code also accepts a
source URL that is then included in the traces, and `evalWithSourceMap`
(Hermes only), which accepts a JSON-encoded source map that the
JavaScript engine then uses to symbolicate the stack trace.

For JS engines that do not support source maps (JSC), we instead
provide the worklet hash as part of the source URL. This lets us
recognize which worklet a given stack entry comes from and map the
reported line within that worklet to a line of the whole JS bundle. To
do so, we now generate an "error" object at the place where the worklet
is generated, so that we can get its position in the JavaScript bundle.
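The remapping described above can be sketched roughly as follows. This is an assumption-laden illustration, not the code from this PR: `WorkletOrigin`, `remapWorkletStack`, the `origins` lookup table, the example bundle URL, and the convention that worklet line 1 corresponds to `startLine` in the bundle are all hypothetical. Only the "worklet_<hash>:<line>:<column>" entry format comes from the text.

```typescript
// Hypothetical record of where a worklet starts inside the JS bundle.
interface WorkletOrigin {
  bundleUrl: string;
  startLine: number; // assumed: 1-based bundle line of the worklet's first line
}

function remapWorkletStack(
  stack: string,
  origins: Record<string, WorkletOrigin | undefined>,
): string {
  // Matches entries such as "worklet_23746:16:2" (hash, line, column).
  return stack.replace(
    /worklet_(\d+):(\d+):(\d+)/g,
    (entry: string, hash: string, line: string, column: string) => {
      const origin = origins[hash];
      if (origin === undefined) {
        return entry; // unknown worklet: leave the entry untouched
      }
      // Worklet line 1 corresponds to startLine in the bundle.
      const bundleLine = origin.startLine + Number(line) - 1;
      // Simplification: the column is passed through unchanged.
      return `${origin.bundleUrl}:${bundleLine}:${column}`;
    },
  );
}

const origins: Record<string, WorkletOrigin | undefined> = {
  '23746': { bundleUrl: 'http://localhost:8081/index.bundle', startLine: 1200 },
};
const remapped = remapWorkletStack(
  'Error: Randomly crashing\n    at crashRandomly (worklet_23746:16:2)',
  origins,
);
```

In this sketch the start line would come from the position recorded by the "error" object created where the worklet is generated; how that position is actually extracted is not shown here.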

Below is a summary of the changes this PR makes:
1) Changes in the plugin focus on removing location metadata (we now use
a string in the format "worklet_7263", where the number is the worklet's
hash), adding source maps in the form of a JSON-encoded string, and
adding a new error object to the worklet object, which is used to remap
a worklet line number into a bundle line number.
2) For the latter, we added logic that rewrites entries in the provided
error's stack trace, replacing e.g. "worklet_23746:16:2" with the bundle
URL along with the remapped line numbers.
3) We now route all JS calls via a new method, "guardCall", that adds a
catch statement and passes the exception back to the main JS runtime,
where we use the ErrorUtils module to trigger React Native's default
LogBox.
4) We extend the Hermes runtime by exposing the `evalWithSourceMap`
method; this method is only added for debug builds and only on Hermes.
5) On other runtimes we register an additional global method,
`evalWithSourceUrl`, that makes it possible to provide a URL along with
the code that needs to be evaluated.
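As a rough illustration of item 3, a guarded call wrapper might look like the sketch below. The name `guardCall` comes from the description above, but its signature and everything else here are assumptions: `makeGuardCall` and `reportError` are hypothetical, and the reporter simply records the message where the real implementation would pass the exception to the main JS runtime.

```typescript
// Hypothetical reporter standing in for "pass the exception to the main JS runtime".
type ErrorReporter = (error: Error) => void;

function makeGuardCall(reportError: ErrorReporter) {
  // Runs fn with args; a thrown exception is reported instead of propagating.
  return function guardCall<Args extends unknown[], R>(
    fn: (...args: Args) => R,
    ...args: Args
  ): R | undefined {
    try {
      return fn(...args);
    } catch (e) {
      // Normalize non-Error throwables so the reporter always gets an Error.
      reportError(e instanceof Error ? e : new Error(String(e)));
      return undefined;
    }
  };
}

const reported: string[] = [];
const guardCall = makeGuardCall((error) => {
  reported.push(error.message);
});

// A call that succeeds returns its result as usual.
const sum = guardCall((a: number, b: number) => a + b, 2, 3);

// A call that throws is caught, reported, and yields undefined.
const failed = guardCall(() => {
  throw new Error('Randomly crashing');
});
```

Routing every call through one wrapper keeps the error path in a single place, which is what makes it possible to stop relying on the requestAnimationFrame code path for error handling.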

Run the method "something" from the following snippet and expect an
error that results in a redbox being displayed. Note that the presented
stack trace should have correct line numbers. See the expected result on
iOS under the snippet. On Android the errors aren't as nicely formatted
but should contain valid trace entries. This needs to be tested on both
JSC and Hermes configurations.

```
import { runOnUI } from 'react-native-reanimated';

function makeWorklet() {
  return () => {
    'worklet';
    throw new Error('Randomly crashing');
  };
}

const crashRandomly = makeWorklet();

function anotherWorklet() {
  'worklet';
  crashRandomly();
}

function something() {
  runOnUI(() => {
    'worklet';
    anotherWorklet();
  })();
}
```

![simulator_screenshot_3CECE7D6-DA04-4661-93C3-C25EF4A19423](https://user-images.githubusercontent.com/726445/206585057-e3d74c5f-bd02-4e9b-a4ae-ea8cd6fbb709.png)

Co-authored-by: Tomek Zawadzki <tomasz.zawadzki@swmansion.com>
Co-authored-by: Krzysztof Piaskowy <krzysztof.piaskowy@swmansion.com>
Co-authored-by: Juliusz Wajgelt <49338439+jwajgelt@users.noreply.github.com>
@kmagiera kmagiera merged commit f8f7b2e into main Dec 15, 2022
@kmagiera kmagiera deleted the shareables_crash_fix branch December 15, 2022 13:33
fluiddot pushed a commit to wordpress-mobile/react-native-reanimated that referenced this pull request Jun 5, 2023