-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
YogaNode Finalization is not thread safe #1271
Comments
Also, if you try to synchronize it by synchronizing it locally in each finalizer, it will not work. It has to be a global lock around all instances of YogaNodes or else it will still try to access the JNI finalization code at the same time and crash. |
That's a nasty issue. From my reading, finalizes runs on a dedicated GC thread, so it seems like any GC of Yoga nodes is unsafe. Apart from a global lock, I am first wondering that if The JNI bindings do already use the node context API to attach some Java specific information. So another option might be to keep information to allow redispatching destruction to the thread which allocated the node. Need to look closer at all the options here. Ideally we could avoid adding overhead. Also FYI @jkoritzinsky since this would impact the SafeHandle based finalizers in the C# port. |
Is there anything we can do on our side in the meantime? Do you have an idea of how long it would take to address/fix this and what priority it would be? |
QQ, does a lock on freeing a node here lock on all YogaNodes across all trees or does it just lock the one tree? |
I took a closer look at how memory is managed in the Java bindings, and would like to know more about how you are seeing concurrent modification happening on different threads. The Java bindings don’t offer a way to explicitly free a node, so the finalizer is always relied upon. A node on the Java side has references to both parent and children Java nodes, so if a Java Node is finalized, it is guaranteed that the native parent and children nodes also belong to dead Java nodes. Reading through your issue report again, are there multiple threads doing GC at once? So, multiple finalizers running on the same tree? Or are the edits happening some other way? If we know the entire connected tree is dead, we could probably make freeing an individual node thread safe by removing the steps to detach the node from the tree. I.e. YGNodeFree would just free() without interfering with any other connected nodes, and we know nothing will access the dangling pointers because everything connected is dead and being finalized. |
I can confirm this is definitely multiple threads GCing at once. These are some logs I printed out from System.err. These are as follows: ThreadId, mNativePointer, Root node's mNativePointer Starting Deletion: 9 139901460002576 139901460004720 I got the Root node's mNativePointer by doing this:
These logs start inside the if statement of the freeNatives function and FinishDeletion is at the end of the if statement. The logs above show that there are two threads (9 and 40) and two different nodes (139901460002576, 139901460001936) are being deleted which are from the same parent (139901460004720) In these logs we got lucky, but in crashes it fails. |
If you are able to locally validate a fix, would you be able to test the below patch? YOGA_EXPORT void YGNodeFree(const YGNodeRef node) {
- if (YGNodeRef owner = node->getOwner()) {
- owner->removeChild(node);
- node->setOwner(nullptr);
- }
-
- const uint32_t childCount = YGNodeGetChildCount(node);
- for (uint32_t i = 0; i < childCount; i++) {
- const YGNodeRef child = YGNodeGetChild(node, i);
- child->setOwner(nullptr);
- }
-
- node->clearChildren();
Event::publish<Event::NodeDeallocation>(node, {node->getConfig()});
delete node;
} That change cannot be landed as-is, but if it resolves the issue when using the finalizer there should be a straightforward path to a change which can be merged. |
This seems to work! I ran it a few thousand times and never got the flake to happen. |
Summary: Fixes facebook#1271 Java bindings for Yoga rely solely on garbage collection for memory management. Each Java `YogaNode` has references to its children and parent, along with an `YGNodeRef` which it owns. When the `YogaNode` is garbage collected, a finalizer is run to call `YGNodeFree` and free the underlying native Yoga Node. This may cause a use-after-free if finalizers are run from multiple threads. This is because `YGNodeFree` does more than just freeing, but instead also interacts with its parent and children nodes to detach itself, and remove any dangling pointers. If multiple threads run finalizers at once, one may traverse and try to mutate a node which another is freeing. This bit of behavior is counterintuitive from the API name anyway, so this diff makes a breaking change to `YGNodeFree` to instead mostly just be a `free()` (with a tracing step whose publishing should be thread safe). The existing behavior is exposed in an added function `YGNodeDetachAndFree()`. The majority of existing `YGNodeFree` usages will probably still work under the new behavior, but we don't have that many in fbsource (many instead call `YGNodeFreeRecursive` to free a whole tree at once). So for safety, I replaced existing usages of `YGNodeFree` outside of Yoga with `YGNodeDetachAndFree` to preserve existing behavior. JavaScript and C# bindings also use `YGNodeFree()`, and both allow explicit freeing. For now I switched both of their behavior to detachment, to avoid exposing the ability to managed languages to corrupt memory. Differential Revision: D45556206 fbshipit-source-id: 27a7c94ffea566e51b23cf3d7c48b94761624936
Summary: Fixes facebook/yoga#1271 Java bindings for Yoga rely solely on garbage collection for memory management. Each Java `YogaNode` has references to its children and parent, along with an `YGNodeRef` which it owns. When the `YogaNode` is garbage collected, a finalizer is run to call `YGNodeFree` and free the underlying native Yoga Node. This may cause a use-after-free if finalizers are run from multiple threads. This is because `YGNodeFree` does more than just freeing, but instead also interacts with its parent and children nodes to detach itself, and remove any dangling pointers. If multiple threads run finalizers at once, one may traverse and try to mutate a node which another is freeing. Because we know the entire connected tree is dead, there is no point to trying to remove dangling pointers, and this bit of behavior is counterintuitive from the API name anyway. This diff makes a breaking change to `YGNodeFree` to instead mostly just be a `free()` (with a tracing step whose publishing should be thread safe). The existing behavior is exposed in an added function `YGNodeDetachAndFree()`. The majority of existing `YGNodeFree` usages will probably still work under the new behavior, but we don't have that many in fbsource (many instead call `YGNodeFreeRecursive` to free a whole tree at once). So for safety, I replaced existing usages of `YGNodeFree` outside of Yoga with `YGNodeDetachAndFree` to preserve existing behavior. JavaScript and C# bindings also use `YGNodeFree()`, and both allow explicit freeing. For now I switched both of their behavior to detachment, to avoid exposing the ability to managed languages to corrupt memory. Differential Revision: D45556206 fbshipit-source-id: 0d1f52a4195c169c5ab43f6f417e22ce1f998d8c
Okay, I think the PR attached above should resolve the issue. |
Summary: Fixes facebook/yoga#1271 Java bindings for Yoga rely solely on garbage collection for memory management. Each Java `YogaNode` has references to its children and parent, along with an `YGNodeRef` which it owns. When the `YogaNode` is garbage collected, a finalizer is run to call `YGNodeFree` and free the underlying native Yoga Node. This may cause a use-after-free if finalizers are run from multiple threads. This is because `YGNodeFree` does more than just freeing, but instead also interacts with its parent and children nodes to detach itself, and remove any dangling pointers. If multiple threads run finalizers at once, one may traverse and try to mutate a node which another is freeing. Because we know the entire connected tree is dead, there is no point to trying to remove dangling pointers, and this bit of behavior is counterintuitive from the API name anyway. This diff makes a breaking change to `YGNodeFree` to instead mostly just be a `free()` (with a tracing step whose publishing should be thread safe). The existing behavior is exposed in an added function `YGNodeDetachAndFree()`. The majority of existing `YGNodeFree` usages will probably still work under the new behavior, but we don't have that many in fbsource (many instead call `YGNodeFreeRecursive` to free a whole tree at once). So for safety, I replaced existing usages of `YGNodeFree` outside of Yoga with `YGNodeDetachAndFree` to preserve existing behavior. JavaScript and C# bindings also use `YGNodeFree()`, and both allow explicit freeing. For now I switched both of their behavior to detachment, to avoid exposing the ability to managed languages to corrupt memory. Differential Revision: D45556206 fbshipit-source-id: a60954260157e4de8551d97c71aaadd0f203c2fb
Heads up if you were using the last PR that I made a change after talking to folks, to avoid changing behavior of YGNodeFree for safety, and to scope raw deallocation to just JNI. Should be the same for Java users as tested previously, but without changing any existing C API behavior. |
Fixed with 3b088c3 |
Hi @NickGerleman, thanks for fixing this issue. If I read the PR correctly, it looks like finalizer calls jni_YGNodeDeallocateJNI which calls YGNodeFree which has the old behavior (modifying the tree and free-ing itself). Did we also want to change jni_YGNodeDeallocateJNI to call into YGNodeDeallocate instead of YGNodeFree? |
Ugghh, yes, that was left behind in the switch from the one revision to the other. Let me make a new change to fix this. |
Summary: Fixes facebook/yoga#1271 Updating this glue was missed in D45556206 when moving from one revision to the other... Differential Revision: D45780647 fbshipit-source-id: a21ac3227ab79e659d377ba644374c4dfb13d66b
Summary: Pull Request resolved: facebook#37388 Fixes facebook/yoga#1271 Updating this glue was missed in D45556206 when moving from one revision to the other... Reviewed By: yungsters Differential Revision: D45780647 fbshipit-source-id: 60ea526d995d7a8096960a286628587e4c57c9f2
Summary: Pull Request resolved: facebook#37388 Fixes facebook/yoga#1271 Updating this glue was missed in D45556206 when moving from one revision to the other... Reviewed By: yungsters Differential Revision: D45780647 fbshipit-source-id: 81b0159b198ff25a4de31313cb505b9b602308a4
Summary: Pull Request resolved: #37388 Fixes facebook/yoga#1271 Updating this glue was missed in D45556206 when moving from one revision to the other... Reviewed By: yungsters Differential Revision: D45780647 fbshipit-source-id: 4ca64bc9971d3e4697990e73b618a3dc91df259b
Summary: X-link: facebook/react-native#37388 Fixes facebook/yoga#1271 Updating this glue was missed in D45556206 when moving from one revision to the other... Reviewed By: yungsters Differential Revision: D45780647 fbshipit-source-id: 4ca64bc9971d3e4697990e73b618a3dc91df259b
Thank you Nick! |
Report
Issues and Steps to Reproduce
Create many YogaNodes. Finalize them on separate threads. If two try to free at the same exact time, it will fail because the finalization code is not thread safe.
Expected Behavior
To remove nodes from the vector in a thread safe way to avoid a crash in the JNI.
Actual Behavior
It crashed during finalization.
Link to Code
The code that isn't thread safe in the JNI: https://github.com/facebook/yoga/blob/v1.19.0/yoga/Yoga.cpp#L249
The code that isn't thread safe in the Java library: https://github.com/facebook/yoga/blob/v1.19.0/java/com/facebook/yoga/YogaNodeJNIFinalizer.java#L28
We were thinking this might need to be synchronized per tree, so we'd need to get the root of the tree?
https://github.com/facebook/yoga/blob/v1.19.0/java/com/facebook/yoga/YogaNode.java#L42
The text was updated successfully, but these errors were encountered: