Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix segfault when passing DartOpaque through ffi boundaries #2259

Merged
merged 7 commits into from
Sep 11, 2024

Conversation

alexlapa
Copy link
Contributor

@alexlapa alexlapa commented Aug 28, 2024

Changes

So, ive encountered a segfault on Android devices after upgrading to frb v2.2.

F/libc    (26466): Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x7fffffffffffffff in tid 26639 (1.ui), pid 26466
(a_jason_example)
*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
Build fingerprint: 'Redmi/rosemary_eea/rosemary:13/TP1A.220624.014/V14.0.11.0.TKLEUXM:user/release-keys'
Revision: '0'
ABI: 'arm64'
Timestamp: 2024-08-28 04:17:38.660867792+0300
Process uptime: 7s
ZygotePid: 848
Cmdline: com.instrumentisto.medea_jason_example
pid: 26466, tid: 26639, name: 1.ui  >>> com.instrumentisto.medea_jason_example <<<
uid: 10320
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x7fffffffffffffff
    x0  7fffffffffffffff  x1  b400007859d20b00  x2  7fffffffffffffff  x3  0000000000000014
    x4  000000760000a431  x5  0000007859ce5200  x6  00000077fa0bb400  x7  00000077fa0bb000
    x8  7fffffffffffffff  x9  fffffffffffffff0  x10 0000000000000010  x11 0000000000000008
    x12 0000007859ce3830  x13 000000000781a8e0  x14 000000000781a4e0  x15 00000077ed6de080
    x16 0000000000000000  x17 b4000077fa075198  x18 00000077eaefe000  x19 000000785b4adddc
    x20 0000007606259759  x21 b4000077fa074a00  x22 0000007600008081  x23 00000076062596d9
    x24 0000007600008081  x25 00000077ed4fd000  x26 b4000077fa074a00  x27 0000007606592f70
    x28 0000000800000076  x29 00000077ed6de098
    lr  00000077b9459ae4  sp  00000077ed6ddfb0  pc  00000077b945f6ac  pst 0000000080001000
backtrace:
      #00 pc 000000000079a6ac
      /data/app/~~ffeMhpu5eN-Cb-D2BNeVnw==/com.instrumentisto.medea_jason_example-XtxdlZivlameAOYXlt7bTQ==/base.apk
      #01 pc 0000000000794ae0
      /data/app/~~ffeMhpu5eN-Cb-D2BNeVnw==/com.instrumentisto.medea_jason_example-XtxdlZivlameAOYXlt7bTQ==/base.apk
      #02 pc 000000000078ba5c
      /data/app/~~ffeMhpu5eN-Cb-D2BNeVnw==/com.instrumentisto.medea_jason_example-XtxdlZivlameAOYXlt7bTQ==/base.apk
      #03 pc 000000000078e818
      /data/app/~~ffeMhpu5eN-Cb-D2BNeVnw==/com.instrumentisto.medea_jason_example-XtxdlZivlameAOYXlt7bTQ==/base.apk
      #04 pc 000000000078e7ec
      /data/app/~~ffeMhpu5eN-Cb-D2BNeVnw==/com.instrumentisto.medea_jason_example-XtxdlZivlameAOYXlt7bTQ==/base.apk
      (dart_opaque_rust2dart_decode+12)
      #05 pc 00000000000081d4  [anon:dart-code]

So whats going on is if a pointer does not fit in u32, it is being changed to u32::max right here which i was able to confirm with additional logging:

decodeDartOpaque toInt call 12970367442122235088 => 9223372036854775807

Another thing is that for some reason most targets tend to create pointer with values that fit into u32 so it works fine on macOs on Intel silicon, Linux and android x86_64 thats probably why this problem was able to stay under the radar.

Thats also possible that the same exact problem might happen in some other places, since passing pointers as usize is a pretty common thing i believe, but with this fix all my tests pass and app is working properly. However there might be some other scenarios that im not aware of (perhaps when using other codecs?).

Checklist

  • An issue to be fixed by this PR is listed above.
  • New tests are added to ensure new features are working. Please refer to this page to see how to add a test.
  • ./frb_internal precommit --mode slow (or fast) is run (it internal runs code generator, does auto formatting, etc).
  • If this PR adds/changes features, documentations (in the ./website folder) are updated.
  • CI is passing. Please refer to this page to see how to solve a failed CI.

Remark for PR creator

  • ./frb_internal --help shows utilities for development.
  • If fzyzcjy does not reply for a few days, maybe he just did not see it, so please ping him.

Copy link

welcome bot commented Aug 28, 2024

Hi! Thanks for opening this pull request! 😄

@fzyzcjy
Copy link
Owner

fzyzcjy commented Aug 28, 2024

Great catch!

Btw, I wonder what platform are you using? Dart int is 64bit (Web is 32bit; but the pointers are also 32bit on web), so in non-web platforms the conversion seems should not convert to u32::max.

EDIT: I see the problem: It is truncated to 2^63 - 1. i.e. the raw pointer exceeds i64 (but within range of u64), and since Dart int is indeed i64, it is saturated.

(If you need I can provide some suggestions for e.g. what precommit subcommand to run to make CI happy)

@alexlapa
Copy link
Contributor Author

alexlapa commented Sep 2, 2024

(If you need I can provide some suggestions for e.g. what precommit subcommand to run to make CI happy)

Yeah, i've seen that precommit script but it wasnt working for me until i've rolled back local rust and flutter versions.

And i would like to correct myself - BigInt.toInt() is not a problem and it works as expected (0xffffffff.toInt() == -1), its this toUnsigned(64) that breaks pointers. So, i believe this fix can be reworked to continue using usize for pointers, so let me know if you think this is the right way.

Another thing that bothers me is that there is a pretty high possibility of a similar issue happening in some other cases, since passing pointers as usize is a pretty common thing, and all usizes are treated with toUnsigned(64) on the Dart side it seems. Maybe there is a same issue with the dco codec?

Other than that i believe its ready for review.

@alexlapa alexlapa marked this pull request as ready for review September 2, 2024 08:12
@fzyzcjy
Copy link
Owner

fzyzcjy commented Sep 2, 2024

Great job!

Yes, I agree it is the case when numbers is bigger than isize (still within usize range).

Another thing that bothers me is that there is a pretty high possibility of a similar issue happening in some other cases, since passing pointers as usize is a pretty common thing, and all usizes are treated with toUnsigned(64) on the Dart side it seems. Maybe there is a same issue with the dco codec?

That looks reasonable. To test cst & dco codec, you can (temporarily) set full_dep: true in config.

Btw, is it possible we somehow add some tests for this scenario? Then we can ensure it never regress in the future by using CI, and we will also auto test the DCO codec etc.

Another btw: It seems https://api.dart.dev/stable/3.5.2/dart-ffi/Pointer/address.html, i.e. a pointer is of type int (i64 on non-web). Thus curious what will happen when Dart sees an address like your case which is larger than maximum of i64...

Too busy and a bit tired now, thus will try to review this PR within a day :)

@alexlapa
Copy link
Contributor Author

alexlapa commented Sep 3, 2024

To test cst & dco codec, you can (temporarily) set full_dep: true in config.

So i've run some tests and returning DartOpaque from Rust to Dart works fine using dco even without this PR, so its not affected.

Btw, is it possible we somehow add some tests for this scenario?

Meh, i dont think there is an easy way to do this.

Running test on CI on a arm64-v8a Android will indirectly check for this issue, but thats not really a specific test. And that might be problematic since AFAIK there are some limitations of android emulator, i believe you cant run arm Android on x86 host since 27(?) API. Running that on a MacOS with Apple silicone might do the job but i never checked that.

Another option is implementing allocator(wrapping default one?) that will always return pointers that are bigger then i64::MAX. But im not really sure how to do that correctly.

Another option is not using MirTypePrimitive::Isize or MirTypePrimitive::Usize for this but maybe adding MirTypePrimitive::Ptr so we might add some tests for this type that will just check the hardcoded values. Like return u64::MAX from Rust and assert that its -1 in Dart. This way it would also make *const () a correct return for the Rust exported functions, which would be pretty cool.

Thus curious what will happen when Dart sees an address like your case which is larger than maximum of i64

So web is out of the question since its 32-bit. And io will be fine since u64 can be represented as i64. E.g: Rust will return u64::MAX, it will be translated to -1 in Dart, and translated back to the correct value when its back in Rust. As i previously mentioned it is toUnsigned(64).toInt() that is the problem:

0xffffffffffffffff.toInt() = -1
0xffffffffffffffff.b.toUnsigned(64) = 18446744073709551615
0xffffffffffffffff.b.toUnsigned(64).toInt() = 9223372036854775807

@fzyzcjy
Copy link
Owner

fzyzcjy commented Sep 3, 2024

So i've run some tests and returning DartOpaque from Rust to Dart works fine using dco even without this PR, so its not affected.

Great!

Meh, i dont think there is an easy way to do this.

I also think so... The environment is not trivial to create, and the unit tests may not cover wide enough.

it will be translated to -1 in Dart, and translated back to the correct value when its back in Rust

Yes, I guess it would be great for the doc to mention something about that (though surely optional)

(I will review the PR in detail probably tonight)

Copy link
Owner

@fzyzcjy fzyzcjy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, i believe this fix can be reworked to continue using usize for pointers, so let me know if you think this is the right way.

I guess that would have some benefits:

  • The address is unsigned, so usize looks intuitive
  • Other parts of the code often use usize (not isize) for addresses, so this aligns with those parts
  • Also, PR size is smaller, making the change clearer

Thus if you like to continue using usize, I am happy to choose that way!

@fzyzcjy
Copy link
Owner

fzyzcjy commented Sep 9, 2024

Btw, feel free to ping me (or use request a review button) when ready!

@fzyzcjy
Copy link
Owner

fzyzcjy commented Sep 10, 2024

Great! I will probably review within 24hours

Copy link
Owner

@fzyzcjy fzyzcjy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job! Only a nit (other 4 are pointers to point out specific line) and I am ready to merge

@fzyzcjy fzyzcjy merged commit 6fa3179 into fzyzcjy:master Sep 11, 2024
135 checks passed
Copy link

welcome bot commented Sep 11, 2024

Hi! Congrats on merging your first pull request! 🎉

@fzyzcjy
Copy link
Owner

fzyzcjy commented Sep 11, 2024

Great job!

@fzyzcjy
Copy link
Owner

fzyzcjy commented Sep 11, 2024

@all-contributors please add @alexlapa for code

Copy link
Contributor

@fzyzcjy

I've put up a pull request to add @alexlapa! 🎉

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants