Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change FileSystemUtils to use dotnet nuget #3

Merged
merged 1 commit into from
Nov 23, 2021

Conversation

dellis1972
Copy link

No description provided.

@grendello grendello merged commit 6020be5 into grendello:dotnet-nuget Nov 23, 2021
@dellis1972 dellis1972 deleted the dotnet-nuget branch November 23, 2021 14:25
grendello pushed a commit that referenced this pull request Jan 19, 2022
Commit 6eb11f1 added support for API-32, while keeping the .NET 6
default `$(TargetFramework)` value as `net6.0-android31.0`:

> However, we don't want to change the default API level for .NET 6
> projects; the default will remain `net6.0-android31.0` (API-31),

This appears to have had some unforeseen complications: we would use
the API-31 `Mono.Android.dll`, with the API-32 `libmonodroid.so`/etc.
runtime libraries.  This in turn appears to be responsible for some
crashes we've seen on CI ever since commit c227042 when running the
`Mono.Android.NET_Tests` unit tests under .NET 6 with the interpreter
enabled, because `libxamarin-app.so` and `libmonodroid.so` have ABI
dependencies:

	DOTNET  : JNI_OnLoad: JNI_OnLoad in pal_jni.c
	libc    : Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x2 in tid 3666 (droid.NET_Tests), pid 3666 (droid.NET_Tests)
	crash_dump64: performing dump of process 3666 (target tid = 3666)
	DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
	DEBUG   : Build fingerprint: 'Android/sdk_phone_x86_64/generic_x86_64:10/QPP6.190730.005.B1/5775370:userdebug/test-keys'
	DEBUG   : Revision: '0'
	DEBUG   : ABI: 'x86_64'
	DEBUG   : Timestamp: 2022-01-18 16:53:04+0000
	DEBUG   : pid: 3666, tid: 3666, name: droid.NET_Tests  >>> Mono.Android.NET_Tests <<<
	DEBUG   : uid: 10105
	DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x2
	DEBUG   : Cause: null pointer dereference
	DEBUG   :     rax 0000000000002b36  rbx 000078c8afb3f860  rcx 000078c98b6561f0  rdx 0000000000000000
	DEBUG   :     r8  0000000000000002  r9  0000000000000080  r10 000078c98b296080  r11 000078c987d35178
	DEBUG   :     r12 00007fffe46ae99c  r13 000078c8afb89ea0  r14 000078c8afb3f990  r15 000078c98b6746c0
	DEBUG   :     rdi 000078c8afb3f860  rsi 0000000000000002
	DEBUG   :     rbp 0000000000000001  rsp 00007fffe46ae448  rip 000078c8afb1b31c
	main    : type=1400 audit(0.0:40): avc: granted { read } for name="u:object_r:net_dns_prop:s0" dev="tmpfs" ino=6642 scontext=u:r:untrusted_app_25:s0:c512,c768 tcontext=u:object_r:net_dns_prop:s0 tclass=file app=Mono.Android.NET_Tests
	DEBUG   : 
	DEBUG   : backtrace:
	DEBUG   :       #00 pc 000000000002c31c  /data/app/Mono.Android.NET_Tests-fbdZV696v1UeW3jUzJg9yg==/lib/x86_64/libmonodroid.so (xamarin::android::Util::monodroid_store_package_name(char const*)+12) (BuildId: 91fe7d9c6b30356fcfb8337b8541d0132df4f44a)
	DEBUG   :       #1 pc 0000000000025bbc  /data/app/Mono.Android.NET_Tests-fbdZV696v1UeW3jUzJg9yg==/lib/x86_64/libmonodroid.so (xamarin::android::internal::MonodroidRuntime::Java_mono_android_Runtime_initInternal(_JNIEnv*, _jclass*, _jstring*, _jobjectArray*, _jstring*, _jobjectArray*, _jobject*, _jobjectArray*, int, unsigned char, unsigned char)+652) (BuildId: 91fe7d9c6b30356fcfb8337b8541d0132df4f44a)
	DEBUG   :       #2 pc 00000000000273fb  /data/app/Mono.Android.NET_Tests-fbdZV696v1UeW3jUzJg9yg==/lib/x86_64/libmonodroid.so (Java_mono_android_Runtime_initInternal+75) (BuildId: 91fe7d9c6b30356fcfb8337b8541d0132df4f44a)
	DEBUG   :       #3 pc 0000000000174641  /apex/com.android.runtime/lib64/libart.so (art_quick_generic_jni_trampoline+209) (BuildId: 8bb3225e7c408f2ca23abac3db0417f2)
	…

Thinking about it more, we should only need to use the API-31 "ref"
or "targeting" pack.  The "runtime" pack can just use the latest from
the workload.

To fix this:

 1. Create new `$(_AndroidTargetingPackId)` and
    `$(_AndroidTargetingPackVersion)` properties to use independently
    of the runtime pack version.

 2. Remove `Microsoft.Android.Runtime.31.[rid]` packs from the workload.

 3. Remove the `android-32` workload, as it should no longer be needed.
    The API 31 "ref" pack is fairly small and can go in the `android`
    workload.

Now our `android` workload is:

  * `Microsoft.Android.Sdk`
  * `Microsoft.Android.Ref.31`
  * `Microsoft.Android.Ref.32`
  * `Microsoft.Android.Runtime.32.android-arm`
  * `Microsoft.Android.Runtime.32.android-arm64`
  * `Microsoft.Android.Runtime.32.android-x86`
  * `Microsoft.Android.Runtime.32.android-x64`
  * `Microsoft.Android.Templates`

After these changes, I get this assembly at build time:

	dotnet\packs\Microsoft.Android.Ref.31\31.0.101-preview.11.117\ref\net6.0\Mono.Android.dll

And this assembly at runtime:

	dotnet\packs\Microsoft.Android.Runtime.32.android-arm64\31.0.200-preview.13.21\runtimes\android-arm64\lib\net6.0\Mono.Android.dll

Additionally, CI is fully green; in particular, the
**APKs .NET - macOS** step is green, which hasn't been true on
xamarin-android/main since commit c227042.
grendello pushed a commit that referenced this pull request Jan 28, 2022
…6672)

Context: dotnet/maui#4262
Context: dotnet#6675

If you run the `maui-blazor` template in a Release build:

	dotnet build -t:Run -c Release

it crashes at runtime:

	D monodroid-assembly: typemap: type with token 33555274 (0x200034a) in module {C7B4CC8F-7A03-4A3F-A34A-DC66EDC548B9} (Mono.Android) corresponds to Java type 'android/runtime/JavaProxyThrowable'
	…
	F DEBUG   : backtrace:
	F DEBUG   : #00 pc 000000000065d8fc  /apex/com.android.art/lib64/libart.so (void art::StackVisitor::WalkStack<(art::StackVisitor::CountTransitions)0>(bool)+156) (BuildId: 7fbaf2a1a3317bd634b00eb90e32291e)
	F DEBUG   : #1 pc 000000000069b25d  /apex/com.android.art/lib64/libart.so (art::Thread::GetCurrentMethod(unsigned int*, bool, bool) const+157) (BuildId: 7fbaf2a1a3317bd634b00eb90e32291e)
	F DEBUG   : #2 pc 0000000000430fed  /apex/com.android.art/lib64/libart.so (art::JNI<false>::FindClass(_JNIEnv*, char const*)+765) (BuildId: 7fbaf2a1a3317bd634b00eb90e32291e)
	F DEBUG   : #3 pc 0000000000047e5a  /data/app/~~0Qm6D1S0sO3f1lwfakN0PA==/com.companyname.mauiapp2-08UokVCH5k_PlbZEH_hhkA==/split_config.x86_64.apk!libmono-android.release.so (offset 0x11e000) (java_interop_jnienv_find_class+26) (BuildId: 3d04f8b946590175e97b89aee2e3b19ceed4b524)
	F DEBUG   : dotnet#4 pc 00000000000128ac  <anonymous:41640000>

The crash can be avoided by disabling the linker:

	dotnet build -t:Run -c Release -p:AndroidLinkMode=None
	# -or-
	dotnet build -t:Run -c Release -p:PublishTrimmed=false

However, let us return to the crash: *why* is it crashing?
This isn't a "good debugging experience"; we have no useful context.

Lots of investigation later -- all hail printf debugging -- and we
found that the cause of the crash was an unhandled exception:

 1. `Mono.Android.dll` has it's Java Callable Wrappers generated
    from the *unlinked* assembly, into `mono.android.jar` and
    `mono.android.dex` files.  The Java Callable Wrapper for
    `Android.Runtime.InputStreamAdapter` thus includes *all*
    `Read()` method overloads.

 2. When the app is built in Release configuration, linking is
    enabled, and *some* of the `InputStreamAdapter.Read()` methods
    are removed by the linker, along with the
    `Java.IO.InputStream.Read()` methods that were overridden.

 3. At runtime, we perform [Java Type Registration][0] for the
    `Android.Runtime.InputStreamAdapter` type, which eventually calls
    `AndroidTypeManager.RegisterNativeMembers()`, which eventually
    attempts to *effectively* do:

        Delegate.CreateDelegate (
	        typeof(Func<Delegate>),
	        typeof(InputStreamAdapter),
	        "GetReadHandler");

 4. Because of (2), `Java.IO.InputStream.GetReadHandler()`
    *does not exist*, and thus `Delegate.CreateDelegate()` throws an
    `ArgumentException`.

So far, so reasonable, but…

 5. `AndroidTypeManager.RegisterNativeMembers()` didn't catch any
    exceptions, nor did any other method between the original Java
    `Runtime.register()` invocation and
    `AndroidTypeManager.RegisterNativeMembers()`.  The result is that
    a C# exception was "in flight", and Mono then proceeded to
    *tear down the stack frame* as it unwound the callstack looking
    for `catch` handlers.

At this point, the process is toast: the runtime stack is FUBAR.

This is also why the `backtrace:` is "rooted" in
`JNIEnv::FindClass()`: `JNIEnv::FindClass()` invokes Java static
constructors before returning, which is how the static constructor in
the Java Callable Wrapper for `InputStreamAdapter` called
`Runtime.register()` in the first place.

All of this makes for a miserable debugging experience.

Fixing the "original" linker issue will be done in
dotnet#6675.

This hasn't been an issue in "Classic" Xamarin.Android, presumably
because the classic linker isn't as good as the net6 linker.

What we want to do *here* is improve this debugging experience, by
"wrapping" `AndroidTypeManager.RegisterNativeMembers()` in a
`try`/`catch` block, which can then *marshal the thrown exception*
back to Java.  This *prevents* Mono from unwinding the callstack past
a JNI boundary, and avoids the annoying-to-debug app crash.

After this change, we get a much friendlier unhandled exception crash:

	I MonoDroid: Android.Runtime.JavaProxyThrowable: Exception_WasThrown, Android.Runtime.JavaProxyThrowable
	I MonoDroid:
	I MonoDroid:   --- End of managed Android.Runtime.JavaProxyThrowable stack trace ---
	I MonoDroid: android.runtime.JavaProxyThrowable: System.ArgumentException: Arg_DlgtTargMeth
	I MonoDroid:    at System.Delegate.CreateDelegate(Type , Type , String , Boolean , Boolean )
	I MonoDroid:    at System.Delegate.CreateDelegate(Type , Type , String )
	I MonoDroid:    at Android.Runtime.AndroidTypeManager.RegisterNativeMembers(JniType , Type , String )
	I MonoDroid: --- End of stack trace from previous location ---
	I MonoDroid:    at Java.Interop.JniEnvironment.StaticMethods.CallStaticObjectMethod(JniObjectReference , JniMethodInfo , JniArgumentValue* )
	I MonoDroid:    at Android.Runtime.JNIEnv.CallStaticObjectMethod(IntPtr , IntPtr , JValue* )
	I MonoDroid:    at Android.Runtime.JNIEnv.CallStaticObjectMethod(IntPtr , IntPtr , JValue[] )
	I MonoDroid:    at Android.Runtime.JNIEnv.FindClass(String )
	I MonoDroid:    at Android.Runtime.JNIEnv.AllocObject(String )
	I MonoDroid:    at Android.Runtime.JNIEnv.StartCreateInstance(String , String , JValue* )
	I MonoDroid:    at Android.Runtime.JNIEnv.StartCreateInstance(String , String , JValue[] )
	I MonoDroid:    at Android.Runtime.InputStreamAdapter..ctor(Stream )
	I MonoDroid:    at Android.Runtime.InputStreamAdapter.ToLocalJniHandle(Stream )
	I MonoDroid:    at Android.Webkit.WebResourceResponse..ctor(String , String , Int32 , String , IDictionary`2 , Stream )
	I MonoDroid:    at Microsoft.AspNetCore.Components.WebView.Maui.WebKitWebViewClient.ShouldInterceptRequest(WebView view, IWebResourceRequest request)
	I MonoDroid:    at Android.Webkit.WebViewClient.n_ShouldInterceptRequest_Landroid_webkit_WebView_Landroid_webkit_WebResourceRequest_(IntPtr , IntPtr , IntPtr , IntPtr )
	I MonoDroid: 	at crc64d693e2d9159537db.WebKitWebViewClient.n_shouldInterceptRequest(Native Method)
	I MonoDroid: 	at crc64d693e2d9159537db.WebKitWebViewClient.shouldInterceptRequest(WebKitWebViewClient.java:39)
	I MonoDroid: 	at Rr.a(chromium-TrichromeWebViewGoogle.apk-stable-410410686:16)
	I MonoDroid: 	at org.chromium.android_webview.AwContentsBackgroundThreadClient.shouldInterceptRequestFromNative(chromium-TrichromeWebViewGoogle.apk-stable-410410686:2)
	I MonoDroid:
	I MonoDroid:   --- End of managed Android.Runtime.JavaProxyThrowable stack trace ---
	I MonoDroid: android.runtime.JavaProxyThrowable: System.ArgumentException: Arg_DlgtTargMeth
	I MonoDroid:    at System.Delegate.CreateDelegate(Type , Type , String , Boolean , Boolean )
	I MonoDroid:    at System.Delegate.CreateDelegate(Type , Type , String )
	I MonoDroid:    at Android.Runtime.AndroidTypeManager.RegisterNativeMembers(JniType , Type , String )
	I MonoDroid: --- End of stack trace from previous location ---
	I MonoDroid:    at Java.Interop.JniEnvironment.StaticMethods.CallStaticObjectMethod(JniObjectReference , JniMethodInfo , JniArgumentValue* )
	I MonoDroid:    at Android.Runtime.JNIEnv.CallStaticObjectMethod(IntPtr , IntPtr , JValue* )
	I MonoDroid:    at Android.Runtime.JNIEnv.CallStaticObjectMethod(IntPtr , IntPtr , JValue[] )
	I MonoDroid:    at Android.Runtime.JNIEnv.FindClass(String )
	I MonoDroid:    at Android.Runtime.JNIEnv.AllocObject(String )
	I MonoDroid:    at Android.Runtime.JNIEnv.StartCreateInstance(String , String , JValue* )
	I MonoDroid:    at Android.Runtime.JNIEnv.StartCreateInstance(String , String , JValue[] )
	I MonoDroid:    at Android.Runtime.InputStreamAdapter..ctor(Stream )
	I MonoDroid:    at Android.Runtime.InputStreamAdapter.ToLocalJniHandle(Stream )
	I MonoDroid:    at Android.Webkit.WebResourceResponse..ctor(String , String , Int32 , String , IDictionary`2 , Stream )
	I MonoDroid:    at Microsoft.AspNetCore.Components.WebView.Maui.WebKitWebViewClient.ShouldInterceptRequest(WebView view, IWebResourceRequest request)
	I MonoDroid:    at Android.Webkit.WebViewClient.n_ShouldInterceptRequest_Landroid_webkit_WebView_Landroid_webkit_WebResourceRequest_(IntPtr , IntPtr , IntPtr , IntPtr )
	I MonoDroid: 	at crc64d693e2d9159537db.WebKitWebViewClient.n_shouldInterceptRequest(Native Method)
	I MonoDroid: 	at crc64d693e2d9159537db.WebKitWebViewClient.shouldInterceptRequest(WebKitWebViewClient.java:39)
	I MonoDroid: 	at Rr.a(chromium-TrichromeWebViewGoogle.apk-stable-410410686:16)
	I MonoDroid: 	at org.chromium.android_webview.AwContentsBackgroundThreadClient.shouldInterceptRequestFromNative(chromium-TrichromeWebViewGoogle.apk-stable-410410686:2)

This is much easier to reason about, and will save us time in
the future.

[0]: https://github.com/xamarin/xamarin-android/wiki/Blueprint#java-type-registration
grendello added a commit that referenced this pull request May 25, 2022
Context: dotnet/install-scripts#15
Context: https://dot.net/v1/dotnet-install.sh
Context: https://dot.net/v1/dotnet-install.ps1

We've been installing dotnet versions using the [`dotnet-install`][0]
scripts for Unix & Windows.  However, they do not cache the
downloaded archive, and therefore we end up re-downloading the same
archive over and over again.

Additionally, if one finds themselves without an internet connection,
there's no way to easily install the required version of dotnet.

The installation scripts don't provide a way to cache the payloads
and they appear to be in maintenance mode (dotnet/install-scripts#15),
so there doesn't appear to be a chance to add caching support to them.

Fortunately, we can "ask" the scripts what they're downloading:

	% curl -o dotnet-install.sh 'https://dot.net/v1/dotnet-install.sh'
	% ./dotnet-install.sh --version 7.0.100-preview.5.22273.1 --verbose --dry-run  \
	| grep 'dotnet-install: URL'

This returns a list of URLs, which may or may not exist:

	dotnet-install: URL #0 - primary: https://dotnetcli.azureedge.net/dotnet/Sdk/7.0.100-preview.5.22273.1/dotnet-sdk-7.0.100-preview.5.22273.1-osx-x64.tar.gz
	dotnet-install: URL #1 - legacy: https://dotnetcli.azureedge.net/dotnet/Sdk/7.0.100-preview.5.22273.1/dotnet-dev-osx-x64.7.0.100-preview.5.22273.1.tar.gz
	dotnet-install: URL #2 - primary: https://dotnetbuilds.azureedge.net/public/Sdk/7.0.100-preview.5.22273.1/dotnet-sdk-7.0.100-preview.5.22273.1-osx-x64.tar.gz
	dotnet-install: URL #3 - legacy: https://dotnetbuilds.azureedge.net/public/Sdk/7.0.100-preview.5.22273.1/dotnet-dev-osx-x64.7.0.100-preview.5.22273.1.tar.gz

We now parse this output, extract the URLs, then download and cache
the URL contents into `$(AndroidToolchainCacheDirectory)`.

When we need to install .NET, we just extract the cached archive
into the appropriate directory.

If no `dotnet-install: URL…` messages are generated, then we run
the `dotnet-install` script as we previously did.

This process lets us take a first step towards fully "offline" builds,
along with smaller downloads on CI servers.

[0]: https://docs.microsoft.com/en-us/dotnet/core/tools/dotnet-install-script
grendello added a commit that referenced this pull request Sep 15, 2022
    9-15 22:56:28.431 15070 15090 W monodroid: jclass java_interop_jnienv_find_class(JNIEnv *, jthrowable *, const char *) looking for 'android/content/Intent$ShortcutIconResource'
    --------- beginning of crash
    09-15 22:56:28.432 15070 15090 F libc    : Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0xbb60ca20 in tid 15090 (Instrumentation), pid 15070 (droid.NET_Tests)
    09-15 22:56:28.757 15103 15103 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
    09-15 22:56:28.757 15103 15103 F DEBUG   : Build fingerprint: 'google/raven/raven:13/TP1A.220624.021/8877034:user/release-keys'
    09-15 22:56:28.757 15103 15103 F DEBUG   : Revision: 'MP1.0'
    09-15 22:56:28.757 15103 15103 F DEBUG   : ABI: 'arm64'
    09-15 22:56:28.757 15103 15103 F DEBUG   : Timestamp: 2022-09-15 22:56:28.545823727+0200
    09-15 22:56:28.757 15103 15103 F DEBUG   : Process uptime: 3s
    09-15 22:56:28.757 15103 15103 F DEBUG   : Cmdline: Mono.Android.NET_Tests
    09-15 22:56:28.757 15103 15103 F DEBUG   : pid: 15070, tid: 15090, name: Instrumentation  >>> Mono.Android.NET_Tests <<<
    09-15 22:56:28.757 15103 15103 F DEBUG   : uid: 10638
    09-15 22:56:28.757 15103 15103 F DEBUG   : tagged_addr_ctrl: 0000000000000001 (PR_TAGGED_ADDR_ENABLE)
    09-15 22:56:28.757 15103 15103 F DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x00000000bb60ca20
    09-15 22:56:28.757 15103 15103 F DEBUG   :     x0  b400006e8ea08350  x1  0000000000000001  x2  0000000000000001  x3  0000000000000010
    09-15 22:56:28.757 15103 15103 F DEBUG   :     x4  0000006dbaea7233  x5  0000006deea682be  x6  0000000000000020  x7  0000000000000020
    09-15 22:56:28.757 15103 15103 F DEBUG   :     x8  0000000000000008  x9  00000000bb60c9e0  x10 0000006deea682a7  x11 0000006deea682a7
    09-15 22:56:28.757 15103 15103 F DEBUG   :     x12 3d2120746e696f70  x13 7274706c6c756e20  x14 0000006d3d90b998  x15 000000a74ab3008e
    09-15 22:56:28.757 15103 15103 F DEBUG   :     x16 0000006dbb60f6b8  x17 0000007074e7b9ac  x18 0000006d394cc000  x19 0000006d3d90b6d8
    09-15 22:56:28.757 15103 15103 F DEBUG   :     x20 b400006e8ea08350  x21 0000000000000001  x22 0000006d3d90e000  x23 0000006d3d90b6d8
    09-15 22:56:28.757 15103 15103 F DEBUG   :     x24 0000006d3d90b830  x25 0000006d3d90bb80  x26 0000006e7ea14cf0  x27 b400006e8ea08350
    09-15 22:56:28.757 15103 15103 F DEBUG   :     x28 0000000000000000  x29 0000006d3d90b5d0
    09-15 22:56:28.757 15103 15103 F DEBUG   :     lr  0000006dbb16f3dc  sp  0000006d3d90b5b0  pc  0000006dbb171d44  pst 0000000080001000
    09-15 22:56:28.757 15103 15103 F DEBUG   : backtrace:
    09-15 22:56:28.757 15103 15103 F DEBUG   :       #00 pc 0000000000371d44  /apex/com.android.art/lib64/libart.so (art::ArtMethod::PrettyMethod(bool)+76) (BuildId: 56e704c544e6c624201be2ab4933e853)
    09-15 22:56:28.757 15103 15103 F DEBUG   :       #1 pc 000000000036f3d8  /apex/com.android.art/lib64/libart.so (void art::StackVisitor::WalkStack<(art::StackVisitor::CountTransitions)0>(bool)+5464) (BuildId: 56e704c544e6c624201be2ab4933e853)
    09-15 22:56:28.757 15103 15103 F DEBUG   :       #2 pc 00000000005a0370  /apex/com.android.art/lib64/libart.so (art::JNI<true>::FindClass(_JNIEnv*, char const*)+480) (BuildId: 56e704c544e6c624201be2ab4933e853)
    09-15 22:56:28.757 15103 15103 F DEBUG   :       #3 pc 00000000005c8948  /apex/com.android.art/lib64/libart.so (art::(anonymous namespace)::CheckJNI::FindClass(_JNIEnv*, char const*) (.__uniq.99033978352804627313491551960229047428.llvm.5591279935177935698)+228) (BuildId: 56e704c544e6c624201be2ab4933e853)
    09-15 22:56:28.757 15103 15103 F DEBUG   :       dotnet#4 pc 000000000004de28  /data/app/~~aJedheWQLPfk1ulUOfKVyg==/Mono.Android.NET_Tests-XvAL5W7BvZDwkEbYfmLTIQ==/lib/arm64/libmonodroid.so (java_interop_jnienv_find_class+84) (BuildId: a92c56b31adcb233a4674b5eb523c0aaa67a811d)
    09-15 22:56:28.757 15103 15103 F DEBUG   :       dotnet#5 pc 000000000000b220  <anonymous:705fcb1000>
grendello added a commit that referenced this pull request Nov 22, 2022
It won't fix the failure, but extra logging might be useful at some
point. The current failure is:

    droid.NET_Test: java_vm_ext.cc:570] JNI DETECTED ERROR IN APPLICATION: use of invalid jobject 0x7100941bb830
    backtrace:
           #00 pc 00000000000943f8  /apex/com.android.runtime/lib64/bionic/libc.so (syscall+24) (BuildId: a08a19770d6696739c847e29c3f5f650)
           #1 pc 0000000000097146  /apex/com.android.runtime/lib64/bionic/libc.so (abort+182) (BuildId: a08a19770d6696739c847e29c3f5f650)
           #2 pc 000000000055321f  /apex/com.android.runtime/lib64/libart.so (art::Runtime::Abort(char const*)+2399) (BuildId: 8bb3225e7c408f2ca23abac3db0417f2)
           #3 pc 000000000000c873  /system/lib64/libbase.so (android::base::LogMessage::~LogMessage()+611) (BuildId: 40d2b536dbf0730fdc31abd2b469f94f)
           dotnet#4 pc 00000000003ede64  /apex/com.android.runtime/lib64/libart.so (art::JavaVMExt::JniAbort(char const*, char const*)+1604) (BuildId: 8bb3225e7c408f2ca23abac3db0417f2)
           dotnet#5 pc 00000000003ee16a  /apex/com.android.runtime/lib64/libart.so (art::JavaVMExt::JniAbortF(char const*, char const*, ...)+234) (BuildId: 8bb3225e7c408f2ca23abac3db0417f2)
           dotnet#6 pc 00000000005adf7b  /apex/com.android.runtime/lib64/libart.so (art::Thread::DecodeJObject(_jobject*) const+875) (BuildId: 8bb3225e7c408f2ca23abac3db0417f2)
           dotnet#7 pc 00000000003def7b  /apex/com.android.runtime/lib64/libart.so (art::(anonymous namespace)::ScopedCheck::CheckInstance(art::ScopedObjectAccess&, art::(anonymous namespace)::ScopedCheck::InstanceKind, _jobject*, bool)+91) (BuildId: 8bb3225e7c408f2ca23abac3db0417f2)
           dotnet#8 pc 00000000003de1e5  /apex/com.android.runtime/lib64/libart.so (art::(anonymous namespace)::ScopedCheck::CheckPossibleHeapValue(art::ScopedObjectAccess&, char, art::(anonymous namespace)::JniValueType)+485) (BuildId: 8bb3225e7c408f2ca23abac3db0417f2)
           dotnet#9 pc 00000000003de2d4  /apex/com.android.runtime/lib64/libart.so (art::(anonymous namespace)::ScopedCheck::CheckPossibleHeapValue(art::ScopedObjectAccess&, char, art::(anonymous namespace)::JniValueType)+724) (BuildId: 8bb3225e7c408f2ca23abac3db0417f2)
           dotnet#10 pc 00000000003dd712  /apex/com.android.runtime/lib64/libart.so (art::(anonymous namespace)::ScopedCheck::Check(art::ScopedObjectAccess&, bool, char const*, art::(anonymous namespace)::JniValueType*)+690) (BuildId: 8bb3225e7c408f2ca23abac3db0417f2)
           dotnet#11 pc 00000000003e28c0  /apex/com.android.runtime/lib64/libart.so (art::(anonymous namespace)::CheckJNI::CheckCallArgs(art::ScopedObjectAccess&, art::(anonymous namespace)::ScopedCheck&, _JNIEnv*, _jobject*, _jclass*, _jmethodID*, art::InvokeType, art::(anonymous namespace)::VarArgs const*)+160) (BuildId: 8bb3225e7c408f2ca23abac3db0417f2)
           dotnet#12 pc 00000000003e1a9e  /apex/com.android.runtime/lib64/libart.so (art::(anonymous namespace)::CheckJNI::CallMethodV(char const*, _JNIEnv*, _jobject*, _jclass*, _jmethodID*, __va_list_tag*, art::Primitive::Type, art::InvokeType)+910) (BuildId: 8bb3225e7c408f2ca23abac3db0417f2)
           dotnet#13 pc 00000000003cf551  /apex/com.android.runtime/lib64/libart.so (art::(anonymous namespace)::CheckJNI::CallObjectMethod(_JNIEnv*, _jobject*, _jmethodID*, ...)+177) (BuildId: 8bb3225e7c408f2ca23abac3db0417f2)
           dotnet#14 pc 00000000000013ec  /data/app/Mono.Android.NET_Tests--O9vgexkYeCx3nX-AuvLTQ==/split_config.x86_64.apk!libreuse-threads.so (offset 0xa8d000) (BuildId: 562d86d81ebdd3bb6b7528e2a9235ff84827294e)
           dotnet#15 pc 0000000000100fce  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+30) (BuildId: a08a19770d6696739c847e29c3f5f650)
           dotnet#16 pc 0000000000098fe7  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+55) (BuildId: a08a19770d6696739c847e29c3f5f650)
grendello added a commit that referenced this pull request Jan 18, 2023
BackgroundProcessManager will probably go away, too clunky.
ProcessRunner2 should provide a much better way to do it.
grendello added a commit that referenced this pull request Jan 27, 2023
…otnet#7732)

Fixes: dotnet#7335

Context: d236af5

Commit d236af5 introduced embedded assembly compression, using LZ4,
which speeds up startup and reduces final package size.

Assemblies are compressed at build time and, at the same time, pre-
allocated buffers for the **decompressed** data are allocated in
`libxamarin-app.so`.  The buffers are then passed to the LZ4 APIs,
all threads using the same output buffer.  The assumption was that we
can do fine without locking as even if overlapped decompression
happens, the output data will be the same and so even if two threads
do the same thing at the same time, the data will be valid at all
times, so long as at least one thread completes the decompression.

This assumption proved to be **largely** true, but it appears that in
high concurrency cases it is possible that the data in the
decompression buffer differs.  This can result in app crashes:

	A/libc: Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 3092 (.NET ThreadPool), pid 2727 (myapp.name)
	A/DEBUG: pid: 2727, tid: 3092, name: .NET ThreadPool  >>> myapp.name <<<
	A/DEBUG:       #1 pc 0000000000029b1c  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmono-android.release.so (offset 0x103d000) (xamarin::android::internal::MonodroidRuntime::mono_log_handler(char const*, char const*, char const*, int, void*)+144) (BuildId: 29c5a3805a0bedee1eede9b6668d7c676aa63371)
	A/DEBUG:       #2 pc 00000000002680bc  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	A/DEBUG:       #3 pc 00000000002681e8  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	A/DEBUG:       dotnet#4 pc 000000000008555c  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (mono_metadata_string_heap+188) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	…

My guess is that LZ4 either uses the output buffer as a scratchpad
area when decompressing or that it initializes/modifies the buffer
before writing actual data in it.  With overlapped decompression, it
may lead to one thread overwriting valid data previously written by
another thread, so that when the latter returns the buffer it thought
to have had valid data may contain certain bytes temporarily
overwritten by the decompression session in the other, still running,
thread.  It may happen that MonoVM reads the corrupted data just when
it is still invalid (before the still running decompression session
actually writes the valid data), a classic race condition.

To fix this, the decompression block is now protected with a startup-
aware mutex.  Mutex will be held only after the initial startup phase
is completed, so there should not be much loss of startup performance.
grendello added a commit that referenced this pull request Feb 21, 2023
…otnet#7732)

Fixes: dotnet#7335

Context: d236af5

Commit d236af5 introduced embedded assembly compression, using LZ4,
which speeds up startup and reduces final package size.

Assemblies are compressed at build time and, at the same time, pre-
allocated buffers for the **decompressed** data are allocated in
`libxamarin-app.so`.  The buffers are then passed to the LZ4 APIs,
all threads using the same output buffer.  The assumption was that we
can do fine without locking as even if overlapped decompression
happens, the output data will be the same and so even if two threads
do the same thing at the same time, the data will be valid at all
times, so long as at least one thread completes the decompression.

This assumption proved to be **largely** true, but it appears that in
high concurrency cases it is possible that the data in the
decompression buffer differs.  This can result in app crashes:

	A/libc: Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 3092 (.NET ThreadPool), pid 2727 (myapp.name)
	A/DEBUG: pid: 2727, tid: 3092, name: .NET ThreadPool  >>> myapp.name <<<
	A/DEBUG:       #1 pc 0000000000029b1c  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmono-android.release.so (offset 0x103d000) (xamarin::android::internal::MonodroidRuntime::mono_log_handler(char const*, char const*, char const*, int, void*)+144) (BuildId: 29c5a3805a0bedee1eede9b6668d7c676aa63371)
	A/DEBUG:       #2 pc 00000000002680bc  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	A/DEBUG:       #3 pc 00000000002681e8  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	A/DEBUG:       dotnet#4 pc 000000000008555c  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (mono_metadata_string_heap+188) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	…

My guess is that LZ4 either uses the output buffer as a scratchpad
area when decompressing or that it initializes/modifies the buffer
before writing actual data in it.  With overlapped decompression, it
may lead to one thread overwriting valid data previously written by
another thread, so that when the latter returns the buffer it thought
to have had valid data may contain certain bytes temporarily
overwritten by the decompression session in the other, still running,
thread.  It may happen that MonoVM reads the corrupted data just when
it is still invalid (before the still running decompression session
actually writes the valid data), a classic race condition.

To fix this, the decompression block is now protected with a startup-
aware mutex.  Mutex will be held only after the initial startup phase
is completed, so there should not be much loss of startup performance.
grendello added a commit that referenced this pull request Jul 13, 2023
)

Context: 929e701
Context: ce2bc68
Context: dotnet#7473
Context: dotnet#8155

The managed linker can produce assemblies optimized for the target
`$(RuntimeIdentifier)` (RID), which means that they will differ
between different RIDs.  Our "favorite" example of this is
`IntPtr.Size`, which is inlined by the linker into `4` or `8` when
targeting 32-bit or 64-bit platforms.  (See also dotnet#7473 and 929e701.)

Another platform difference may come in the shape of CPU intrinsics
which will change the JIT-generated native code in ways that will
crash the application if the assembler instructions generated for the
intrinsics aren't supported by the underlying processor.

In addition, the per-RID assemblies will have different [MVID][0]s
and **may** have different type and method metadata token IDs, which
is important because typemaps *use* type and metadata token IDs; see
also ce2bc68.

All of this taken together invalidates our previous assumption that
all the managed assemblies are identical.  "Simply" using
`IntPtr.Size` in an assembly that contains `Java.Lang.Object`
subclasses will break things.

This in turn could cause "mysterious" behavior or crashes in Release
applications; see also Issue dotnet#8155.

Prevent the potential problems by processing each per-RID assembly
separately and output correct per-RID LLVM IR assembly using the
appropriate per-RID information.

Additionally, during testing I found that for our use of Cecil within
`<GenerateJavaStubs/>` doesn't consistently remove the fields,
delegates, and methods we remove in `MarshalMethodsAssemblyRewriter`
when marshal methods are enabled, or it generates subtly broken
assemblies which cause **some** applications to segfault at run time
like so:

	I monodroid-gc: 1 outstanding GREFs. Performing a full GC!
	F libc    : Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x8 in tid 12379 (t6.helloandroid), pid 12379 (t6.helloandroid)
	F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
	F DEBUG   : Build fingerprint: 'google/raven_beta/raven:14/UPB3.230519.014/10284690:user/release-keys'
	F DEBUG   : Revision: 'MP1.0'
	F DEBUG   : ABI: 'arm64'
	F DEBUG   : Timestamp: 2023-07-04 22:09:58.762982002+0200
	F DEBUG   : Process uptime: 1s
	F DEBUG   : Cmdline: com.microsoft.net6.helloandroid
	F DEBUG   : pid: 12379, tid: 12379, name: t6.helloandroid  >>> com.microsoft.net6.helloandroid <<<
	F DEBUG   : uid: 10288
	F DEBUG   : tagged_addr_ctrl: 0000000000000001 (PR_TAGGED_ADDR_ENABLE)
	F DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0000000000000008
	F DEBUG   : Cause: null pointer dereference
	F DEBUG   :     x0  0000000000000000  x1  0000007ba1401af0  x2  00000000000000fa  x3  0000000000000001
	F DEBUG   :     x4  0000007ba1401b38  x5  0000007b9f2a8360  x6  0000000000000000  x7  0000000000000000
	F DEBUG   :     x8  ffffffffffc00000  x9  0000007b9f800000  x10 0000000000000000  x11 0000007ba1400000
	F DEBUG   :     x12 0000000000000000  x13 0000007ba374ad58  x14 0000000000000000  x15 00000013ead77d66
	F DEBUG   :     x16 0000007ba372f210  x17 0000007ebdaa4a80  x18 0000007edf612000  x19 000000000000001f
	F DEBUG   :     x20 0000000000000000  x21 0000007b9f2a8320  x22 0000007b9fb02000  x23 0000000000000018
	F DEBUG   :     x24 0000007ba374ad08  x25 0000000000000004  x26 0000007b9f2a4618  x27 0000000000000000
	F DEBUG   :     x28 ffffffffffffffff  x29 0000007fc592a780
	F DEBUG   :     lr  0000007ba3701f44  sp  0000007fc592a730  pc  0000007ba3701e0c  pst 0000000080001000
	F DEBUG   : 8 total frames
	F DEBUG   : backtrace:
	F DEBUG   :       #00 pc 00000000002d4e0c  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
	F DEBUG   :       #1 pc 00000000002c29e8  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
	F DEBUG   :       #2 pc 00000000002c34bc  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
	F DEBUG   :       #3 pc 00000000002c2254  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
	F DEBUG   :       dotnet#4 pc 00000000002be0bc  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
	F DEBUG   :       dotnet#5 pc 00000000002bf050  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
	F DEBUG   :       dotnet#6 pc 00000000002a53a4  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (mono_gc_collect+44) (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
	F DEBUG   :       dotnet#7 pc 000000000000513c  <anonymous:7ec716b000>

This is because we generate Java Callable Wrappers over a set of
original (linked or not) assemblies, then we scan them for classes
derived from `Java.Lang.Object` and use that set as input to the
marshal methods rewriter, which makes the changes (generates wrapper
methods, decorates wrapped methods with `[UnmanagedCallersOnly]`,
removes the old delegate methods as well as delegate backing fields)
to all the `Java.Lang.Object` subclasses, then writes the modified
assembly to a `new/<assembly.dll>` location (efa14e2), followed by
copying the newly written assemblies back to the original location.
At this point, we have the results returned by the subclass scanner
in memory and **new** versions of those types on disk, but they are
out of sync, since the types in memory refer to the **old** assemblies,
but AOT is ran on the **new** assemblies which have a different layout,
changed MVIDs and, potentially, different type and method token IDs
(because we added some methods, removed others etc) and thus it causes
the crashes at the run time.  The now invalid set of "old" types is
passed to the typemap generator.  This only worked by accident, because
we (incorrectly) used only the first linked assembly which happened
to be the same one passed to the JLO scanner and AOT - so everything
was fine at the execution time.

Address this by *disabling* LLVM Marshal Methods (8bc7a3e) for .NET 8,
setting `$(AndroidEnableMarshalMethods)`=False by default.
We'll attempt to fix these issues for .NET 9.

[0]: https://learn.microsoft.com/dotnet/api/system.reflection.module.moduleversionid?view=net-7.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants