Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[monodroid] Prevent overlapped decompression of embedded assemblies #7817

Merged
merged 5 commits into from
Feb 27, 2023

Conversation

grendello
Copy link
Contributor

Fixes: #7335
Context: #7732
Context: d236af5

Commit d236af5 introduced embedded assembly compression, using LZ4,
which speeds up startup and reduces final package size.

Assemblies are compressed at build time and, at the same time, pre-
allocated buffers for the decompressed data are allocated in
libxamarin-app.so. The buffers are then passed to the LZ4 APIs,
all threads using the same output buffer. The assumption was that we
can do fine without locking as even if overlapped decompression
happens, the output data will be the same and so even if two threads
do the same thing at the same time, the data will be valid at all
times, so long as at least one thread completes the decompression.

This assumption proved to be largely true, but it appears that in
high concurrency cases it is possible that the data in the
decompression buffer differs. This can result in app crashes:

    A/libc: Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 3092 (.NET ThreadPool), pid 2727 (myapp.name)
    A/DEBUG: pid: 2727, tid: 3092, name: .NET ThreadPool  >>> myapp.name <<<
    A/DEBUG:       #01 pc 0000000000029b1c  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmono-android.release.so (offset 0x103d000) (xamarin::android::internal::MonodroidRuntime::mono_log_handler(char const*, char const*, char const*, int, void*)+144) (BuildId: 29c5a3805a0bedee1eede9b6668d7c676aa63371)
    A/DEBUG:       #02 pc 00000000002680bc  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
    A/DEBUG:       #03 pc 00000000002681e8  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
    A/DEBUG:       #04 pc 000000000008555c  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (mono_metadata_string_heap+188) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
    …

My guess is that LZ4 either uses the output buffer as a scratchpad
area when decompressing or that it initializes/modifies the buffer
before writing actual data in it. With overlapped decompression, it
may lead to one thread overwriting valid data previously written by
another thread, so that when the latter returns the buffer it thought
to have had valid data may contain certain bytes temporarily
overwritten by the decompression session in the other, still running,
thread. It may happen that MonoVM reads the corrupted data just when
it is still invalid (before the still running decompression session
actually writes the valid data), a classic race condition.

To fix this, the decompression block is now protected with a startup-
aware mutex. Mutex will be held only after the initial startup phase
is completed, so there should not be much loss of startup performance.

…otnet#7732)

Fixes: dotnet#7335

Context: d236af5

Commit d236af5 introduced embedded assembly compression, using LZ4,
which speeds up startup and reduces final package size.

Assemblies are compressed at build time and, at the same time, pre-
allocated buffers for the **decompressed** data are allocated in
`libxamarin-app.so`.  The buffers are then passed to the LZ4 APIs,
all threads using the same output buffer.  The assumption was that we
can do fine without locking as even if overlapped decompression
happens, the output data will be the same and so even if two threads
do the same thing at the same time, the data will be valid at all
times, so long as at least one thread completes the decompression.

This assumption proved to be **largely** true, but it appears that in
high concurrency cases it is possible that the data in the
decompression buffer differs.  This can result in app crashes:

	A/libc: Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 3092 (.NET ThreadPool), pid 2727 (myapp.name)
	A/DEBUG: pid: 2727, tid: 3092, name: .NET ThreadPool  >>> myapp.name <<<
	A/DEBUG:       #1 pc 0000000000029b1c  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmono-android.release.so (offset 0x103d000) (xamarin::android::internal::MonodroidRuntime::mono_log_handler(char const*, char const*, char const*, int, void*)+144) (BuildId: 29c5a3805a0bedee1eede9b6668d7c676aa63371)
	A/DEBUG:       #2 pc 00000000002680bc  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	A/DEBUG:       #3 pc 00000000002681e8  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	A/DEBUG:       dotnet#4 pc 000000000008555c  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (mono_metadata_string_heap+188) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	…

My guess is that LZ4 either uses the output buffer as a scratchpad
area when decompressing or that it initializes/modifies the buffer
before writing actual data in it.  With overlapped decompression, it
may lead to one thread overwriting valid data previously written by
another thread, so that when the latter returns the buffer it thought
to have had valid data may contain certain bytes temporarily
overwritten by the decompression session in the other, still running,
thread.  It may happen that MonoVM reads the corrupted data just when
it is still invalid (before the still running decompression session
actually writes the valid data), a classic race condition.

To fix this, the decompression block is now protected with a startup-
aware mutex.  Mutex will be held only after the initial startup phase
is completed, so there should not be much loss of startup performance.
@jonpryor jonpryor merged commit 79a0897 into dotnet:release/7.0.1xx Feb 27, 2023
@grendello grendello deleted the 7.0.1xx-backport-pr7732 branch February 27, 2023 19:54
@claudioredi
Copy link

I understand this problem can also affect apps not targetting .net 6 or greater although original ticket was about .net 6. Is this correct?

@grendello
Copy link
Contributor Author

@claudioredi correct

@claudioredi
Copy link

claudioredi commented Apr 13, 2023

Perfect. What's the version including this fix? I'm sorry but I always have hard time understanding when a fix is released and in which version. I cannot correlate 7.0.1 to anything related to Xamarin.Android

@grendello
Copy link
Contributor Author

@claudioredi tbh, I'm as confused as you are with regards to releases :) But I think this fix should be in 13.2

@github-actions github-actions bot locked and limited conversation to collaborators Jan 23, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants