Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Typemaps per RID support and disable marshal methods by default #8164

Merged
merged 16 commits into from
Jul 13, 2023

Conversation

grendello
Copy link
Contributor

@grendello grendello commented Jul 3, 2023

Fix handling of per-RID assemblies produced by the linker when generating typemaps.

The managed linker can produce assemblies optimized for the target RID, which will
make them differ between different RIDs. E.g. all the IntPtr.Size references are
inlined so that 32-bit platforms have the 4 integer constant in its place, while
64-bit RIDs will get 8 in the same spot. Another platform difference may come in
the shape of CPU intrinsics which will change the JIT-generated native code in ways
that will crash the application if the assembler instructions generated for the
intrinsics aren't supported by the underlying processor.

In addition, the per-RID assemblies will have different MVIDs and may have different
type and method metadata token IDs.

All of this taken together invalidates our previous assumption that all the managed
assemblies are identical and make cause "mysterious" crashes in Release applications.

Prevent the potential problems by processing each per-RID assembly separately and
output correct per-RID LLVM IR assembly using that information.

Additionally, during testing I found that for some reason Cecil either doesn't remove
fields, delegates and methods we remove in the MarshalMethodsAssemblyRewriter class
when marshal methods are enabled, or it generates subtly broken assemblies which cause
some applications to segfault at run time like so:

07-04 22:09:58.734 12379 12379 I monodroid-gc: 1 outstanding GREFs. Performing a full GC!
07-04 22:09:58.735 12379 12379 F libc    : Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x8 in tid 12379 (t6.helloandroid), pid 12379 (t6.helloandroid)
07-04 22:09:58.839 12401 12401 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
07-04 22:09:58.839 12401 12401 F DEBUG   : Build fingerprint: 'google/raven_beta/raven:14/UPB3.230519.014/10284690:user/release-keys'
07-04 22:09:58.839 12401 12401 F DEBUG   : Revision: 'MP1.0'
07-04 22:09:58.839 12401 12401 F DEBUG   : ABI: 'arm64'
07-04 22:09:58.839 12401 12401 F DEBUG   : Timestamp: 2023-07-04 22:09:58.762982002+0200
07-04 22:09:58.839 12401 12401 F DEBUG   : Process uptime: 1s
07-04 22:09:58.839 12401 12401 F DEBUG   : Cmdline: com.microsoft.net6.helloandroid
07-04 22:09:58.839 12401 12401 F DEBUG   : pid: 12379, tid: 12379, name: t6.helloandroid  >>> com.microsoft.net6.helloandroid <<<
07-04 22:09:58.839 12401 12401 F DEBUG   : uid: 10288
07-04 22:09:58.839 12401 12401 F DEBUG   : tagged_addr_ctrl: 0000000000000001 (PR_TAGGED_ADDR_ENABLE)
07-04 22:09:58.839 12401 12401 F DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0000000000000008
07-04 22:09:58.839 12401 12401 F DEBUG   : Cause: null pointer dereference
07-04 22:09:58.839 12401 12401 F DEBUG   :     x0  0000000000000000  x1  0000007ba1401af0  x2  00000000000000fa  x3  0000000000000001
07-04 22:09:58.839 12401 12401 F DEBUG   :     x4  0000007ba1401b38  x5  0000007b9f2a8360  x6  0000000000000000  x7  0000000000000000
07-04 22:09:58.839 12401 12401 F DEBUG   :     x8  ffffffffffc00000  x9  0000007b9f800000  x10 0000000000000000  x11 0000007ba1400000
07-04 22:09:58.839 12401 12401 F DEBUG   :     x12 0000000000000000  x13 0000007ba374ad58  x14 0000000000000000  x15 00000013ead77d66
07-04 22:09:58.839 12401 12401 F DEBUG   :     x16 0000007ba372f210  x17 0000007ebdaa4a80  x18 0000007edf612000  x19 000000000000001f
07-04 22:09:58.839 12401 12401 F DEBUG   :     x20 0000000000000000  x21 0000007b9f2a8320  x22 0000007b9fb02000  x23 0000000000000018
07-04 22:09:58.839 12401 12401 F DEBUG   :     x24 0000007ba374ad08  x25 0000000000000004  x26 0000007b9f2a4618  x27 0000000000000000
07-04 22:09:58.839 12401 12401 F DEBUG   :     x28 ffffffffffffffff  x29 0000007fc592a780
07-04 22:09:58.839 12401 12401 F DEBUG   :     lr  0000007ba3701f44  sp  0000007fc592a730  pc  0000007ba3701e0c  pst 0000000080001000
07-04 22:09:58.839 12401 12401 F DEBUG   : 8 total frames
07-04 22:09:58.839 12401 12401 F DEBUG   : backtrace:
07-04 22:09:58.839 12401 12401 F DEBUG   :       #00 pc 00000000002d4e0c  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
07-04 22:09:58.839 12401 12401 F DEBUG   :       #01 pc 00000000002c29e8  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
07-04 22:09:58.839 12401 12401 F DEBUG   :       #02 pc 00000000002c34bc  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
07-04 22:09:58.839 12401 12401 F DEBUG   :       #03 pc 00000000002c2254  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
07-04 22:09:58.839 12401 12401 F DEBUG   :       #04 pc 00000000002be0bc  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
07-04 22:09:58.839 12401 12401 F DEBUG   :       #05 pc 00000000002bf050  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
07-04 22:09:58.839 12401 12401 F DEBUG   :       #06 pc 00000000002a53a4  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (mono_gc_collect+44) (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
07-04 22:09:58.839 12401 12401 F DEBUG   :       #07 pc 000000000000513c  <anonymous:7ec716b000>

This is a result of using a Cecil-rewritten assembly which differed to the original
version by just 4kb and it appeared to have removed some callbacks and their
backing fields marked for removal by the MarshalMethodsAssemblyRewriter class and
move others around the assembly instead of removing them. The resulting difference
was enough to crash the application when profiled AOT and marshal methods were
used. This suggests that I need to spend more time investigating the issue and I'm
certain I won't be able to complete this before NET8 is released and, thus, disabling
of marshal methods by default is the only sensible course of action.

@grendello grendello changed the title Typemap and marshal methods per RID support Typemap per RID support Jul 5, 2023
@grendello grendello changed the title Typemap per RID support Typemaps per RID support Jul 5, 2023
@grendello grendello marked this pull request as ready for review July 5, 2023 11:04
@grendello grendello requested a review from jonpryor July 5, 2023 11:04
@grendello grendello changed the title Typemaps per RID support Typemaps per RID support and disable marshal methods by default Jul 5, 2023
@jonpryor
Copy link
Member

jonpryor commented Jul 5, 2023

This will implicitly fix #8155 by virtue of disabling marshal methods. Will it explicitly fix it as well, e.g. when we eventually re-enable marshal methods will #8155 no longer be an issue? (At a glance, the changes to <GenerateJavaStubs/> suggest this may explicitly fix #8155, but I'm not certain.)

@jonpryor
Copy link
Member

jonpryor commented Jul 5, 2023

it appeared to have removed some callbacks and their backing fields marked for removal by the MarshalMethodsAssemblyRewriter class and move others around the assembly instead of removing them.

Can you provide an example of this behavior? What was removed, what was moved? (What does it even mean to "move" a field? The declaring type changed?)

@grendello
Copy link
Contributor Author

This will implicitly fix #8155 by virtue of disabling marshal methods. Will it explicitly fix it as well, e.g. when we eventually re-enable marshal methods will #8155 no longer be an issue? (At a glance, the changes to <GenerateJavaStubs/> suggest this may explicitly fix #8155, but I'm not certain.)

It won't explicitly fix #8155, since marshal methods generator needs to be changed to account for per-RID assemblies.

@grendello
Copy link
Contributor Author

grendello commented Jul 5, 2023

Mono.Android-mm-rewrite.zip

it appeared to have removed some callbacks and their backing fields marked for removal by the MarshalMethodsAssemblyRewriter class and move others around the assembly instead of removing them.

Can you provide an example of this behavior? What was removed, what was moved? (What does it even mean to "move" a field? The declaring type changed?)

I can, see the attached zip for diffs (can't attach the IL or the original assemblies since the zip, or the files individually, are too big for GitHub). I also know what's likely causing it, but the fix isn't quick nor "safe" (with regards to time we have left to test it all). The problem is as follows: we generate JCWs over a set of original (linked or not) assemblies, then we move to scan them for classes derived from Java.Lang.Object and use that set as input to the marshal methods rewriter, which makes the changes (generates wrapper methods, decorates wrapped methods with [UnmanagedCallersOnly], removes the old delegate methods as well as delegate backing fields) to all the JLO types, then writes the modified assembly to a new/ location, followed by copying the newly written assemblies back to the original location. At this point, we have the results returned by the JLO scanner in memory and new versions of those types on disk - but they are out of sync, since the types in memory refer to the old assemblies, but AOT is ran on the new assemblies which have a different layout, changed MVIDs and, potentially, different type and method token IDs (because we added some methods, removed others etc) and thus it causes the crashes at the run time. The now invalid set of "old" types is passed to the typemap generator. In main it worked by accident, because we (incorrectly) used only the first linked assembly which happened to be the same one passed to the JLO scanner and AOT - so everything was fine at the execution time.

The fix is deceptively simple - we "just" need to reload all the types found by the scanner from the new, rewritten, assemblies and regenerate the list of JLO types to pass to both the marshal methods generator and the typemap generator. This will require more than a few changes to pretty crucial portions of code and I'm not comfortable doing that this late before NET8 release, because we have no way to see if we rewrote and packaged the per-RID assemblies correctly, because the runtime will happily load any of them on any platform, with a random and cryptic crash being the only sign that something went wrong. In the entire process, we rely only upon MSBuild item metadata to determine whether or not a given assembly is RID-specific, metadata which can be (purposefully or not) modified by any code in the multitude of .target and .csproj files involved in building a Xamarin.Android app.

grendello added 4 commits July 6, 2023 11:34
* main:
  [xaprepare] update Debian dependencies for current unstable (trixie) (dotnet#8169)
* main:
  Bump to dotnet/installer@28d4a6b4be 8.0.100-preview.7.23330.16 (dotnet#8162)
  [Xamarin.Android.Build.Tasks] Move MonoAndroidAssetsDirIntermediate (dotnet#8166)
{
Type = type;
if (perAbiTypes != null) {
PerAbiTypes = new ReadOnlyDictionary<AndroidTargetArch, TypeDefinition> (perAbiTypes);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why create a new ReadOnlyDictionary wrapper?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defensive coding - I don't want that data set changed by accident or otherwise

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will prevent people from modifying javaType.PerAbiTypes, but that won't defend against people modifying the underlying collection in perAbiTypes. This new ReadOnlyDictionary doesn't "deep copy" perAbiTypes, it shares the reference. Other code could thus modify perAbiTypes (if they have an appropriate reference) to add/remove items.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need "simple" protection against adding/removing entries to this dictionary. ReadOnlyDictionary gives that, it's good enough.

class JavaType
{
public readonly TypeDefinition Type;
public readonly IDictionary<AndroidTargetArch, TypeDefinition>? PerAbiTypes;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

…and if you are going to create a ReadOnlyDictionary wrapper, why isn't this typed as ReadOnlyDictionary? Why keep this as IDictionary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because one day I can change my mind and use a SortedReadOnlyDictionary? :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because one day I can change my mind and use a SortedReadOnlyDictionary? :-)

While a joke, that still doesn't make sense to me, because ReadOnlyDictionary doesn't "deep copy" the contents, it just wraps the modifying methods to throw NotSupportedException:

A SortedReadOnlyDictionary would make no sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't protect the TypeDefinition (and don't want to), because it will be modified - I care about the coupling of architecture and the type definition. That's what is to be read only, and ReadOnlyDictionary provides that invariant. While SortedReadOnlyDictionary was a joke, the reason IDictionary is used is because I believe that's the best course of action in somewhat future-proof ("somewhat" because the field could in theory change type to ICollection<SomethingDescribingAPerAbiTypeDefinition>) API design. Users of JavaType don't need to know what concrete IDictionary implementation is used.


class JavaType
{
public readonly TypeDefinition Type;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type and PerAbiTypes confuses me (likely because I haven't fully read this PR yet); what assembly does Type come from? Why does Type exist at all?

It feels like JavaType should "really" be an "enum class" (something C# should borrow from Swift): it's either Type, or it's PerAbiTypes. Both at the same time? How's that work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type is just a convenient way to access a type for code that doesn't care about per-rid differences. The assumption is that all the per-RID types are identical as far as their "identity" is concerned (i.e. that Some.Namespace.Klass is the same in any of the per-RID assemblies, conceptually and API-wise) and some code (like JCWs) needs only that "identity" information, the type name, methods etc. Other code, like typemaps or marshal methods, need the metadata info which can differ between RIDs (MVIDs, type and method tokens for instance). So Type is any (usually the first added) TypeDefinition to the PerAbiTypes dictionary, making it easier for JCW to just use what it needs without messing with dictionaries.

Type = type;
if (perAbiTypes != null) {
PerAbiTypes = new ReadOnlyDictionary<AndroidTargetArch, TypeDefinition> (perAbiTypes);
IsABiSpecific = perAbiTypes.Count > 1 || (perAbiTypes.Count == 1 && !perAbiTypes.ContainsKey (AndroidTargetArch.None));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is an AndroidTargetArch.None value, perhaps Type should instead be a property?

partial class JavaType {
    public TypeDefinition AbiNeutralType => PerAbiTypes.TryGetValue (AndroidTargetArch.None, out var t) ? t : throw new NotSupportedException ();
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existence of AndroidTargetArch.None implies that it could be possible for perAbiTypes to have .None + other ABIs, which feels like it should be not permitted.

Copy link
Contributor Author

@grendello grendello Jul 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AbiNeutralType would have to always be valid, no exception thrown - it MUST be valid no matter what is in PerAbiTypes. Type described above saves one from having to write weird code to get any value from the PerAbiTypes like:

TypeDefinition anyType;
foreach (var kvp in PerAbiTypes) {
   anyType = kvp.Value;
   break;
}

Which would be necessary if Type didn't exist. As the code is implemented now, the readonly Type field is simply set whenever we add the first, any, TypeDefinition to the dictionary and that solves the problem in an elegant way.


AndroidTargetArch GetTargetArch (ITaskItem asmItem)
{
string? abi = asmItem.GetMetadata ("Abi");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably "sniff the path" in addition to using the %(Abi) metadata?

Copy link
Contributor Author

@grendello grendello Jul 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, Abi is added based on the path elsewhere in our targets. The metadata was added specifically for this purpose sometime ago. It's a kludge, granted, but we need something to work around the lack of information in the assembly as to which RID it belongs.

var types = new Dictionary<string, TypeData> (StringComparer.Ordinal);
foreach (ITaskItem asmItem in inputAssemblies) {
AndroidTargetArch arch = GetTargetArch (asmItem);
AssemblyDefinition asmdef = resolver.GetAssembly (asmItem.ItemSpec);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect/fear that this doesn't do what you think (hope?) it does:

In particular, note lines 254 and 257, which uses the "assembly name" -- which is based off the the filename without extension -- as the key into DirectoryAssemblyResolver.cache.

TL;DR: this will return the first assembly encountered, regardless of ABI, always. arch (previous line)` is not relevant.

DirectoryAssemblyResolver will not help you here, unless (maybe?) you have per-ABI DirectoryAssemblyResolver instances?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previous code took 30s to load an assembly, so I switched to this. It seems the next step is to add a new resolver. Next week, then.

@grendello grendello added the do-not-merge PR should not be merged. label Jul 7, 2023
@grendello grendello removed the do-not-merge PR should not be merged. label Jul 10, 2023
@@ -1326,10 +1326,10 @@ public void Dispose ()
using (var builder = CreateApkBuilder (Path.Combine ("temp", TestContext.CurrentContext.Test.Name))) {
builder.ThrowOnBuildFailure = false;
Assert.IsFalse (builder.Build (proj), "Build should have failed with XA4212.");
StringAssertEx.Contains ($"error XA4", builder.LastBuildOutput, "Error should be XA4212");
StringAssertEx.Contains ($"error : XA4", builder.LastBuildOutput, "Error should be XA4212");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just wondering how this PR caused : to be inserted. This doesn't make sense to me. (Yet appears necessary, considering that the PR is green?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It calls Log.LogError from TaskLoggingHelper which prepends error : to the message. The old code used JI's custom logger which didn't use that format.

this.methods = methods ?? throw new ArgumentNullException (nameof (methods));
this.uniqueAssemblies = uniqueAssemblies ?? throw new ArgumentNullException (nameof (uniqueAssemblies));
this.assemblyPaths = assemblyPaths ?? throw new ArgumentNullException (nameof (assemblyPaths));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we're saying that assemblyPaths can be null now? Or are we relying on nullable reference types?

I'm confused; as far as I can tell, $(Nullable) is not set in Xamarin.Android.Build.Tasks, yet other lines in this file use nullable reference types. How is that working?

(Other files in this project use #nullable enable, but not this file, so how is it using nullable reference types?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't pay attention to MarshalMethodsAssemblyRewriter in this PR, it's going to change anyway later.


static string? SearchDirectory (string name, string directory)
{
if (Path.IsPathRooted (name) && File.Exists (name)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something that can happen/we should care about? How is AssemblyNameReference.Name going to contain a rooted path?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of the code is copied from the JI DirectoryAssemblyResolver, I kept portions of code unmodified, assuming they were there for a reason and to keep the two resolvers as close as possible in terms of behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, it doesn't hurt to handle this case. Today, we get relative paths but it's external input, passed to the task by MSBuild, so it can very well change to absolute paths one day.

foreach (string dir in directories) {
if ((assemblyFile = SearchDirectory (name.Name, dir)) != null) {
AssemblyDefinition? loaded = Load (arch, assemblyFile);
if (Array.Equals (loaded?.Name.MetadataToken, name.MetadataToken)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While DirectoryAssemblyResolver has this code, I'm not certain that we actually want this anymore? We have this "bizarre" internal bug in which System.Runtime.dll v6.0 can't be resolved, and while investigating that I found that dotnet/linker doesn't requires that .MetadataToken match. There is thus a case that we shouldn't mimic DirectoryAssemblyResolver and should instead mimic AssemblyResolver.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept most of the code around because it has served us well over the years, meaning it was fine. However, if you think this code isn't needed anymore, I can remove it, sure.


if (!cache.TryGetValue (name, out entry)) {
entry = new CacheEntry (log, filePath, assembly, arch);
cache.Add (name, entry);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When forceLoad:true, we'll possibly be calling cache.Add(name, entry) multiple times, possibly for the same ABI, which will elicit a warning message. This feels undesirable, but relatedly, I can't find any codepaths which call Load(…, forceLoad:true).

Why support bool forceLoad at all if nothing is using it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cache.Add (name, entry) is protected by TryGetValue and shouldn't throw. Kept forceLoad for compatibility, "just in case".

return FindAndLoadFromDirectories (arch, directories, name, parameters);
}

AssemblyDefinition FindAndLoadFromDirectories (AndroidTargetArch arch, ICollection<string> directories, AssemblyNameReference name, ReaderParameters? parameters)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have ReaderParameters? parameters if you're not going to use it? Shouldn't the call to Load(arch, assemblyFile) provide parameters?

}

try {
assembly = ReadAssembly (filePath);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to ReaderParameters? parametesr, shouldn't ReadAssembly() take the parameters value as well?

AssemblyDefinition ReadAssembly (string filePath)
{
bool haveDebugSymbols = loadDebugSymbols && File.Exists (Path.ChangeExtension (filePath, ".pdb"));
var loadReaderParams = new ReaderParameters () {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a this.readerParameters field, yet it isn't used here.

This method should probably instead be:

AssemblyDefinition ReadAssembly (string filePath, ReaderParameters? parameters)
{
    var loadReaderParams = parameters ?? this.readerParameters;
    // …
}

The constructor should likewise be updated so that instead of doing new ReaderParameters(), it has this expression.

Copy link
Contributor Author

@grendello grendello Jul 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

} catch (Exception ex) {
log.LogWarning ($"Failed to read '{filePath}' with debugging symbols. Retrying to load it without it. Error details are logged below.");
log.LogWarning ($"{ex.ToString ()}");
loadReaderParams.ReadSymbols = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

though wrt my previous comment about using parameters ?? this.readerParameters, this presents a thread-safety concern (shared data, fun!). (Even though you can't use shared Cecil across threads, so this shouldn't be possible, I still look at it with side-eyes.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean? loadReaderParams is a local variable, a new instance of ReaderParameters created on each call to ReadAssembly and ReadSymbols is a bool property, I don't think this is thread-unsafe.

}

if (arch == AndroidTargetArch.None) {
// Disabled for now, generates too much noise.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's "concerning" if this is generating too much noise. I'm not sure what that means, but it's concerning?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because Cecil has no idea about architectures and will call us on Resolve overloads which don't specify the architecture, so we use None and that causes this warning. It won't be an issue when we eventually have one resolver per RID.

}

if (!entry.Assemblies.TryGetValue (arch, out AssemblyDefinition? asm)) {
if (loading) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the loading parameter should be renamed for clarity, because I don't understand why loading:true would result in no assembly being returned. Additionally, loading:true is used by "force load" path, which doesn't appear to be used, so we have additional unused codepaths.

I don't think we needed bool loading at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't have loading then for every per-RID assembly other than the first one we'll get the "assembly not found" exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code:

	public virtual AssemblyDefinition? Load (AndroidTargetArch arch, string filePath, ReaderParameters? readerParameters = null)
	{
		string name = Path.GetFileNameWithoutExtension (filePath);
		AssemblyDefinition? assembly;
		if (cache.TryGetValue (name, out CacheEntry? entry)) {
			assembly = SelectAssembly (arch, name, entry, loading: true);
			if (assembly != null) {
				return assembly;
			}
		}

will be called for every per-RID assembly, with the same name but different path/RID. If loading is absent, SelectAssembly will throw because it won't find an entry for arch


return abi switch {
"armeabi-v7a" => AndroidTargetArch.Arm,
"arm64-v8a" => AndroidTargetArch.Arm64,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation here is inconsistent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

…and shouldn't this use AbiToArch(), elsewhere? This expression is duplicated.

* main:
  [AndroidDependenciesTests] Use platform-tools 34.0.4 (dotnet#8184)
@jonpryor
Copy link
Member

[Xamarin.Android.Build.Tasks] per-RID assemblies & typemaps (#8164)

Context: 929e7012410233e6814af369db582f238ba185ad
Context: ce2bc689a19cb45f7d7bfdd371c16c54b018a020
Context: https://github.com/xamarin/xamarin-android/issues/7473
Context: https://github.com/xamarin/xamarin-android/issues/8155

The managed linker can produce assemblies optimized for the target
`$(RuntimeIdentifier)` (RID), which means that they will differ
between different RIDs.  Our "favorite" example of this is
`IntPtr.Size`, which is inlined by the linker into `4` or `8` when
targeting 32-bit or 64-bit platforms.  (See also #7473 and 929e7012.)

Another platform difference may come in the shape of CPU intrinsics
which will change the JIT-generated native code in ways that will
crash the application if the assembler instructions generated for the
intrinsics aren't supported by the underlying processor.

In addition, the per-RID assemblies will have different [MVID][0]s
and **may** have different type and method metadata token IDs, which
is important because typemaps *use* type and metadata token IDs; see
also ce2bc689.

All of this taken together invalidates our previous assumption that
all the managed assemblies are identical.  "Simply" using
`IntPtr.Size` in an assembly that contains `Java.Lang.Object`
subclasses will break things.

This in turn could cause "mysterious" behavior or crashes in Release
applications; see also Issue #8155.

Prevent the potential problems by processing each per-RID assembly
separately and output correct per-RID LLVM IR assembly using the
appropriate per-RID information.

Additionally, during testing I found that for our use of Cecil within
`<GenerateJavaStubs/>` doesn't consistently remove the fields,
delegates, and methods we remove in `MarshalMethodsAssemblyRewriter`
when marshal methods are enabled, or it generates subtly broken
assemblies which cause **some** applications to segfault at run time
like so:

	I monodroid-gc: 1 outstanding GREFs. Performing a full GC!
	F libc    : Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x8 in tid 12379 (t6.helloandroid), pid 12379 (t6.helloandroid)
	F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
	F DEBUG   : Build fingerprint: 'google/raven_beta/raven:14/UPB3.230519.014/10284690:user/release-keys'
	F DEBUG   : Revision: 'MP1.0'
	F DEBUG   : ABI: 'arm64'
	F DEBUG   : Timestamp: 2023-07-04 22:09:58.762982002+0200
	F DEBUG   : Process uptime: 1s
	F DEBUG   : Cmdline: com.microsoft.net6.helloandroid
	F DEBUG   : pid: 12379, tid: 12379, name: t6.helloandroid  >>> com.microsoft.net6.helloandroid <<<
	F DEBUG   : uid: 10288
	F DEBUG   : tagged_addr_ctrl: 0000000000000001 (PR_TAGGED_ADDR_ENABLE)
	F DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0000000000000008
	F DEBUG   : Cause: null pointer dereference
	F DEBUG   :     x0  0000000000000000  x1  0000007ba1401af0  x2  00000000000000fa  x3  0000000000000001
	F DEBUG   :     x4  0000007ba1401b38  x5  0000007b9f2a8360  x6  0000000000000000  x7  0000000000000000
	F DEBUG   :     x8  ffffffffffc00000  x9  0000007b9f800000  x10 0000000000000000  x11 0000007ba1400000
	F DEBUG   :     x12 0000000000000000  x13 0000007ba374ad58  x14 0000000000000000  x15 00000013ead77d66
	F DEBUG   :     x16 0000007ba372f210  x17 0000007ebdaa4a80  x18 0000007edf612000  x19 000000000000001f
	F DEBUG   :     x20 0000000000000000  x21 0000007b9f2a8320  x22 0000007b9fb02000  x23 0000000000000018
	F DEBUG   :     x24 0000007ba374ad08  x25 0000000000000004  x26 0000007b9f2a4618  x27 0000000000000000
	F DEBUG   :     x28 ffffffffffffffff  x29 0000007fc592a780
	F DEBUG   :     lr  0000007ba3701f44  sp  0000007fc592a730  pc  0000007ba3701e0c  pst 0000000080001000
	F DEBUG   : 8 total frames
	F DEBUG   : backtrace:
	F DEBUG   :       #00 pc 00000000002d4e0c  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
	F DEBUG   :       #01 pc 00000000002c29e8  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
	F DEBUG   :       #02 pc 00000000002c34bc  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
	F DEBUG   :       #03 pc 00000000002c2254  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
	F DEBUG   :       #04 pc 00000000002be0bc  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
	F DEBUG   :       #05 pc 00000000002bf050  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
	F DEBUG   :       #06 pc 00000000002a53a4  /data/app/~~Av24J15xbf20XdrY3X2_wA==/com.microsoft.net6.helloandroid-4DusuNWIAkz1Ssi7fWVF-g==/lib/arm64/libmonosgen-2.0.so (mono_gc_collect+44) (BuildId: 761134f2369377582cc3a8e25ecccb43a2e0a877)
	F DEBUG   :       #07 pc 000000000000513c  <anonymous:7ec716b000>

This is because we generate Java Callable Wrappers over a set of
original (linked or not) assemblies, then we scan them for classes
derived from `Java.Lang.Object` and use that set as input to the
marshal methods rewriter, which makes the changes (generates wrapper
methods, decorates wrapped methods with `[UnmanagedCallersOnly]`,
removes the old delegate methods as well as delegate backing fields)
to all the `Java.Lang.Object` subclasses, then writes the modified
assembly to a `new/<assembly.dll>` location (efa14e26), followed by
copying the newly written assemblies back to the original location.
At this point, we have the results returned by the subclass scanner
in memory and **new** versions of those types on disk, but they are
out of sync, since the types in memory refer to the **old** assemblies,
but AOT is ran on the **new** assemblies which have a different layout,
changed MVIDs and, potentially, different type and method token IDs
(because we added some methods, removed others etc) and thus it causes
the crashes at the run time.  The now invalid set of "old" types is
passed to the typemap generator.  This only worked by accident, because
we (incorrectly) used only the first linked assembly which happened
to be the same one passed to the JLO scanner and AOT - so everything
was fine at the execution time.

Address this by *disabling* LLVM Marshal Methods (8bc7a3e8) for .NET 8,
setting `$(AndroidEnableMarshalMethods)`=False by default.
We'll attempt to fix these issues for .NET 9.

[0]: https://learn.microsoft.com/dotnet/api/system.reflection.module.moduleversionid?view=net-7.0

@jonpryor jonpryor merged commit 6836818 into dotnet:main Jul 13, 2023
@grendello grendello deleted the typemap-per-rid-support branch July 13, 2023 21:29
grendello added a commit to grendello/xamarin-android that referenced this pull request Jul 13, 2023
* main:
  [Xamarin.Android.Build.Tasks] per-RID assemblies & typemaps (dotnet#8164)
  [AndroidDependenciesTests] Test both manifest types (dotnet#8186)
grendello added a commit to grendello/xamarin-android that referenced this pull request Jul 14, 2023
* main:
  [Xamarin.Android.Build.Tasks] per-RID assemblies & typemaps (dotnet#8164)
  [AndroidDependenciesTests] Test both manifest types (dotnet#8186)
  [AndroidDependenciesTests] Use platform-tools 34.0.4 (dotnet#8184)
grendello added a commit to grendello/xamarin-android that referenced this pull request Jul 14, 2023
* main:
  [ci] XA.PublishAllLogs publishes all build logs to build artifacts (dotnet#8189)
  Bump to xamarin/Java.Interop/main@151b03e (dotnet#8188)
  [Mono.Android] Use PublicApiAnalyzers to ensure we do not break API. (dotnet#8171)
  [Xamarin.Android.Build.Tasks] per-RID assemblies & typemaps (dotnet#8164)
  [AndroidDependenciesTests] Test both manifest types (dotnet#8186)
  [AndroidDependenciesTests] Use platform-tools 34.0.4 (dotnet#8184)
grendello added a commit to grendello/xamarin-android that referenced this pull request Jul 17, 2023
* main:
  LLVM IR code generator refactoring and updates (dotnet#8140)
  [ci] XA.PublishAllLogs publishes all build logs to build artifacts (dotnet#8189)
  Bump to xamarin/Java.Interop/main@151b03e (dotnet#8188)
  [Mono.Android] Use PublicApiAnalyzers to ensure we do not break API. (dotnet#8171)
  [Xamarin.Android.Build.Tasks] per-RID assemblies & typemaps (dotnet#8164)
  [AndroidDependenciesTests] Test both manifest types (dotnet#8186)
  [AndroidDependenciesTests] Use platform-tools 34.0.4 (dotnet#8184)
@github-actions github-actions bot locked and limited conversation to collaborators Jan 22, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants