How to implement gcc __attribute__((constructor)) when compiling dynamic libraries under Linux #110102

Closed
aadog opened this issue Nov 23, 2024 · 20 comments

Comments

@aadog

aadog commented Nov 23, 2024

Description

How can the equivalent of gcc's __attribute__((constructor)) be implemented when compiling dynamic libraries under Linux?

Reproduction Steps

1. Create a class library (classlib) project.

2. Run: dotnet publish -r linux-bionic-arm64 -p:DisableUnsupportedError=true -p:PublishAotUsingRuntimePack=true -p:RemoveSections=true

Expected behavior

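Behavior equivalent to gcc's __attribute__((constructor)): initialization code runs when the library is loaded.

Actual behavior

No initialization code runs when the library is loaded.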

Regression?

No response

Known Workarounds

No response

Configuration

No response

Other information

No response

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Nov 23, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Nov 23, 2024
@huoyaoyuan huoyaoyuan added area-NativeAOT-coreclr and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Nov 23, 2024
Contributor

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

@am11
Member

am11 commented Nov 23, 2024

You can use module initializer for this purpose.

Publish a project with module initializer:

$ dotnet new classlib -n ModInitLib
$ cd ModInitLib
$ cat > Class1.cs <<EOF
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class LibraryInitializer
{
    [ModuleInitializer]
    public static void Initialize()
    {
        Console.WriteLine("Library has been initialized (ModuleInitializer)!");
    }

    [UnmanagedCallersOnly(EntryPoint = "say_hello")]
    public static void SayHello()
    {
        Console.WriteLine("Hello from ModInitLib!");
    }
}
EOF

$ dotnet publish -p:PublishAot=true -p:TargetName=libModInitLib -o dist --ucr

Link with some executable code:

$ cat > glue.c <<EOF
#include <stdio.h>

void say_hello();

int main(void)
{
    say_hello();
    return 0;
}
EOF

$ cc -o glue glue.c -L$(pwd)/dist -lModInitLib
$ LD_LIBRARY_PATH=$(pwd)/dist ./glue
Library has been initialized (ModuleInitializer)!
Hello from ModInitLib!

@aadog
Author

aadog commented Nov 24, 2024

You can use module initializer for this purpose.

1. First of all, thank you for your answer.
2. In my testing, however, ModuleInitializer does not seem to work in shared libraries. I need to build an Android .so file and inject it into a process. How can I run code before any other code is loaded?

@am11
Member

am11 commented Nov 24, 2024

ModuleInitializer does not seem to support shared libraries.

The example I showed is a shared library: dotnet new classlib -n ModInitLib followed by dotnet publish -p:PublishAot=true -p:TargetName=libModInitLib -o dist --ucr builds dist/libModInitLib.so.

What should I do before other codes are loaded?

ModuleInitializer will run before any other code from shared lib (libModInitLib.so in the example) is executed.

@aadog
Author

aadog commented Nov 24, 2024

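ModuleInitializer does not seem to support shared libraries.

The example I showed is the shared library. dotnet new classlib -n ModInitLib .. dotnet publish -p:PublishAot=true -p:TargetName=libModInitLib -o dist --ucr which builds dist/libModInitLib.so.

What should I do before other codes are loaded?

ModuleInitializer will run before any other code from shared lib (libModInitLib.so in the example) is executed.

It seems to have no effect on Android: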

dotnet publish -r linux-bionic-arm64 -p:DisableUnsupportedError=true -p:PublishAotUsingRuntimePack=true --ucr

@aadog
Author

aadog commented Nov 24, 2024

@am11 Sorry, after further testing I found that it does work. However, the initializer seems to run only when an exported function is called; if we just dlopen the library, nothing happens.
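
The timing described here can be checked from a small native host. The following is a minimal sketch, assuming the libModInitLib.so library and its say_hello export from the earlier example; call_export is a hypothetical helper name, not part of any .NET API. It loads the library with dlopen and only afterwards calls the export, so any output from the module initializer can be placed relative to the two steps.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: loads a shared library and calls a no-argument
 * export by name. Returns 0 on success, -1 if the library or the symbol
 * cannot be found. */
int call_export(const char *lib_path, const char *symbol)
{
    void *handle = dlopen(lib_path, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }
    /* Only the library's native .init_array entries have run so far.
     * As reported in this thread, the [ModuleInitializer] has not. */

    void (*fn)(void);
    *(void **)(&fn) = dlsym(handle, symbol); /* POSIX idiom for function pointers */
    if (fn == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    /* First call into the library: per the thread, the module initializer
     * runs at this point, before the body of the export. */
    fn();

    /* The handle is deliberately left open after managed code has run. */
    return 0;
}
```

A main function would call call_export("./dist/libModInitLib.so", "say_hello") and report the return value. On older glibc versions the program needs to be linked with -ldl.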

@aadog
Author

aadog commented Nov 24, 2024

How can we run code from gcc's .init_array?

@jkotas
Member

jkotas commented Nov 24, 2024

This was discussed and answered in #104529

@MichalStrehovsky
Member

Running managed code as part of gcc's .init_array is unsupported. We cannot guarantee that the runtime will be properly initialized to execute managed code. This class of issues is called "static initialization order fiasco". Even if this may appear to work, servicing updates to the .NET SDK can cause this to stop working. It would not be considered a .NET bug if that happens and we will not fix such bugs.
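
The ordering problem can be illustrated in plain C, without any .NET code involved. The sketch below uses hypothetical names throughout: fake_runtime_init stands in for whatever startup a runtime needs, and early_hook stands in for a user constructor that happens to run first. Constructor priorities are a GCC/Clang extension in which lower numbers run first and values up to 100 are reserved for the implementation.

```c
#include <stdio.h>

/* Hypothetical stand-in for a runtime that needs one-time startup. */
static int runtime_ready = 0;

/* What the early hook observed:
 * -1 = hook never ran, 0 = runtime not ready, 1 = runtime ready. */
static int early_hook_saw_ready = -1;

/* Priority 101 runs before priority 102 (lower numbers run first). */
__attribute__((constructor(101)))
static void early_hook(void)
{
    /* Runs before fake_runtime_init, so it sees an uninitialized "runtime". */
    early_hook_saw_ready = runtime_ready;
}

__attribute__((constructor(102)))
static void fake_runtime_init(void)
{
    runtime_ready = 1;
}

/* Prints what the hook saw; call it from main() after all constructors ran. */
void report_init_order(void)
{
    printf("early hook saw runtime_ready=%d, now runtime_ready=%d\n",
           early_hook_saw_ready, runtime_ready);
}
```

Here the order is fixed by explicit priorities inside one translation unit. Across separate libraries and object files no such guarantee is available to user code, which is the situation the comment above describes.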

@MichalStrehovsky MichalStrehovsky closed this as not planned Nov 25, 2024
@dotnet-policy-service dotnet-policy-service bot removed the untriaged New issue has not been triaged by the area owner label Nov 25, 2024
@aadog
Author

aadog commented Nov 25, 2024

This was discussed and answered in #104529

There is no clear answer in that discussion; the runtime library seems to crash when we link against a .o file built with gcc.

@aadog
Author

aadog commented Nov 25, 2024

Running managed code as part of gcc's .init_array is unsupported. We cannot guarantee that the runtime will be properly initialized to execute managed code. This class of issues is called "static initialization order fiasco". Even if this may appear to work, servicing updates to the .NET SDK can cause this to stop working. It would not be considered a .NET bug if that happens and we will not fix such bugs.

We often need this capability in reverse engineering work. Is there another way to do it, for example by explicitly calling a runtime initialization function? What should I do?

#include <stdio.h>

extern void so_main(void);

__attribute__((constructor))
void init(void)
{
    /* Create a marker file to confirm that the constructor ran. */
    FILE *f = fopen("/data/data/com.whatsapp/cache/xx.txt", "w+");
    if (f != NULL)
        fclose(f);
    so_main(); /* managed export; this call crashes */
}

When execution reaches so_main, the program crashes. It seems that the runtime library has not been initialized.

@aadog
Author

aadog commented Nov 25, 2024

This was discussed and answered in #104529

In my testing, it does not work. Calling a function exported from C# in C does not seem to initialize the runtime library when statically linking against the compiled .o file.

I also called the following functions manually, and none of them work:

InitializeRuntime
GLOBAL__sub_I_gcwks_cpp
GLOBAL__sub_I_main_cpp

What should I do?

@am11
Member

am11 commented Nov 25, 2024

If you need to run some unrelated code at startup, you can do so by manually adding the attribute constructor code as an object file like this:

<ItemGroup>
  <NativeLibrary Include="initializer.o" />
</ItemGroup>

The object file (initializer.o) can be built using:

cc -c initializer.c -o initializer.o

However, invoking a managed function from the attribute constructor is problematic because managed methods require the runtime to be initialized. Manually running the full initialization routine (e.g., FinalizerStart, __managed_Startup, etc.) is error-prone and defeats the purpose of using .init_array.
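
For reference, a minimal sketch of what initializer.c could contain. The file name comes from the build command above; the function and variable names are hypothetical. It performs native-only work and deliberately does not call any managed export, for the reason just given.

```c
#include <stdio.h>

/* Set by the constructor so that other native code can check that it ran. */
int native_init_done = 0;

/* Runs from .init_array when the loader maps the library, for example
 * during dlopen. Keep this native-only: do not call [UnmanagedCallersOnly]
 * exports from here, because the managed runtime may not be initialized. */
__attribute__((constructor))
static void native_library_init(void)
{
    native_init_done = 1;
    fputs("native initializer ran\n", stderr);
}
```

Managed code can then be reached later through a regular exported function, once the host calls into the library.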

It would be more efficient if the runtime could provide a hook for such cases, perhaps through an attribute like [ModuleInitializer(LibraryConstructor = true)]. This could be invoked in a function decorated with __attribute__((constructor(9999))) in nativeaot/Bootstrap/main.cpp, which would call InitializeRuntime() before invoking the managed module initializer with LibraryConstructor=true. However, there may be potential issues with this approach, and @MichalStrehovsky might have more insights on the potential consequences.

@aadog
Author

aadog commented Nov 25, 2024

If you need to run some unrelated code at startup, you can do so by manually adding the attribute constructor code as an object file like this:

The object file (`initializer.o`) can be built using:

cc -c initializer.c -o initializer.o
However, invoking a managed function from the attribute constructor is problematic because managed methods require the runtime to be initialized. Manually running the full initialization routine (e.g., FinalizerStart, __managed_Startup, etc.) is error-prone and defeats the purpose of using .init_array.

It would be more efficient if the runtime could provide a hook for such cases, perhaps through an attribute like [ModuleInitializer(LibraryConstructor = true)]. This could be invoked in a function decorated with __attribute__((constructor(9999))) in nativeaot/Bootstrap/main.cpp, which would call InitializeRuntime() before invoking the managed module initializer with LibraryConstructor=true. However, there may be potential issues with this approach, and @MichalStrehovsky might have more insights on the potential consequences.

How should I initialize the runtime manually?
I tried calling GLOBAL__sub_I_main_cpp, but these functions still seem to hang.

@jkotas
Member

jkotas commented Nov 25, 2024

It would be more efficient if the runtime could provide a hook for such cases, perhaps through an attribute like [ModuleInitializer(LibraryConstructor = true)].

We are not interested in building unreliable features in .NET (#104529 (reply in thread)).

@am11
Member

am11 commented Nov 25, 2024

I tried calling GLOBAL__sub_I_main_cpp. These functions still seem to be stuck,

ld invokes .init_array functions before the entrypoint is called. Calling the entrypoint in one of the functions pointed to by .init_array is a recipe for a recursive disaster. I don't think there is much internal detail exposed (via symbols) that we can reliably use from the outside to perform the right initialization sequence. Even if you figure it out, it's a matter of time before it breaks due to some internal refactoring, aggressive inlining with a newer version of llvm/gcc, aggressive symbol stripping, etc. The logic would have to be inside the runtime to load and bookkeep enough things to be able to run the module initializer and to bridge when the real entrypoint is called by the platform loader.

will only execute when a certain function is called

Perhaps reconsider the requirement and try not to rely on the attribute constructor behavior.

@aadog
Author

aadog commented Nov 25, 2024

It would be more efficient if the runtime could provide a hook for such cases, perhaps through an attribute like [ModuleInitializer(LibraryConstructor = true)].

We are not interested in building unreliable features in .NET (#104529 (reply in thread)).

Maybe .NET could offer a middle ground: provide a function that manually initializes the runtime from C, together with an attribute such as [ModuleInitializer(LibraryConstructor = false)].

init_array is fundamental to reverse engineering. I think .NET could really shine in that field; if this were supported, I think a large number of reverse engineers would work with .NET.

@aadog
Author

aadog commented Nov 25, 2024

I tried calling GLOBAL__sub_I_main_cpp. These functions still seem to be stuck,

ld invokes .init_array functions before the entrypoint is called. Calling the entrypoint in one of the functions pointed to by .init_array is a recipe for a recursive disaster. I don't think there is much internal detail exposed (via symbols) that we can reliably use from the outside to perform the right initialization sequence. Even if you figure it out, it's a matter of time before it breaks due to some internal refactoring, aggressive inlining with a newer version of llvm/gcc, aggressive symbol stripping, etc. The logic would have to be inside the runtime to load and bookkeep enough things to be able to run the module initializer and to bridge when the real entrypoint is called by the platform loader.

will only execute when a certain function is called

Perhaps reconsider the requirement and try not to rely on the attribute constructor behavior.

No, it is the basis of reverse engineering. Without init_array, we cannot work. This is what most reverse engineers hope for.

@aadog
Author

aadog commented Nov 25, 2024

I now use a separate C shared library (.so) to bootstrap it, and that works, but it feels like an unnecessary extra step. Ideally we could link the .o and .lib files directly and manually initialize the runtime library at a point of our choosing. Can anyone give me some advice or guidance?

@aadog
Author

aadog commented Nov 25, 2024

init_array is the main battleground for anti-debugging, so I think we should address it.
