Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CDetour safetyhook #2162

Merged
merged 15 commits into from
May 21, 2024
Merged

CDetour safetyhook #2162

merged 15 commits into from
May 21, 2024

Conversation

Kenzzer
Copy link
Member

@Kenzzer Kenzzer commented May 19, 2024

Replaces libudis86 with safetyhooks, which in turn will make CDetour more robust. This will primarily benefit extension makers for x64 modding, as well as lay down the groundwork to add x64 dynamic detour support to DHooks.

Original Draft PR text :
Continuation of @bottiger1's work https://github.com/bottiger1/sourcemod-safetyhook. I've added safetyhook into the build script of sourcemod. This required enabling C++20, I'm going to investigate if it's possible to downgrade back to C++17. Some other extensions (namely Dhooks) rely on libudis86, and I haven't assessed yet the damage this will have if it's fully removed.
Nevertheless, opening a PR so the work can move forward. I hope to have this ready for a first review today.

Everything seems to work as intended, however safetyhook::Allocator::Error::NO_MEMORY_IN_RANGE seems to be common error when creating detours.

@Kenzzer Kenzzer marked this pull request as ready for review May 19, 2024 17:48
@Kenzzer
Copy link
Member Author

Kenzzer commented May 19, 2024

Ready for review. But DHooks is untested, currently gonna get a few server operators to try it out and report back.
Should udis86 be kept, at least just for dhooks ?

@Kenzzer
Copy link
Member Author

Kenzzer commented May 19, 2024

DHooks works with those changes. However some server operators are reporting safetyhook::Allocator::Error::NO_MEMORY_IN_RANGE error when creating detours, whether it's dhooks or just regular CDetours by TF2/sdktools extensions.

Perhaps @bottiger1 would know more ?

@bottiger1
Copy link
Contributor

bottiger1 commented May 19, 2024

NO_MEMORY_IN_RANGE means that safety hook was unable to alloc memory within 2gb of the hooked code, which is really strange.

What OS are they using? Are they using any special security mods?

On my tests on Linux, the executable and shared libraries like server_srv.so were placed all together which means it was really easy to allocate space within 2gb.

Maybe on Windows, you could try disabling ASLR? Or they have some kind of permission system that is disabling virtualalloc/mmap.

If a solution can't be found then we must switch to polyhook2 which doesn't have the 2gb limitation.

@Kenzzer
Copy link
Member Author

Kenzzer commented May 19, 2024

What OS are they using? Are they using any special security mods?

Linux, Debian 12, it's a dedicated machine. I don't think they're running anything special for allocation, but maybe the provider does.

Windows has only been tested by me so far, and no issues spotted.

@bottiger1
Copy link
Contributor

Well that is very strange, I am using Ubuntu 22.04 which is a very similar setup. I'm assuming that the core sourcemod devs won't find it acceptable to have a crash like this on obscure setups?

@bottiger1
Copy link
Contributor

bottiger1 commented May 19, 2024

Just out of curiosity can you ask him to paste the results of this?

cat /proc/`pgrep srcds_linux`/maps > dump.txt

@ficool2
Copy link

ficool2 commented May 19, 2024

NO_MEMORY_IN_RANGE means that safety hook was unable to alloc memory within 2gb of the hooked code, which is really strange.

What OS are they using? Are they using any special security mods?

On my tests on Linux, the executable and shared libraries like server_srv.so were placed all together which means it was really easy to allocate space within 2gb.

Maybe on Windows, you could try disabling ASLR? Or they have some kind of permission system that is disabling virtualalloc/mmap.

If a solution can't be found then we must switch to polyhook2 which doesn't have the 2gb limitation.

You can merge this PR to get around the 2GB issue: cursey/safetyhook#45

@bottiger1
Copy link
Contributor

bottiger1 commented May 19, 2024

You can merge this PR to get around the 2GB issue: cursey/safetyhook#45

I don't think this will work. I just looked at server_srv. All function alignments are random bytes and not CCs so you can't even find them through a simple string search. All the padding I looked at were only around 3-10 bytes as well which is quite small.

We need to know exactly why it is failing on this persons server. There are different possibilities that have different solutions. Unless you just want to switch to polyhook,

@rowedahelicon
Copy link

Just out of curiosity can you ask him to paste the results of this?

cat /proc/`pgrep srcds_linux`/maps > dump.txt

No output in the dump file
image

@bottiger1
Copy link
Contributor

can you locate the pid number of srcds_linux and put it in place of pgrep srcds_linux

example:
cat /proc/2608793/maps > dump.txt

@rowedahelicon
Copy link

can you locate the pid number of srcds_linux and put it in place of pgrep srcds_linux

example: cat /proc/2608793/maps > dump.txt

Yep, that worked, my bad! Here you go!
dump.txt

@bottiger1
Copy link
Contributor

can you locate the pid number of srcds_linux and put it in place of pgrep srcds_linux
example: cat /proc/2608793/maps > dump.txt

Yep, that worked, my bad! Here you go! dump.txt

So it is crashing on 32 bits? Your maps output looks the same as mine so I have no idea what's wrong.

@rowedahelicon
Copy link

rowedahelicon commented May 19, 2024

can you locate the pid number of srcds_linux and put it in place of pgrep srcds_linux
example: cat /proc/2608793/maps > dump.txt

Yep, that worked, my bad! Here you go! dump.txt

So it is crashing on 32 bits? Your maps output looks the same as mine so I have no idea what's wrong.

In this specific scenario, there is no crash, the detour simply isn't happening. I am using a detour that I can confirm works in a different 32 bit server running a normal SM / Dhooks combo.

Edit: I do have a crash that can happen now since having moved to Debian 12, but it is strictly an issue with the Source Engine, it's most likely not related. It's related to the server precaching transparent textures in the mod_loadalldata function.

@bottiger1
Copy link
Contributor

bottiger1 commented May 19, 2024

Ok I really have no idea why it is not working on your server then.

But I might have a solution that could help.

It looks like the original CDetour just allocates memory with mmap anywhere.

https://github.com/alliedmodders/sourcepawn/blob/8f28fe3e313747185eb479713c3d9ab2b489dde3/vm/code-allocator.cpp#L125

We can achieve the same behavior by editing safetyhook to do the same thing on 32 bits like so.

https://github.com/cursey/safetyhook/pull/77/files

While it is still possible on 32bits to allocate memory farther than 2gb away, srcds always seems to have a function prologue long enough for a jump and never puts relative instructions before the prologue unlike 64 bits, so I think this change should be an acceptable solution, since it has the same behavior and shortcomings as the original cdetour on 32 bits.

@Kenzzer can you compile a version of this for rowdahelicon to test? Or whoever has the linux build environment setup.

Updated file here

https://raw.githubusercontent.com/bottiger1/sourcemod-safetyhook/main/safetyhook.cpp

@Kenzzer
Copy link
Member Author

Kenzzer commented May 19, 2024

sourcemod.zip
Added the change in manually, here's the build. And I'm gonna be unavailable for the rest of the day, I hope this fix does it

@rowedahelicon
Copy link

sourcemod.zip Added the change in manually, here's the build. And I'm gonna be unavailable for the rest of the day, I hope this fix does it

Posting here so I don't forget, this version crashes on startup it seems
https://crash.limetech.org/z3lyr3l3pw5y
https://crash.limetech.org/6djepc56ediq

Not sure which crash is which, they piled up until I went back to my existing build of SM

@bottiger1
Copy link
Contributor

bottiger1 commented May 20, 2024

sourcemod.zip Added the change in manually, here's the build. And I'm gonna be unavailable for the rest of the day, I hope this fix does it

Posting here so I don't forget, this version crashes on startup it seems https://crash.limetech.org/z3lyr3l3pw5y https://crash.limetech.org/6djepc56ediq

Not sure which crash is which, they piled up until I went back to my existing build of SM

Are you sure your server is stable before this change?

The first crash listed points to dhooks at line 122 which corresponds to the unchanged version.

g_pSDKHooks->AddEntityListener(g_pEntityListener);

Line 122 on kenzer's changed version corresponds to a line that is impossible to crash on.

https://github.com/Kenzzer/sourcemod/blob/cdetour_safetyhook/extensions/dhooks/extension.cpp#L122

Unfortunately with no line number or function name on the 2nd crash, there's not much I can deduce there. Kenzer's branch does not even do anything to sdktools except remove a single unused variable.

@bottiger1
Copy link
Contributor

bottiger1 commented May 20, 2024

After a whole afternoon of debugging I finally found the issue. It was caused by incorrect porting to C++17.

https://github.com/Kenzzer/sourcemod/blob/4308a5d0dc49928c42af05e3c6dd66362874edc9/public/safetyhook/safetyhook.cpp#L1020

std::optional has an internal bool that indicates whether it is "null" or not. By assigning values to the members, it doesn't activate the code to set the bool to indicate that the value isn't "null" anymore so the function to return page information would fail.

Someone should carefully check to make sure the other C++17 change isn't causing problems like this, probably best person to do that would be kenzer since he's most familiar with what he changed.

I have fixed this issue and I'm including 32bit/64bit linux versions here. I've tested both archs and there was no crash, but I have not tested if dhooks works because I barely use dhooks and the hooks I use them on are difficult to test solo.

https://github.com/bottiger1/sourcemod/tree/test

https://github.com/bottiger1/sourcemod/releases/tag/safetyhook-test-1

for those of you testing on 64 bit, please remove all plugins that use sdktools. Some of them will crash and must be ported using my sourcemod fork and extension here:

https://github.com/skial-com/port64/

#2159

@rowedahelicon
Copy link

sourcemod.zip Added the change in manually, here's the build. And I'm gonna be unavailable for the rest of the day, I hope this fix does it

Posting here so I don't forget, this version crashes on startup it seems https://crash.limetech.org/z3lyr3l3pw5y https://crash.limetech.org/6djepc56ediq
Not sure which crash is which, they piled up until I went back to my existing build of SM

Are you sure your server is stable before this change?

The first crash listed points to dhooks at line 122 which corresponds to the unchanged version.

g_pSDKHooks->AddEntityListener(g_pEntityListener);

Line 122 on kenzer's changed version corresponds to a line that is impossible to crash on.

https://github.com/Kenzzer/sourcemod/blob/cdetour_safetyhook/extensions/dhooks/extension.cpp#L122

Unfortunately with no line number or function name on the 2nd crash, there's not much I can deduce there. Kenzer's branch does not even do anything to sdktools except remove a single unused variable.

Yes, I can confirm that it was stable. I will also add that the crash related to dhooks was a fluke and showed up during the revert back the older SM build. I will go ahead and play with the builds you've posted to see how things perform.

@Kenzzer
Copy link
Member Author

Kenzzer commented May 20, 2024

-constexpr VmAccess VM_ACCESS_R{.read = true, .write = false, .execute = false};
-constexpr VmAccess VM_ACCESS_RW{.read = true, .write = true, .execute = false};
-constexpr VmAccess VM_ACCESS_RX{.read = true, .write = false, .execute = true};
-constexpr VmAccess VM_ACCESS_RWX{.read = true, .write = true, .execute = true};
+constexpr VmAccess VM_ACCESS_R(true, false,  false);
+constexpr VmAccess VM_ACCESS_RW(true, true,  false);
+constexpr VmAccess VM_ACCESS_RX(true, false,  true);
+constexpr VmAccess VM_ACCESS_RWX(true, true,  true);

-    VmAccess access{
-        .read = (mbi.Protect & (PAGE_READONLY | PAGE_READWRITE | PAGE_EXECUTE_READ | PAGE_EXECUTE_READWRITE)) != 0,
-        .write = (mbi.Protect & (PAGE_READWRITE | PAGE_EXECUTE_READWRITE)) != 0,
-        .execute = (mbi.Protect & (PAGE_EXECUTE | PAGE_EXECUTE_READ | PAGE_EXECUTE_READWRITE)) != 0,
-    };
+    VmAccess access(
+        (mbi.Protect & (PAGE_READONLY | PAGE_READWRITE | PAGE_EXECUTE_READ | PAGE_EXECUTE_READWRITE)) != 0,
+        (mbi.Protect & (PAGE_READWRITE | PAGE_EXECUTE_READWRITE)) != 0,
+        (mbi.Protect & (PAGE_EXECUTE | PAGE_EXECUTE_READ | PAGE_EXECUTE_READWRITE)) != 0
+    );

-    return VmBasicInfo{
-        .address = static_cast<uint8_t*>(mbi.AllocationBase),
-        .size = mbi.RegionSize,
-        .access = access,
-        .is_free = mbi.State == MEM_FREE,
-    };
+    VmBasicInfo retInfo;
+    retInfo.address = static_cast<uint8_t*>(mbi.AllocationBase);
+    retInfo.size = mbi.RegionSize;
+    retInfo.access = access;
+    retInfo.is_free = (mbi.State == MEM_FREE);
+    return retInfo;
// New constructor
+constexpr VmAccess(bool pread, bool pwrite, bool pexecute) : read(pread), write(pwrite), execute(pexecute) {};

Downgrade should be correct.

-    return std::ranges::all_of(desired_addresses, [&](const auto& desired_address) {
-        const size_t delta = (address > desired_address) ? address - desired_address : desired_address - address;
-        return delta <= max_distance;
-    });
+    bool ret = true;
+    for (auto& desired_address = desired_addresses.begin(); desired_address != desired_addresses.end(); desired_address++) {
+        auto& value = *desired_address;
+
+        const size_t delta = (address > value) ? address - value : value - address;
+        ret &= (delta <= max_distance);
+    }
+    return ret;

According to https://en.cppreference.com/w/cpp/algorithm/ranges/all_any_none_of
all_of

  • returns true, if All True None False
  • returns false if Some True Some False
  • returns false if None True, Some False
  • returns true if None True, None False (Empty)
    Downgrade should be correct.
-concept FnPtr = requires(T f) { std::is_pointer_v<T>&& std::is_function_v<std::remove_pointer_t<T>>; };
+typedef void* FnPtr;

concept only ensures some traits (similar to Rust) on FnPtr. Concepts don't exist in C++17, so we downgrade to a regular void*

-        m_traps.insert_or_assign(from, TrapInfo{.from_page_start = align_down(from, 0x1000),
-                                           .from_page_end = align_up(from + len, 0x1000),
-                                           .from = from,
-                                           .to_page_start = align_down(to, 0x1000),
-                                           .to_page_end = align_up(to + len, 0x1000),
-                                           .to = to,
-                                           .len = len});
+        TrapInfo info;
+        info.from_page_start = align_down(from, 0x1000);
+        info.from_page_end = align_up(from + len, 0x1000);
+        info.from = from;
+        info.to_page_start = align_down(to, 0x1000);
+        info.to_page_end = align_up(to + len, 0x1000);
+        info.to = to;
+        info.len = len;
+        m_traps.insert_or_assign(from, info);

Downgrade should be correct.

+constexpr VmAccess() : read(true), write(true), execute(true) {};

VMAccess was missing a default constructor for VMAccess{} which only works on C++20

And finally we have

-            info = {
-                .address = reinterpret_cast<uint8_t*>(last_end),
-                .size = start - last_end,
-                .access = VmAccess{},
-                .is_free = true,
-            };
+          info->address = reinterpret_cast<uint8_t*>(last_end);
+            info->size = start - last_end;
+            info->access = VmAccess();
+            info->is_free = true;

Which was the only invalid change, due to std::optional. And that code was only available on Linux, explaining why the changes worked for me on Windows, but not for him on Linux.

@Kenzzer
Copy link
Member Author

Kenzzer commented May 20, 2024

for those of you testing on 64 bit, please remove all plugins that use sdktools. Some of them will crash and must be ported using my sourcemod fork and extension here

It's worth mentioning that this has nothing to do with CDetour upgrade, and is off scope this PR. The issue is with plugins performing SDKCalls with raw pointers (treating them as 32bits ints). This shouldn't be considered when reviewing this PR.

@Kenzzer
Copy link
Member Author

Kenzzer commented May 20, 2024

sourcemod.zip
Here's a new build for those who wish to text on Linux 32 bits (TF2).

Edit: That fix did it, I'm able to launch SM fine on WSL (Debian 11). And Detours are working properly, including DHooks.

@Kenzzer
Copy link
Member Author

Kenzzer commented May 20, 2024

It seems dhooks is creating crash (unsurprisignly). I'm going to re-add udis86 lib for it, no point blocking the PR just because dhooks doesn't behave. I'll make the commits in a few.

@Nukoooo
Copy link

Nukoooo commented May 20, 2024

FYI the equivalent of std::ranges::all_of is std::all_of, most of the functions in range library (std::ranges::) can easily port to C++17 by just removing ranges::

@bottiger1
Copy link
Contributor

It's worth mentioning that this has nothing to do with CDetour upgrade, and is off scope this PR. The issue is with plugins performing SDKCalls with raw pointers (treating them as 32bits ints). This shouldn't be considered when reviewing this PR.

I mentioned this because one of rowdehelicon's crash dumps was related to sdktools even if it was only on 32 bits.

https://crash.limetech.org/6djepc56ediq

It is already hard enough to debug these issues without having unrelated stuff crashing and causing people to blame or chase the wrong leads.

@rowedahelicon
Copy link

The latest build from @Kenzzer seems to be working fine!

@Kenzzer
Copy link
Member Author

Kenzzer commented May 20, 2024

sourcemod_linux_tf2_x86.zip
sourcemod_windows_tf2_x86.zip
Including myself, that's 3 server operators so far confirming the stability of safetyhooks as replacement for udis86. If anybody feels like trying out the changes for themselves, here are unofficial builds of the PR.

@psychonic psychonic merged commit e07c120 into alliedmodders:master May 21, 2024
4 checks passed
@bottiger1
Copy link
Contributor

@Kenzzer
Just a heads up in case you or anyone tries to attempt to use this to fix dhooks. Unfortunately just putting safetyhook in there is not even going to fix 10% of the issues with it.

Firstly, I don't think this is correct usage of a safetyhook inline hook. The bridge function doesn't have a standard calling convention as its 100% assembly, while safetyhook expects normal cdecl with 1 argument.

void hooked_add_42(SafetyHookContext& ctx)

I think you might just be able to convert the entire bridge function to normal C without any assembly.

https://github.com/Kenzzer/sourcemod/blob/167ab557d2c7f44cadae931561cb83bf67289b84/extensions/dhooks/DynamicHooks/hook.cpp#L73

There are 32 bit registers used everywhere that need to be changed.

masm.movl(eax, Operand(esp, 0));

You'd also need to define the 64 bit calling conventions like this here:

https://github.com/alliedmodders/sourcemod/tree/e3734803f0d4f431d5f3597f5050e48c3d46f8e0/extensions/dhooks/DynamicHooks/conventions

@Kenzzer
Copy link
Member Author

Kenzzer commented May 21, 2024

The bridge function doesn't have a standard calling convention as its 100% assembly

That's the intention, it cannot have a calling convention since the prologue is about saving the registers. The bridge eventually jumps to the handler function which has a call conv.

void hooked_add_42(SafetyHookContext& ctx)
I think you might just be able to convert the entire bridge function to normal C without any assembly.

Good info. I'll put that to the test, when I tackle the second part of x64 DHooks

There are 32 bit registers used everywhere that need to be changed.

Wasn't trying to make this work on x64. The priority atm was just x86, when it comes to dhooks. The ext can't be compiled anyways

@Kenzzer Kenzzer deleted the cdetour_safetyhook branch June 2, 2024 12:27
@Kenzzer Kenzzer mentioned this pull request Sep 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants