Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

W^X support #54954

Merged
merged 8 commits into from
Jul 11, 2021
Merged

W^X support #54954

merged 8 commits into from
Jul 11, 2021

Conversation

janvorli
Copy link
Member

This change is the last part of enabling the W^X support. It adds the
actual executable allocator that handles all double mapped memory
allocations and creating the writeable mappings.

The platform specific functionality is placed in a new minipal that is
going to be a basis for future removal of Windows APIs usage from the
native runtime.

The last state of the change was tested on all the platforms that we
support using coreclr pri 1 tests with both the W^X enabled and disabled
using the COMPlus_EnableWXORX variable.

The debugger changes were tested using the managed debugger testing
suite on Windows x64, x86 and on Apple Silicon so far. Further testing
on other platforms is in progress.

There are also some fixes to the previously placed executable writer holders
in the debugger code based on the actual managed debugger tests results.

I've left the debugger code in debugger\ee to use RWX pages for breakpoints
and singlestepping helpers.

Contributes to #50391

This change is the last part of enabling the W^X support. It adds the
actual executable allocator that handles all double mapped memory
allocations and creating the writeable mappings.

The platform specific functionality is placed in a new minipal that is
going to be a basis for future removal of Windows APIs usage from the
native runtime.

The last state of the change was tested on all the platforms that we
support using coreclr pri 1 tests with both the W^X enabled and disabled
using the COMPlus_EnableWXORX variable.

The debugger changes were tested using the managed debugger testing
suite on Windows x64, x86 and on Apple Silicon so far. Further testing
on other platforms is in progress.
@janvorli janvorli added this to the 6.0.0 milestone Jun 30, 2021
@janvorli janvorli requested a review from jkotas June 30, 2021 10:30
@janvorli janvorli self-assigned this Jun 30, 2021
}

#ifdef _DEBUG
// if (ShouldInjectFaultInRange())
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkotas do you think it is worth preserving this fault injection? I would need to either read the env var directly in this minipal or pass its state down from the executable allocator.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need to. The fault injection was not ever used in .NET Core as far as I know.

}
}

// STRESS_LOG7(LF_JIT, LL_INFO100,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkotas using stresslog from here would require some clunky stuff. I don't remember ever needing this info from the stresslog in the past years. I was going to remove these, but I wanted to know your opinion first.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me.

The memfd_create was introduced in 13.0 and it is equivalent to the
shm_open(SHM_ANON, ...)
//
// Executable code
//
RETAIL_CONFIG_DWORD_INFO(EXTERNAL_EnableWXORX, W("EnableWXORX"), 1, "Enable W^X for executable memory.");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be flipped to default to 0 after the CI is green.

Copy link
Member

@mangod9 mangod9 Jul 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps a better name for this? EnableWXORX feels difficult to parse visually.. something like EnableWriteXorEx?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to EnableWriteXorExecute

@janvorli
Copy link
Member Author

The macOS legs are failing due to what seems to be an OOM - mmap failing to map a block of memory and the core file being about 3GB large.
The installer legs on Windows x86 / x64 are failing in the ActivateClass test.
I am looking into those.

@janvorli janvorli requested a review from hoyosjs June 30, 2021 19:15
return result;
}

for (BlockRX* pBlock = m_pFirstBlockRX; pBlock != NULL; pBlock = pBlock->next)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would likely be worth optimizing using some better data structure than a linked list to better handle apps that JIT a lot of code. Testing this with aspnet plaintext benchmark, the max length of this list was about 100 and the average search steps to find the requested block was 27. The list has the most recent allocated block in the head, so for the case of allocating memory block and then filling it with code, the current way would perform the best as the block we try to map as rw would always be the first in the list. But for modifications of old allocated stuff, like patching stubs etc., we can end up doing many search steps.
However, the plan for post 6.0 is to try to eliminate as much of the patching of code as we can e.g. by splitting stubs into fixed code and writeable data, so I didn't want to optimize this prematurely.

@janvorli
Copy link
Member Author

The OSX failure due to OOM is strange. On my local device, the test that fails actually allocates just about 6.5MB of executable memory and all the summed requests to map RW blocks are about 12MB. So I don't really see how the mmap(NULL, size, PROT_NONE, MAP_JIT | MAP_ANON | MAP_PRIVATE, -1, 0); with size 0x10000 could fail.

@@ -201,6 +201,10 @@ End
Crst Exception
End

Crst ExecutableAllocatorLock
Unordered
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really safe to have it unordered? Unordered means prone to deadlocks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't found a way to define the lock so that it doesn't hit the ordering asserts. If I just removed the Unordered, then I get e.g. this assert:

Assert failure(PID 12572 [0x0000311c], Thread: 51440 [0xc8f0]): Consistency check failed: Crst Level violation: Can't take level 0 lock CrstExecutableAllocatorLock because you already holding level 0 lock CrstLoaderHeap

All I want here is a leaf lock. The methods in the allocator can be called from all kinds of places, so I don't know what locks it is going to be called under.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use CrstLeafLock. Or you can add AcquiredAfter CrstLoaderHeap for CrstExecutableAllocatorLock

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LeafLock doesn't work, with it it still complains with the same assert. Fortunately, the AcquiredAfter LoaderHeap seems to work, so I'll use that one.

}

#ifdef _DEBUG
// if (ShouldInjectFaultInRange())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need to. The fault injection was not ever used in .NET Core as far as I know.

@@ -0,0 +1,4 @@
add_library(coreclrminipal
STATIC
doublemapping.cpp
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What you have here is fine, for now at least. I have been wondering whether we should build minipals like this in C to make easy to share with Mono when we need to, similar to what we are doing in libraries shims.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, the physical location of minipal could be src/native/minipal (we can also move src/native/common/getexepath.h there). Maybe it's easier to do it now since there is only doublemapping.cpp file and it's basically C compatible code (except for the scope resolution operator ::). :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the possibility of using this minipal by Mono as very remote. My vision is that it will gradually become a tiny replacement for the coreclr PAL, eventually allowing us to remove all Windowsisms from the coreclr. So I see this as a kind of inseparable part of the coreclr itself. While this double mapping stuff looks like something that Mono might possibly want, the other stuff that will go into the minipal is most likely going to be very coreclr specific. So I'd prefer keeping it in the coreclr folder until there is a reason to share it.
As for converting it to C, we would need to prevent inlining of all the methods to be safe. Inlining C code into C++ is potentially dangerous e.g. due to c++ pointer aliasing. See dotnet/coreclr#6801 for some details. We have converted all coreclr PAL files to c++ to avoid this issue in the past.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we could still compile the minipal as C++ when building it for coreclr even if it was written in C, to avoid any C/C++ combination issues. So please disregard my comment on the inlining.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is fine to delay this until it is actually needed. I have mentioned this since I am regularly being asked by our leadership about what we can do share more code between CoreCLR and Mono. PAL layers are one of the easier opportunities.

@@ -238,15 +238,10 @@ extern "C" FCDECL2(Object*, ChkCastAny_NoCacheLookup, CORINFO_CLASS_HANDLE type,
extern "C" FCDECL2(Object*, IsInstanceOfAny_NoCacheLookup, CORINFO_CLASS_HANDLE type, Object* obj);
extern "C" FCDECL2(LPVOID, Unbox_Helper, CORINFO_CLASS_HANDLE type, Object* obj);

#if defined(TARGET_ARM64) || defined(FEATURE_WRITEBARRIER_COPY)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is ETW::MethodLog::StubsInitialized method that gives names to the adhoc dynamically generated helpers in sampling traces so that they do not show up as broken stacks. We may want to start reporting the write barriers using this event.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's nice, I'll add it.

@mangod9
Copy link
Member

mangod9 commented Jul 2, 2021

@tommcdon and @JulieLeeMSFT as FYI.

@@ -499,6 +499,7 @@ struct ComMethodTable
// Accessor for the IDispatch information.
DispatchInfo* GetDispatchInfo();

#ifndef DACCESS_COMPILE
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assume this is unrelated to w^x change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are related. They exclude methods that are not used by DAC and that would otherwise unnecessarily pull in the ExecutableWriterHolder into DAC.

@@ -201,6 +201,10 @@ End
Crst Exception
End

Crst ExecutableAllocatorLock
AcquiredAfter LoaderHeap LeafLock
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No lock should be nested inside LeafLock.

This means that some uses of LeafLock are not valid anymore, and should get its own level.

Also update the ordering list of the ExecutableAllocatorLock
@alpencolt
Copy link

@janvorli does the double mapping map to the same memory and doesn't cause growth of memory consumption?
Do you have any docs or description of this feature?

@janvorli
Copy link
Member Author

janvorli commented Jul 9, 2021

@alpencolt it doesn't map to the same virtual address, it uses a different one for writing and a different one for executing. However, the writeable mappings are temporary and live only for a very short period of time, so they don't result in incremental virtual memory usage growth. Their lifetime match the lifetime of the ExecutableWriterHolder instances that are used as stack only.
There is no public doc on the feature yet, but you have a good point that we should have one.

Also allocate LoaderHeapFreeBlock from regular heap.
@janvorli janvorli closed this Jul 11, 2021
@janvorli janvorli reopened this Jul 11, 2021
@am11
Copy link
Member

am11 commented Jul 11, 2021

gcc leg is failing due to #55299 (comment).

@janvorli
Copy link
Member Author

@jkotas I have addressed all the feedback (except for making the minipal C, which I'll postpone) and fixed a few issues that I've found based on the CI tests. Can you please take another look so that we can get it in for the preview 7?
The PR is green except for the GCC leg that fails due to an unrelated old known issue in JIT code. So I am going to add one more commit that sets the W^X default to "off".

@janvorli janvorli merged commit 6a47ecf into dotnet:main Jul 11, 2021
@janvorli janvorli deleted the wxorx-8 branch July 11, 2021 14:47
@ghost ghost locked as resolved and limited conversation to collaborators Aug 10, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants