Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apple M1 Support for MacOS #9441

Merged
merged 20 commits into from
May 24, 2021
Merged

Apple M1 Support for MacOS #9441

merged 20 commits into from
May 24, 2021

Conversation

skylersaleh
Copy link
Contributor

This pull request adds support for compiling Dolphin for ARM on MacOS so that it can
run natively on the M1 processors without running through Rosseta2 emulation
providing a 30-50% performance speedup and less hitches from Rosseta2.

It consists of several key changes:

  • Adding support for W^X allocation(MAP_JIT) for the ARM JIT
  • Adding the machine context and config info to identify the M1 processor
  • Additions to the build system and docs to support building universal binaries
  • Adding code signing entitlements to access the MAP_JIT functionality
  • Updating the MoltenVK libvulkan.dylib to a newer version with M1 support

Thanks everyone on IRC for the help in making this change!

BuildMacOSUniveralBinary.py Outdated Show resolved Hide resolved
Readme.md Outdated Show resolved Hide resolved
Source/Core/Core/PowerPC/JitArm64/Jit.cpp Outdated Show resolved Hide resolved
BuildMacOSUniveralBinary.py Outdated Show resolved Hide resolved
@larsenv
Copy link

larsenv commented Jan 13, 2021

@skylersaleh Great job! My friend @spotlightishere was doing the exact same thing. How dare you steal his thunder!

https://github.com/spotlightishere/dolphin

He's been working on this for months, but the thing is that he never could get it running at full speed on ARM. Did you?

Oh well. If it's working at full speed, then maybe both of your efforts can be combined.

Copy link
Contributor

@sepalani sepalani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some remarks regarding the python code:

  • The string "".format() method should be used instead of multiple concatenations with the + operator.
  • The use of os.system should be avoided, you might consider using the os.execlpe/os.execvpe variants.
  • The code should be PEP8 compliant

BuildMacOSUniveralBinary.py Outdated Show resolved Hide resolved
BuildMacOSUniveralBinary.py Outdated Show resolved Hide resolved
BuildMacOSUniveralBinary.py Outdated Show resolved Hide resolved
BuildMacOSUniveralBinary.py Outdated Show resolved Hide resolved
@uncletrunks
Copy link

uncletrunks commented Jan 15, 2021

From a little bit of testing I can't seem to get your fork to acknowledge input from the Switch Pro Controller connected over USB-C, which (on a release build at least) works on Rosetta 2. The SDL input doesn't seem to acknowledge any input at all and the Input/Pro Controller/0 option seems to particularly freak out on inputs from the sticks. Not sure if this happens with other controllers, I don't have any on hand at the moment to test with.

Editing to attach a picture of the input configuration if that helps at all. It definitely looks like something odd is happening with how data is being parsed from the controller.
image

@skylersaleh
Copy link
Contributor Author

From a little bit of testing I can't seem to get your fork to acknowledge input from the Switch Pro Controller connected over USB-C, which (on a release build at least) works on Rosetta 2. The SDL input doesn't seem to acknowledge any input at all and the Input/Pro Controller/0 option seems to particularly freak out on inputs from the sticks. Not sure if this happens with other controllers, I don't have any on hand at the moment to test with.
Editing to attach a picture of the input configuration if that helps at all. It definitely looks like something odd is happening with how data is being parsed from the controller.
image

It might be an issue with SDL or the HIDAPI? In a similar line of thought, my Dualshock 4 is seen by PCSX2 but will not accept any input from it under Rosetta2 - but works fine on native Intel.

I wish I could test for myself, but I'm having issues with post processing not finding the HIDAPI.

I think these problems may be caused by missing the SDL2 library for a single architecture. The build of SDL2 distributed from package managers such as homebrew only support a single architecture. So, you must make sure to explicitly install both the ARM and x64 variants and insure they are detected when Cmake is ran for both the x64 and the ARM build.

Otherwise SDL2 will only be linked for one of the architectures and you will end up with the behavior of only getting controller input either under Rosseta or native execution but not under both.

@uncletrunks
Copy link

uncletrunks commented Jan 15, 2021

I ended up resolving the issue (at least for SDL inputs, native Apple inputs I haven't touched at all) by doing brew install sdl --HEAD to update to a bleeding edge version. It seems like Sam Lantinga did some substantial work within the last 12 hours or so fixing errors with how they were handling switch pro controllers, so the only input that Dolphin was listing was Input/Pro Controller/0 which is ostensibly broken.

image

Once I sorted that out it's working amazing. Great stuff.

Copy link
Contributor

@sepalani sepalani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some additional remarks regarding the python code:

  • All ; at the end of line can be removed
  • There should be a space after , in function parameters
  • The code should be PEP8 compliant

You should run a PEP8 linter on the python code, it will help you find and sometimes fix these nitpicks for you.

Regarding the argument handling, I'm fine with argparse but the more architectures you want to support or add features/options for, the more you'll have to write code and it will become cumbersome quite quickly. Having a config file is both human readable, editable and reusable. If the config file is properly structured you'll also have less code for new archs.

BuildMacOSUniversalBinary.py Outdated Show resolved Hide resolved
BuildMacOSUniversalBinary.py Outdated Show resolved Hide resolved
BuildMacOSUniversalBinary.py Outdated Show resolved Hide resolved
BuildMacOSUniversalBinary.py Outdated Show resolved Hide resolved
# 4) Symlinks are created in the destination tree to mirror the hierarchy in
# the source trees

def recursiveMergeBinaries(src0,src1,dst):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if this function could be generalised to any number of binaries, not just 2. An approach i would take would be using positional arguments recursiveMergeBinaries(dst, *args) where *args are source trees.

Feel free to ignore this comment.

BuildMacOSUniversalBinary.py Outdated Show resolved Hide resolved
BuildMacOSUniversalBinary.py Outdated Show resolved Hide resolved
BuildMacOSUniversalBinary.py Outdated Show resolved Hide resolved
BuildMacOSUniversalBinary.py Show resolved Hide resolved
BuildMacOSUniversalBinary.py Outdated Show resolved Hide resolved
@skylersaleh
Copy link
Contributor Author

skylersaleh commented Jan 24, 2021

FifoCI detected that this change impacts graphical rendering.

From IRC, we think this is a false positive and is safe to ignore.

@skylersaleh
Copy link
Contributor Author

I seem to be getting a error upon running this script. I have tried to do many things and this keeps happening constantly
build_error.txt

Hi Awsomenes,

From the posted log, it looks like you are getting the error "arch: posix_spawnp: cmake: Bad CPU type in executable" when the command 'arch -x86_64 cmake ../../ -G Xcode' is executed inside of your terminal.

This likely means that your cmake version that you installed is not a universal binary (so it doesn't have both x86 and arm versions). You can download a universal cmake here: https://cmake.org/download/ To build a universal binary, you need both a fully working x86 and arm64 build environment with all the dependencies.

Alternatively, you could just build an arm64 version without using that build script.

    mkdir build
    cd build
    cmake ..
    make

Copy link
Contributor

@sepalani sepalani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overall code quality seems better, IMHO.

Comments should have their first letter capitalized if it's a sentence. Then, comments should be added to describe what is the purpose of the code, not what the code is doing (unless when it's really not obvious).

BuildMacOSUniversalBinary.py Outdated Show resolved Hide resolved
BuildMacOSUniversalBinary.py Outdated Show resolved Hide resolved
BuildMacOSUniversalBinary.py Outdated Show resolved Hide resolved
BuildMacOSUniversalBinary.py Show resolved Hide resolved
BuildMacOSUniversalBinary.py Outdated Show resolved Hide resolved
BuildMacOSUniversalBinary.py Outdated Show resolved Hide resolved
BuildMacOSUniversalBinary.py Outdated Show resolved Hide resolved
BuildMacOSUniversalBinary.py Outdated Show resolved Hide resolved
BuildMacOSUniversalBinary.py Outdated Show resolved Hide resolved
BuildMacOSUniversalBinary.py Outdated Show resolved Hide resolved
@uvesten
Copy link

uvesten commented Jan 31, 2021

Runs really well for me, at least using Mario Kart double dash. Thanks!

@skylersaleh
Copy link
Contributor Author

skylersaleh commented Feb 2, 2021

@skylersaleh

Just a heads up. Something was changed in 11.2 and now the following error message is triggered: "WriteProtectMemory failed! mprotect: Permission denied".

Thanks for the heads up! I'll try to reproduce this.

EDIT:

I was able to reproduce this. It looks like 11.2 no longer allows for changing the access protection settings on a page that was marked as executable. I will push a workaround shortly.

EDIT 2:

I have pushed the workaround to the repo.

@Jefffr
Copy link

Jefffr commented Feb 2, 2021

Hello,
not running for me, see txt file.

log-dolphin-m1.txt

@pizuz
Copy link

pizuz commented Feb 2, 2021

@Jefffr you might want to put this log into a text file, instead.

EDIT: Thanks :)

Analytics:
- Incorporated fix to allow the full set of analytics that was recommended by
  spotlightishere

BuildMacOSUniversalBinary:
- The x86_64 slice for a universal binary is now built for 10.12
- The universal binary build script now can be configured though command line
  options instead of modifying the script itself.
- os.system calls were replaced with equivalent subprocess calls
- Formatting was reworked to be more PEP 8 compliant
- The script was refactored to make it more modular
- The com.apple.security.cs.disable-library-validation entitlement was removed

Memory Management:
- Changed the JITPageWrite*Execute*() functions to incorporate support for
  nesting

Other:
- Fixed several small lint errors
- Fixed doc and formatting mistakes
- Several small refactors to make things clearer
- Added MacOS version checking around MAP_JIT to prepare code for x86 MAP_JIT
Merges two nested #ifndefs into a single #if around hw cap includes.
- Fixed error that was causing the hardened runtime from being enabled
- Refactored BuildMacOSUniversalBinary.py based on code style recommendations
In MacOS 11.2 mprotect can no longer change the access protection settings of
pages that were previously marked as executable to anything but PROT_NONE. This
commit works around this new restriction by bypassing the mprotect based write
protection and instead relying on the write protection provided by MAP_JIT.
- Fixed a typo in the Readme.md
- Made the recursiveMergeBinaries function handle divergent binary file trees
  better.
- Enabled MAP_JIT on x86_64 after confirming that pthread_jit* calls are only
required for MAP_JIT pages on Apple Silicon
Added a Common::JITPageWriteDisableExecuteEnable() that could be missed when
a memory exception is triggered by the running game.
Removed the unavailable CPU core dialog box that asked users to change their
selected CPU core to one that is available. Instead, Dolphin now just overrides
the core to the default, and logs that it performed the override.
- Updated MoltenVK library in external to 1.2.170.0 (fixes swapchain crash)
- Removed -mssse3 flag from arm64 builds
- Removed -march=core2 from arm64 builds
This change adds the -G and --build_type arguments to the
BuildMacOSUniversalBinaray.py build script to allow the CMake generator and
build type to be specified for universal binary builds.

The defaults are:
-G "Unix Makefiles"
--build_type "Release"
1) Place information about universal builds after single architecture builds
   since universal builds are harder to setup the environment for.
2) Specify that new arguments should be provided instead of direct script
   modification for universal builds.
Modify the build script to use CMake's cross compilation infrastructure instead
of running CMake natively in x86_64 and arm64 mode to configure the two project
files. This should fix the builds with CMake executables installed through
homebrew and should pave the way for building universal binaries on x86_64
systems.
Incorporated changes suggested by Kode54 to use the CMake prefix path, instead
of relying on pkg-config directory paths.

Added the prefix paths from the other architectures build to the CMake ignore
path so that CMake doesn't pick up libraries from the wrong architecture if
they are in the user's search path.
This commit fixes a regression in 2ba88d5 that
would cause an app bundle to not be resigned after merging the two single
architecture builds.

Also, applies formatting suggestions from Leo Lam
Added RAII wrapper around the the JITPageWriteEnableExecuteDisable() and
JITPageWriteDisableExecuteEnable() to make it so that it is harder to forget to
pair the calls in all code branches as suggested by leoetlino.
@skylersaleh skylersaleh force-pushed the master branch 2 times, most recently from bd89767 to 5ee8239 Compare May 22, 2021 22:27
BuildMacOSUniversalBinary.py Outdated Show resolved Hide resolved
BuildMacOSUniversalBinary.py Show resolved Hide resolved
BuildMacOSUniversalBinary.py Outdated Show resolved Hide resolved
Adds a step to detect when the Intel and arm64 build trees cannot be merged
safely. This can occur when each side has files/folders that are named the same
but are of different types or symlinks that are the same name but need to point
to different locations for each architecture.

Before this change, this would just fail silently.
Adds a new PlatformID for universal builds. This will allow single architecture
builds to be updated through the single architecture path, and universal builds
to be updated with universal builds.
@leoetlino leoetlino merged commit 5167192 into dolphin-emu:master May 24, 2021
@MayImilae
Copy link
Contributor

@skylersaleh It's been a while, but the new M1 silicon is here and I now own a M1 Max MacBook! I'd like to recreate the M1 tests on it but I'm having difficult finding the data, both the screenshots and spreadsheet. Could you send me the links again?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.