Add win32-aarch64 support #1264

Merged (9 commits) on Jan 19, 2021

tresf
Contributor

@tresf tresf commented Nov 6, 2020

TODO

Changes

  • Updates native/libffi to latest version, adding Windows arm64 support 2bf1593
  • Disables DLL callback functionality for win32-aarch64 via ASMFN_OFF (no mingw support)
  • Disables DLL callback tests for win32-aarch64 2ff29eb
  • Fixes marshal test for win32-aarch64 e3ad29d
  • Patches msvcc.sh to handle the following scenarios/regressions:
    • Re-adds msvcc.sh's cygpath calculation for cl.exe 112d76d
    • Quotes msvcc.sh's include (-I) files with spaces (e.g. -I"C:\Program Files\AdoptOpenJDK11\...\include") 112d76d
    • Brings back msvcc.sh's eval for assembly files 112d76d
  • Adds win32-aarch64 native build option
    • Toggled via:
       ant native -Dos.prefix=win32-aarch64
    • Sets ${make.ARCH}=aarch64, reusing Android's property
    • Produces new artifact lib/native/win32-aarch64.jar
    • Tested build using MSVC v142 - VS 2019 C++ ARM64 build tools (v14.27) on Windows 10 x86_64
    • Tested binary using Surface Pro X and openjdk-aarch64 from Microsoft

Steps

See the updated www/WindowsDevelopmentEnvironment.md


Hidden, outdated
  1. Install and configure ant for your system

  2. Install cygwin per Windows steps

  3. Install Visual Studio 2019 with ARM build tools (other versions should also work).

  4. Disable CRLF globally for git (use with caution; this affects all projects. Alternatively, skip this step and run dos2unix on each failing file)

    git config --global core.autocrlf false
    git config --global core.eol lf
    
    # Fixes error: AC_CONFIG_MACRO_DIRS([m4]) conflicts with ACLOCAL_AMFLAGS=-I m4
    # See also: https://stackoverflow.com/q/47582762
  5. Clone this PR

    git clone -b win-arm64 https://github.com/tresf/jna --depth 10
    cd jna
  6. From a Command Prompt, configure MSVC to build using arm64

    "%ProgramFiles(x86)%\Microsoft Visual Studio\2019\Community\Common7\Tools\VsDevCmd" -arch=arm64 -host_arch=amd64
    
    # or on native aarch64 use "-host_arch=x86"
  7. Append cygwin to your path (append, DO NOT prepend; if you prepend, link.exe will fail!)

    set PATH=%PATH%;C:\cygwin64\bin\
    
    # or on native aarch64, "C:\cygwin\bin"
  8. Fire the build:
    ⚠️ Note: Windows Defender real-time protection makes this very, very slow. Consider disabling or adding an exception to the jna directory.

    ant native -Dos.prefix=win32-aarch64
    
    # or on native aarch64, just type "ant"
  9. Clean between errors:

    ant clean
  10. Optionally, enable debugging

    • Change verbose= to verbose=1 in native/libffi/msvcc.sh
    • Append -DEXTRA_MAKE_OPTS="--debug=v" to ant command

@tresf
Contributor Author

tresf commented Nov 11, 2020

One CI failure seems to be related to what was described as "flaky Travis behavior" on the mailing list, specifically X11Test.java:26: error: cannot find symbol com.sun.jna.StructureFieldOrderInspector;. This code is not touched in this PR.

Another failure appears to be a missing tool, texinfo, which I'll attempt to add as an additional CI dependency.

@matthiasblaesing
Member

I did not yet really check, but I managed to get the build for aarch64 going. I now have three different DLLs for Windows; for two of these the file tool can tell me the architecture, for the third it cannot. This is to be expected, as aarch64 is very new and I assume file has not been updated yet.
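
When file cannot identify a DLL, the target architecture can still be read directly from the PE header. The following is a minimal sketch that relies only on the PE/COFF layout (header offset stored at 0x3C, then the "PE\0\0" signature and a 16-bit Machine field). The class name PeMachine is hypothetical and not part of JNA:

```java
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical helper, not part of JNA: reports the target architecture of a PE file. */
final class PeMachine {
    /** Maps the Machine field of the COFF header to a readable name. */
    static String describe(int machine) {
        switch (machine) {
            case 0x014c: return "x86";
            case 0x8664: return "x86-64";
            case 0xAA64: return "aarch64";
            default:     return String.format("unknown (0x%04x)", machine);
        }
    }

    /** Reads the Machine field from a .dll or .exe file. */
    static int readMachine(String path) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile(path, "r")) {
            // The DOS header stores the offset of the PE header at 0x3C (little-endian).
            f.seek(0x3C);
            long peOffset = Integer.toUnsignedLong(Integer.reverseBytes(f.readInt()));
            f.seek(peOffset);
            // RandomAccessFile reads big-endian, so "PE\0\0" appears as 0x50450000.
            if (f.readInt() != 0x50450000) {
                throw new IOException("Not a PE file: " + path);
            }
            // The 16-bit Machine field follows the signature (little-endian).
            return Short.reverseBytes(f.readShort()) & 0xFFFF;
        }
    }

    public static void main(String[] args) throws IOException {
        for (String path : args) {
            System.out.println(path + ": " + describe(readMachine(path)));
        }
    }
}
```

Pass the paths of the extracted DLLs as arguments, for example the jnidispatch.dll unpacked from each lib/native/win32-*.jar.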

Unittests for x86-64 come back clean, but unittests for x86 result in an error:

    [junit] Testcase: testStdCallCallbackStackAlignment(com.sun.jna.win32.W32StdCallTest):      FAILED
    [junit] stdcall callback did not restore the stack pointer
    [junit] junit.framework.AssertionFailedError: stdcall callback did not restore the stack pointer
    [junit]     at com.sun.jna.win32.W32StdCallTest.testStdCallCallbackStackAlignment(W32StdCallTest.java:180)
    [junit]

In the discussion for #1259, some patches were mentioned that match this: #1259 (comment).

You can run the build + unittests by invoking:

ant

On x86-64 you can switch between 32-bit and 64-bit by switching the JDK used for the build. On aarch64 it should also be possible to run through the build and check the unit tests.

Thank you for the work. It is great, that you took this on.

@tresf
Contributor Author

tresf commented Nov 12, 2020

AppVeyor steps for x86, arm64 added via 0ae7234. x86 is failing at the same spot as identified in #1264 (comment).

Logs: https://ci.appveyor.com/project/tresf/jna/builds/36266553

@matthiasblaesing
Member

For the texinfo problem on Travis, this should fix it: matthiasblaesing@b698710 (tests on travis-ci on amd64 are currently dog-slow, so this is deduced from the successful run on arm64)

@tresf
Contributor Author

tresf commented Nov 12, 2020

Thanks, added.

@tresf
Contributor Author

tresf commented Nov 13, 2020

You can run the build + unittests by invoking:

ant

Done. Please see test results:

image

Detailed logs can be found here: https://gist.githubusercontent.com/tresf/dab4ebf5459f3370a33c0c0c659e5d4e/raw/788512b93e2a58eb7cc18aa171d5da972fdcdbec/win32-aarch64-jna-tests.log

@matthiasblaesing thoughts?

Some updates to the original build instructions:

  • No changes to the build system were needed. With an aarch64 build of Java, ant works just fine.
  • The instructions now cover building on Windows for ARM64 (mostly just switching the toolchain over to x86 and removing the os.prefix).
  • Building on native ARM64 hardware is quite slow (2x - 3x slower than Intel) as it uses the MSVC compiler and cygwin through x86 emulation to cross-compile aarch64 binaries. Added a note about disabling Windows Defender to speed this up a bit.

@matthiasblaesing
Member

For the arguments marshal tests, this should be the relevant commit, that fixed it in the past:

For the test failure in "GetModuleHandleEx(fptr) failed":

If I remember correctly you disabled that code for arm64, didn't you? If so, it needs to be recorded in Platform: com.sun.jna.Platform.HAS_DLL_CALLBACKS. Currently Windows is mapped to true. This needs an additional constraint:

        HAS_DLL_CALLBACKS = osType == WINDOWS;
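
One possible shape for that constraint, shown as a stand-alone sketch rather than the actual Platform code (the committed change may differ). It assumes the architecture string starts with "aarch" on ARM64, and the class and method names are hypothetical:

```java
/** Stand-alone sketch; the real constant lives in com.sun.jna.Platform. */
final class DllCallbackSupport {
    /**
     * DLL callbacks are only considered available on Windows, and not on ARM64.
     * Assumption: an architecture string beginning with "aarch" identifies ARM64.
     */
    static boolean hasDllCallbacks(boolean isWindows, String arch) {
        return isWindows && !arch.startsWith("aarch");
    }

    public static void main(String[] args) {
        String osName = System.getProperty("os.name");
        String arch = System.getProperty("os.arch");
        boolean isWindows = osName.startsWith("Windows");
        System.out.println("DLL callbacks available: " + hasDllCallbacks(isWindows, arch));
    }
}
```

Tests that exercise DLL callbacks could then be skipped whenever this flag is false, instead of failing on win32-aarch64.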

For the crash in LoadLibraryTest a look into: C:\Users\Owner\jna\hs_err_pid18788.log would be interesting.

The error in testLoadFromUnicodePath is more or less expected - it works only on the "right" OS encoding.

@tresf
Contributor Author

tresf commented Nov 15, 2020

The error in testLoadFromUnicodePath is more or less expected - it works only on the "right" OS encoding.

I tried set _JAVA_OPTIONS=-Dfile.encoding=UTF-8 and the unit tests pick it up, but they still fail. I'm sure this is well-known so I won't waste much time on it unless needed by the project.
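
To see which encodings the JVM actually ended up with, and whether a given file name is representable in the default one, a small check like the following can help. It is only a diagnostic sketch: the file name is made up (not the one the JNA test uses), and sun.jnu.encoding is a HotSpot-internal property that may be absent on other JVMs:

```java
import java.nio.charset.Charset;

/** Hypothetical diagnostic, not part of the JNA test suite. */
final class UnicodePathCheck {
    /** True if every character of the name can be represented in the given charset. */
    static boolean representable(String name, Charset charset) {
        return charset.newEncoder().canEncode(name);
    }

    public static void main(String[] args) {
        // Made-up file name containing Cyrillic characters.
        String name = "testlib-\u0444\u043b\u0441\u0432\u0443.dll";
        System.out.println("file.encoding    = " + System.getProperty("file.encoding"));
        System.out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));
        System.out.println("default charset can encode name: "
                + representable(name, Charset.defaultCharset()));
    }
}
```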

marshal tests [...] adaption for ARM64: 5c889be

Perfect, thanks. Patched e3ad29d.

com.sun.jna.Platform.HAS_DLL_CALLBACKS needs an additional constraint:

Thanks, patched via 2ff29eb.

For the crash in LoadLibraryTest a look into: [crash log] would be interesting.

Here's the crash log: https://gist.githubusercontent.com/tresf/caca6d16bfde240c87d17d72194564c2/raw/43d9c8552dc627894b877206e11229ad68bec3f8/hs_err_pid19056.log

If I need to rebuild with debug symbols, I'd be happy to. I'm not very familiar with this process, but I'd be happy to try.

New log here: https://gist.githubusercontent.com/tresf/0f183d54d504134c953a190db46b8ffb/raw/ae7cac174641339017f1058842d25666cbcd8497/win32-aarch64-jna-tests2.log

New report:

image

@tresf
Contributor Author

tresf commented Nov 15, 2020

Update. The unit tests are in better shape after updating to the latest win32-aarch64 snapshot from Microsoft, now called 16-ea, which specifically addresses unrecoverable crashes, quoting:

Contains fix for VEH exception handling, so that HotSpot will resume exception handling instead of bailing out, so that a potential SEH handle of a native library has a chance to catch it. Instead, we install an unhandled exception handle.

... more release notes are available in the releases area. Note, at the time of writing this the newest release is for macOS (notice the .tar.gz extension), so please ignore that build specifically.

New results attached. The only test that seems to be failing now is the unicode one.

Logs: https://gist.github.com/tresf/07c4de8cbbf17d87997da261dae40d7b

Test summary:
image

@tresf
Contributor Author

tresf commented Nov 16, 2020

Note, the following upstream issues/PRs seem to be very similar to the x86 issues we see in testStdCallCallbackStackAlignment. The first (and oldest) is JNA's own, from @twall.

Listed as a code block to avoid a bunch of crosslinking.

https://github.com/libffi/libffi/issues/198 - win32-x86 structure handling broken 
https://github.com/libffi/libffi/issues/215 - win32 x86 stdcall closure: incorrectly restored stack after closure call 
https://github.com/libffi/libffi/pull/514 - Clang on windows uses 4 byte stack alignment, not 16
https://github.com/libffi/libffi/pull/378 - Fix FFI_STDCALL ABI
https://github.com/libffi/libffi/pull/465 - Fix Win32 stdcall closure

I adapted/cherry-picked what I think is the same x86 guard. The diff is below. With this applied, testStdCallCallbackStackAlignment passes. I'm not sure whether an MSVC check is warranted too. This is out of my area of expertise.

diff --git a/lib/native/win32-x86.jar b/lib/native/win32-x86.jar
index 54fab71e8..025ca3e89 100644
Binary files a/lib/native/win32-x86.jar and b/lib/native/win32-x86.jar differ
diff --git a/native/libffi/src/x86/ffi.c b/native/libffi/src/x86/ffi.c
index 5f7fd81d9..98c09ba6a 100644
--- a/native/libffi/src/x86/ffi.c
+++ b/native/libffi/src/x86/ffi.c
@@ -181,6 +181,9 @@ ffi_prep_cif_machdep(ffi_cif *cif)
     {
       ffi_type *t = cif->arg_types[i];

+#if defined(_M_IX86)
+      if (cif->abi != FFI_STDCALL)
+#endif
       bytes = FFI_ALIGN (bytes, t->alignment);
       bytes += FFI_ALIGN (t->size, FFI_SIZEOF_ARG);
     }
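
For context on why the per-argument alignment matters: with stdcall the callee pops its own arguments, so both sides have to agree on the total byte count. The sketch below re-implements only the arithmetic of the loop above in Java. It is not libffi code, and the f(int, double) signature with an 8-byte-aligned double is an assumption picked to make the difference visible:

```java
/** Illustration of the byte-count arithmetic only; names are hypothetical. */
final class StdcallStackBytes {
    static final int SIZEOF_ARG = 4; // stands in for FFI_SIZEOF_ARG on 32-bit x86

    /** Rounds v up to a multiple of a (a must be a power of two), like FFI_ALIGN. */
    static int align(int v, int a) {
        return (v + a - 1) & ~(a - 1);
    }

    /**
     * sizes[i] and alignments[i] describe argument i.
     * alignPerArg = true mirrors the unpatched loop, false mirrors the stdcall guard.
     */
    static int stackBytes(int[] sizes, int[] alignments, boolean alignPerArg) {
        int bytes = 0;
        for (int i = 0; i < sizes.length; i++) {
            if (alignPerArg) {
                bytes = align(bytes, alignments[i]);
            }
            bytes += align(sizes[i], SIZEOF_ARG);
        }
        return bytes;
    }

    public static void main(String[] args) {
        // Assumed signature f(int, double), with double aligned to 8 bytes.
        int[] sizes = {4, 8};
        int[] alignments = {4, 8};
        System.out.println("per-argument alignment: " + stackBytes(sizes, alignments, true));
        System.out.println("stdcall guard applied:  " + stackBytes(sizes, alignments, false));
    }
}
```

With these assumed inputs the unpatched loop yields 16 bytes and the guarded loop yields 12, a 4-byte disagreement of the kind that would leave the stack pointer off after the call returns.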

@tresf
Contributor Author

tresf commented Nov 17, 2020

The above patch seems to work with both mingw and MSVC builds, so I'm going to add it to this PR. Next, I'll rebase and clean up the commits slightly.

@tresf tresf force-pushed the win-arm64 branch 4 times, most recently from 0e2295f to 9f0775b Compare November 17, 2020 19:29
@tresf tresf marked this pull request as ready for review November 17, 2020 19:30
@tresf
Contributor Author

tresf commented Nov 17, 2020

@matthiasblaesing some questions:

  • Would you like me to bump jna.jar with this PR, or would you rather a project maintainer do that?
  • Would you like me to add any documentation to www/WindowsDevelopmentEnvironment.md?
  • I'm not well versed in how the callback code is handled in 004c975. Would you or @twall or another project maintainer be willing to help with the upstream patch? I can propose the change, but I don't feel comfortable enough defending it, as I do not fully understand the code. Reading related commits that mention the same target using different compilers makes me even more uncomfortable (e.g. not limiting the patch to MSVC, etc.).
  • Last, since libffi/libffi has -- and likely will continue to -- suffer from divergence issues, are there any further requests for handling that or would you prefer to tackle that another time?

I'm happy to submit an upstream patch for the msvcc.sh at any time.
Edit: Done via libffi/libffi#596

@matthiasblaesing
Member

Would you like me to bump jna.jar with this PR, or would you rather a project maintainer do that?

I assume, that you mean dist/jna.jar. Then no, that file is updated when a release is done. The release process builds the java part from source and then bundles the binaries for the platforms, forming the final jna.jar, which is then also published to maven central.

Would you like me to add any documentation to www/WindowsDevelopmentEnvironment.md?

I use that file as my base to rebuild the native binaries. If I understood correctly, the build works in cross-platform mode and thus it would be great to add a description. At some point someone needs to rebuild the native parts, and then these notes will be needed. It is best if the binary is rebuilt on the target platform itself, but my experience is that this is not always possible.

I'm not well versed in how the callback code is handled in 004c975. Would you or @twall or another project maintainer be willing to help with the upstream patch? I can propose the change, but I don't feel comfortable enough defending it, as I do not fully understand the code. Reading related commits that mention the same target using different compilers makes me even more uncomfortable.

Sorry, I'm also more at home on the Java side. I can fix obvious problems and spot problems, but here I'm out. It would indeed be great if you could help with upstreaming, @twall. In general the current state looks sane to me and I'm inclined to take it. I did a diff against upstream master and there is some movement in the ARM64/Windows area, so it might be worth updating while it's hot :-).

Last, since libffi/libffi has -- and likely will continue to -- suffer from divergence issues, are there any further requests for handling that or would you prefer to tackle that another time?

If people want to do it in the JNA repository, ok, but then the question needs to be answered how sustainable that is. If we have changes separated out, documented and covered with tests, that are run as part of the build, all is well. If not I think it is better to push changes to go through upstream.

I like the Debian approach, where upstream is augmented with separate patches, which can be reviewed/exchanged individually.


I tested this branch on windows-x86-64, windows-x86 and linux-x86-64. Build runs clean and unittests run clean (that means the unittests that fail on master also fail on this PR, no new failures are reported).

I have a few nitpicks for the commits:

bd672b9: It would be great if the version/commit of libffi could be mentioned that was used to create that update. That might be helpful if future developers try to do an update.

cef2d76 + 9f0775b: Please squash. It should not matter, as the win32-aarch64.jar only contains native code, so it should be safe anyway.

Please update your author information for the commits to contain your full name. The email part already spells it out, but I'd like to keep the author information correct.

Please add an entry to CHANGES.md about the update of libffi and more importantly the added platform support.

@tresf
Contributor Author

tresf commented Nov 21, 2020

It is best if the binary is rebuilt on the target platform itself, but my experience is that this is not always possible.

Since Microsoft doesn't offer a native ARM64 version of Visual Studio, the target platform uses near-identical build tools. The major difference is the x86 versus x86_64 toolchain.

Since..

  1. Microsoft will soon be adding x86_64 emulation support to the ARM64 version of Windows
  2. Visual Studio 2019 finally defaults to a 64-bit toolchain.
  3. Consistent with changes to AppVeyor
  4. The compiler and toolchain would then be identical between native and cross.

... then I'm inclined to keep the tutorial for 64-bit systems. Do you agree? If not, we get into some differing steps between cygwin and cygwin64; I can spell these out if needed.

That said, the binary provided was built on an Intel machine. That's purely for convenience and I'm happy to re-upload using a build created with the 32-bit toolchain on native hardware. I just wanted to clarify that the native build still uses an x86 environment until Microsoft releases a version of Visual Studio for ARM64.

Edit: Done. (Rebuilt on ARM, re-uploaded, squashed).

Please update your author information for the commits, to contain your fullname.

Happy to, but where do you see this missing information?

Edit: Found it using git show <commit_id>. Had to reset and cherry-pick, using git commit --amend --reset-author --no-edit for each one. I think it's fixed.

In regards to libffi/upstream, would JNA consider having its own fork? I can help set up the submodule, but developers would need to do git clone ... --recursive each time, which might be frustrating, but at least there'd be a clear, historical divergence path and the PRs would likely originate from JNA proper, instead of specific developer branches. I've done this quite a bit and am happy to help with this effort, if interested.

Regardless of that decision, in the interim, I'll reword the commit to include the specific versioning information and address the other requests as well.

@tresf
Contributor Author

tresf commented Nov 21, 2020

Initial rewrite of the readme is here: https://gist.github.com/tresf/9182a81d7c41a9425ae8ad8f626773e1

Please let me know if this is in the right direction, or if you'd prefer to keep the <pre> style with one section per-arch. I've deliberately hidden the old instructions as your notes were far easier to follow, copying in the pertinent info where necessary.

@tresf tresf force-pushed the win-arm64 branch 2 times, most recently from d809496 to 8eac2a5 Compare November 22, 2020 01:11
@tresf
Contributor Author

tresf commented Nov 22, 2020

I believe all requests have been addressed, with the exception of an updated Windows procedure. When I have direction on that, I'm happy to commit or squash.

@VISTALL VISTALL mentioned this pull request Dec 12, 2020
@matthiasblaesing
Member

Ok - I think the wait is long enough - maybe it is time to consider just cherry picking the fix into our tree and be done with it. @tresf what do you think?

@tresf
Contributor Author

tresf commented Jan 7, 2021

Since the last commits the following ARM64 changes have landed publicly...

This gives me a unique opportunity to rebuild my system from scratch to match the Windows build tutorial (64-bit toolchain) but this introduces a minor, unforeseen issue with cygwin documented here: 2714a01. I've reported this issue to the cygwin developers.

Sorry for the TL;DR, but at time of writing this, I'm still encountering build failures with this setup. When they're resolved, I'll get back to bumping libffi with the aforementioned (unmerged) patch and move forward.

@tresf
Contributor Author

tresf commented Jan 12, 2021

As it turns out, gcc-g++ is needed when building "natively" on ARM64, as it toggles shared library support as well as offering other tools. This is likely a bug with the build system, but for the sake of timeliness (and lack of knowledge), I'm amending the ARM64 table to include the appropriate gcc-g++ dependency.

The tools and features reported as missing: objdump, dlltool, ar, "@FILE" (feature), strip, ranlib, "parse dumpbin headers from msvcc.sh" (feature), "shared library support" (feature). These are reported as present when gcc-g++ is installed.

The symptom is not obvious at first: the compile begins as expected, but libffi.lib ends up with the wrong name and compilation fails (it's called ffi.lib instead of libffi.lib; the "lib" prefix is missing). I believe this is due to the "shared library support" missing from the compiler.

Anyway, I wanted to document this somewhere, as other projects relying on libffi and msvcc.sh may very well run into this down the road until the system is ported over to something that handles MSVC natively.

Edit: Done via 387920b.

tresf added a commit to tresf/jna that referenced this pull request Jan 12, 2021
@tresf
Contributor Author

tresf commented Jan 12, 2021

Ok - I think the wait is long enough - maybe it is time to consider just cherry picking the fix into our tree and be done with it. @tresf what do you think?

Squashed, fast-forwarded, cherry-picked, ready for final review.

The multiple force-pushes were needed to ensure a clean clone can still build binaries between changes, and then a final force push to bring the resulting .jar back in.

As an aside, I grew tired of the .gitignore rules breaking the native/libffi build system again and again, so I wrote a small batch script to bump libffi automatically, bump_ffi.bat which fixes the .gitignore issues and automates the commit message (e.g. v3.3 + 64 commits)

@matthiasblaesing
Member

I rebuilt all native libraries with the new version, only skipping macOS, as that will be tackled with the already open PR. I also followed the notes about the Windows build instructions and managed to rebuild all three Windows native libraries. While doing this I slightly updated the documentation with my observations of the procedure.

Please have a look at this: https://github.com/matthiasblaesing/jna/tree/pr-1264. If you agree, I plan to merge that.

The native libraries can be found in the lib/native folder in the referenced commit. As an alternative, here is a prebuilt binary:

jna.zip

I plan to run the jna-platform tests with the updated base library, but I have yet to set up the necessary software.

@tresf
Contributor Author

tresf commented Jan 18, 2021

Please have a look at this: https://github.com/matthiasblaesing/jna/tree/pr-1264

I have and although cosmetic changes with the documentation seem sane, the PATH changes don't work on my machines. I've commented directly on the commit in case you'd like to continue discussion there.

@matthiasblaesing
Member

@tresf as discussed on the commit itself, I updated the branch with an updated suggestion for the wording of the Windows instructions. The TL;DR version: I left your paths in place, but added a note that paths need to be updated depending on the exact software installed (I also removed the exact Java version; as long as JDK 8 is used, we are safe).

I ran the platform tests on Windows for 32-bit and 64-bit. These came back clean, which is not surprising given that AppVeyor is happy. In addition to the tests covered by AppVeyor, COM checks are run on my machine (as on any machine that has Office installed).

Please ensure, that the resulting binary can be run on aarch64, as I could only verify, that file reported the library as a PE file with arch aarch64. The binary can be found attached to the previous comment. (BTW: Is there an emulator available for Windows Aarch64?).

If you agree, I'll merge once Appveyor is happy.

@tresf
Contributor Author

tresf commented Jan 18, 2021

(BTW: Is there an emulator available for Windows Aarch64?).

QEMU can run the full blown Windows ARM64 OS, but I've read it's tremendously slow. I also believe Microsoft offers a Windows 10X emulator, but it's been a very long time since I've fired it up and I'm unsure as to if the underlying OS is ARM64 or not. I'll see if I have one around still. It wasn't a very good OS for Desktop tasks last I tried (uses a weird Desktop sandbox) and the OS was too buggy for daily use, but that was about a year ago.

From what I can find most other emulators offered by Microsoft were old ARM32 layers, mostly for the phones. I'll see what I can dig up.

in addition to the tests covered by appveyor on my machine (any machine that has Office installed) COM checks are run.

Would you like me to try the same on ARM64?

@matthiasblaesing
Member

(BTW: Is there an emulator available for Windows Aarch64?).

[QEMU Option]

Hehe, this reminds me of the pain when linux-mips64el and linux-ppc64el were built with full system emulation in QEMU.... Not fun :-)

From what I can find most other emulators offered by Microsoft were old ARM32 layers, mostly for the phones. I'll see what I can dig up.

in addition to the tests covered by appveyor on my machine (any machine that has Office installed) COM checks are run.

Would you like me to try the same on ARM64?

I have some fear here. My gut feeling is that we could hit some hard assumptions that don't hold anymore. Anyway, if you can, it would be interesting; the final report would be great to see.

@tresf
Contributor Author

tresf commented Jan 18, 2021

In regards to Microsoft Office ARM64 integration... Office still seems to use predominantly x86 binaries, so it didn't add any more unit tests to the ARM64 run. file reports PE32 executable (GUI) Intel 80386, for MS Windows (for ARM64 it should be PE32+ executable (DLL) (GUI) Aarch64, for MS Windows). This is using the latest Office 365 from Microsoft's website.

In regards to ARM64 emulation.... Windows 10X was a bust. I'm unsure of the underlying architecture it's using, but what was very clear is that opening a command prompt on this OS isn't possible. The Desktop emulation wouldn't even start the 7zip installer.

Microsoft claims Hyper-V will do architecture emulation when needed and there's evidence of this, such as the work from the Xamarin team to port Android with hardware acceleration over to Hyper-V, but at the time of writing, the .vhdx image from Microsoft isn't bootable on Intel, leaving QEMU as the most documented/viable option.

@matthiasblaesing matthiasblaesing merged commit 95ee1a3 into java-native-access:master Jan 19, 2021
@matthiasblaesing
Member

matthiasblaesing commented Jan 19, 2021

I merged this in now - appveyor is happy, local tests with linux-x64, windows x86 and windows x64 came back clean. Thank you!

@tresf tresf deleted the win-arm64 branch January 19, 2021 20:07
dbwiddis pushed a commit to dbwiddis/jna that referenced this pull request Jan 29, 2021
@tresf
Contributor Author

tresf commented Mar 24, 2021

Just an FYI on fast-forwarding/upstream status...

@tresf tresf mentioned this pull request Mar 29, 2022