Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jdk_lang_native_win_0 FAILED in Test_openjdk11_hs_sanity.openjdk_x86-32_windows #2029

Closed
smlambert opened this issue Oct 23, 2020 · 13 comments
Closed
Assignees
Labels
triage required Issue needs deeper triage to determine which repo to move issue into
Milestone

Comments

@smlambert
Copy link
Contributor

Test Info
Test Name: jdk_lang_native_win_0
Test Duration: 23 sec
Machine: test-aws-win2019-x64-2
TRSS link for the test output: https://trss.adoptopenjdk.net/output/test?id=5f92d7e511d26031c130be5a

Build Info
Build Name: Test_openjdk11_hs_sanity.openjdk_x86-32_windows
Jenkins Build start time: Oct 23 2020, 08:08 am
Jenkins Build URL: https://ci.adoptopenjdk.net/job/Test_openjdk11_hs_sanity.openjdk_x86-32_windows/363/
TRSS link for the build: https://trss.adoptopenjdk.net/allTestsInfo?buildId=5f92d75f11d26031c130bbf0

Java Version
openjdk version "11.0.9" 2020-10-20
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.9+11)
OpenJDK Client VM AdoptOpenJDK (build 11.0.9+11, mixed mode)

This test has been failed 14 times since Oct 10 2020, 08:09 pm
Java Version when the issue first seen
openjdk version "11.0.9" 2020-10-20
OpenJDK Runtime Environment Hotspot (build 11.0.9+10-202010102343)
OpenJDK Client VM Hotspot (build 11.0.9+10-202010102343, mixed mode)
Jenkins Build URL: https://ci.adoptopenjdk.net/job/Test_openjdk11_hs_sanity.openjdk_x86-32_windows/350/

The test failed on machine test-aws-win2019-x64-2 3 times
The test failed on machine test-azure-win2012r2-x64-1 3 times
The test failed on machine test-aws-win2019-x64-1 3 times
The test failed on machine test-ibmcloud-win2012r2-x64-1 1 times
The test failed on machine test-godaddy-win2016-x64-1 2 times
The test failed on machine test-godaddy-win2016-x64-4 1 times
The test failed on machine test-godaddy-win2016-x64-2 1 times

Rerun in Grinder

@smlambert smlambert added the triage required Issue needs deeper triage to determine which repo to move issue into label Oct 23, 2020
@smlambert
Copy link
Contributor Author

The failing test case within jdk_lang_native_win is java/lang/ProcessBuilder/checkHandles/CheckHandles.java

java.lang.UnsatisfiedLinkError: 'long CheckHandles.getProcessHandleCount()'
at CheckHandles.getProcessHandleCount(Native Method)
at CheckHandles.main(CheckHandles.java:64)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
at java.base/java.lang.Thread.run(Thread.java:834)

Appears to have been failing for at least 75 test runs / days.

This testcase was not part of sanity.openjdk (tier1 tests) looking at a run from July 30th where it is not present, though it appears to have been added to openjdk repo in March.

Unclear if it has ever run and passed on win32.

@adamfarley
Copy link
Contributor

This issue appears to be a duplicate of #1773.

Also, #1773 created PR #1774 to exclude this test, and the test still appears to be excluded (twice, oddly enough), so I'm not sure why we're even running it.

java/lang/ProcessBuilder/checkHandles/CheckHandles.java https://github.com/AdoptOpenJDK/openjdk-tests/issues/1920 windows-all
java/lang/ProcessBuilder/checkHandles/CheckHandles.java https://github.com/AdoptOpenJDK/openjdk-tests/issues/1773 windows-x86

https://github.com/AdoptOpenJDK/openjdk-tests/blob/master/openjdk/ProblemList_openjdk11.txt

@adamfarley
Copy link
Contributor

adamfarley commented Jan 27, 2021

The exclusion-list failure looks to be explained in #1920

That PR seems to say that tests excluded from other targets (jdk_lang and jdk_net in this case), are not excluded from native target runs. So that explains the exclusion oddity.

@adamfarley
Copy link
Contributor

adamfarley commented Jan 27, 2021

Heya @sophia-guo

It looks like you did some work with CheckHandles.java back in July 2020. Link here.

Can you recall if this test ever passed on 32bit Windows?

@adamfarley
Copy link
Contributor

Also, here's a timeline of what I think has happened:

  • March 2020: Test introduced.
  • May 2020: Morgan sees a link issue in CheckHandles.java, as it's being run in a non-native jdk_lang_0 target. Link
  • July 2020: Sophia spots the underlying issue and specifies CheckHandles.java as something that should be run in a seperate, native target. Link.
  • July 2020: Sophia moves CheckHandles.java into a native target, and prevents it running anywhere else. Link.
  • October 2020: Shelley spots that the native target run of CheckHandles.java continues to see the link issue. Raises this issue.
  • October 2020: Shelley points this problem out during the October release. Link.
  • January 2021: Adam starts doing triage again, and sees that the issue is still happening.

@sophia-guo
Copy link
Contributor

#1920 set jdk_lang_native_win as a separate testcase with only one test java/lang/ProcessBuilder/checkHandles/CheckHandles.java to fix the native path issue. It doesn't use exclude problem list story. That's why test is still running on win32 with excluded in problemlist.

I didn't remember any issue with win32 as the problem hit by me was on github runner with windows64.

Looks like the win 32 issue has nothing to do with the native path issue as the failure is same before and after native path is added.

@adamfarley
Copy link
Contributor

@smlambert - Could you please launch a grinder test at Eclipse OpenJ9's CI, so I can compare openj9Settings.mk behaviour in their passing test compared to the output in our failing test?

If this could solve our problem, I first need to know why it's not preventing the failure for Adopt when it seems to be working at Eclipse/openj9.

Parameters:

  • ADOPTOPENJDK_REPO: https://github.com/adamfarley/openjdk-tests.git
  • ADOPTOPENJDK_BRANCH: openj9_settings_debug
  • PLATFORM: x86-64_mac
  • BUILD_LIST: functional
  • TARGET: cmdLineTest_sigabrtHandlingTest_0
  • SDK_RESOURCE: nightly
  • JDK_VERSION: 8
  • JDK_IMPL: openj9

Additionally, here's the output from that parameter set when run on Adopt's CI.

18:01:32  JCL_VERSION is set to 
18:01:32  JAVA_SHARED_LIBRARIES_DIR is set to /Users/jenkins/workspace/Grinder/openjdkbinary/j2sdk-image/Contents/Home/bin/..//../native-test-libs/
18:01:32  VM_SUBDIR is set to compressedrefs
18:01:32  ADD_JVM_LIB_DIR_TO_LIBPATH is set to export DYLD_LIBRARY_PATH="/Users/jenkins/workspace/Grinder/openjdkbinary/j2sdk-image/Contents/Home/bin/..//../native-test-libs/:

@smlambert
Copy link
Contributor Author

Grinder started: https://ci.eclipse.org/openj9/job/Grinder/1482

@adamfarley
Copy link
Contributor

My mistake: the grinder I asked for above was meant for #2178. Will copy the details over.

@karianna karianna added this to the February 2021 milestone Feb 5, 2021
@smlambert smlambert removed this from the February 2021 milestone Mar 1, 2021
@adamfarley
Copy link
Contributor

adamfarley commented Mar 1, 2021

FYI: Currently poking about locally, to figure out what's happening here. Will post information when I have any.

Update: Latest build runs to create natives that contain debug messages, and that have the WIN32 code removed from the libFileUtils.c file, to prove it's only being included on 32bit windows (which is what the ifdef _WIN32 seems to imply).

https://ci.adoptopenjdk.net/job/build-scripts/job/jobs/job/jdk16/job/jdk16-windows-x86-32-hotspot/60/console
https://ci.adoptopenjdk.net/job/build-scripts/job/jobs/job/jdk16/job/jdk16-windows-x64-hotspot/61/console

Also, here are two builds with the win32 check removed, but the code isn't. In case WIN32 isn't being set properly.

https://ci.adoptopenjdk.net/job/build-scripts/job/jobs/job/jdk16/job/jdk16-windows-x86-32-hotspot/62/
https://ci.adoptopenjdk.net/job/build-scripts/job/jobs/job/jdk16/job/jdk16-windows-x64-hotspot/62/

Results

Platform Normal (with WIN32 check) No libFileUtils.c contents Always libFileUtils.c contents (no WIN32 check)
Windows 32bit Fail with ULE Fail with ULE Fail with ULE
Windows 64bit Pass Fail with ULE Pass

Current working theory: WIN32 is supposed to be set for 32 and 64 bit windows, but is not set on 32bit, at least as far as this particular C file is concerned. We will know for sure (ish) once the last pair of windows builds is built.

Edit 2: Theory is wrong. Even after removing the WIN32 check, we can't find the native implementation of that method on 32bit. We can find the file, for sure, and we are loading it. Removing the file causes a different problem, and we have debug output that confirms that FileUtils.dll is completely loaded prior to us trying to use the native function inside of it. Which we then can't find.

@adamfarley
Copy link
Contributor

adamfarley commented Mar 9, 2021

Using the Visual Studio "dumpbin" tool to list the exports for the 64bit and 32bit dll files.

64bit:

    ordinal hint RVA      name
          1    0 00001000 Java_jdk_test_lib_util_FileUtils_getWinProcessHandleCount

32bit:

    ordinal hint RVA      name
          1    0 00001000 _Java_jdk_test_lib_util_FileUtils_getWinProcessHandleCount@4

The difference seems to be that the exported function in the 32bit dll has an underscore at the start, and an "@4" at the end.

Currently Googling to see what that means, and whether it's responsible for us being unable to find the exported function.

EDIT: Looks like it could be this. Investigating.

https://stackoverflow.com/questions/5156055/jna-the-specified-procedure-could-not-be-found

@adamfarley
Copy link
Contributor

adamfarley commented Mar 16, 2021

Back to basics.

This test was created for JDK15, and back ported as far as jdk11. There is no JDK8u version because the issue that spawned this test was confirmed not to occur on JDK8u or 9.

Since the last version of OpenJDK that was supported on Windows 32bit is JDK8u, this seems a solid argument for excluding the test. This test was never intended to run on JDK8u, and the issue it is checking for was confirmed to never occur on 8u.

If anyone wants to investigate this in the future, I think this test isn't working because the exported method in the dll file undergoes name mangling (See above comment, and also this wiki page on name mangling). So while the dll file itself can be found, the exported method is not.

Since it seems that openjdk's copy of JNA (e.g. Kernal32.java) doesn't include StdCallLibrary, my theory is that we're not correctly handling the name mangling, and that is causing this problem.

Proposed long-term solution: exclude the test on 32-bit windows, as it was never intended to run there, and may never have passed there (see timeline).

PR: #2388

@adamfarley
Copy link
Contributor

Merged. Closing this issue.

@karianna karianna added this to the March 2021 milestone Mar 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
triage required Issue needs deeper triage to determine which repo to move issue into
Projects
None yet
Development

No branches or pull requests

4 participants