Two jdk8 security tests consistently failed on test-osuosl-centos74-ppc64le-1 #2625

Closed
sophia-guo opened this issue Jun 21, 2022 · 26 comments

@sophia-guo

Recent build runs show that the following JDK8 security tests:
sun/tools/jinfo/Basic.sh
sun/security/pkcs11/fips/TestTLS12.java

consistently failed on test-osuosl-centos74-ppc64le-1.

Any other details:
Grinder on test-osuosl-ubuntu1804-ppc64le-1: https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5017/, https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5016

Grinder on test-osuosl-centos74-ppc64le-1: https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5015, https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5014

@Haroon-Khel
Contributor

Rerunning on test-osuosl-centos74-ppc64le-1

sun/security/pkcs11/fips/TestTLS12.java
https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5092/console

sun/tools/jinfo/Basic.sh
https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5094/console

@Haroon-Khel
Contributor

Both tests also fail on test-osuosl-centos74-ppc64le-2

@Haroon-Khel
Contributor

Haroon-Khel commented Jul 12, 2022

For the sun/tools/jinfo/Basic.sh failure, the following line causes the test to fail:
https://github.com/adoptium/jdk8u/blob/a18e9043fa2a0a14098e1ec25d32577aaac6c023/jdk/test/sun/tools/jinfo/Basic.sh#L64

Edit: The test failure is caused by that if statement block (lines 62-75) in the linked file.

Running jinfo with the -sysprops option, the -flags option, or no option at all triggers the error, whereas the -flag option (used by the later tests in the linked file) does not.

@Haroon-Khel
Contributor

Testing this in a command line environment on a failing machine, test-osuosl-centos74-ppc64le-1:

[jenkins@test-centos74-1 bin]$ ./jinfo -J-XX:+UsePerfData -flags +PrintGC 21443
Usage:
    jinfo <option> <pid>
       (to connect to a running process)

where <option> is one of:
    -flag <name>         to print the value of the named VM flag
    -flag [+|-]<name>    to enable or disable the named VM flag
    -flag <name>=<value> to set the named VM flag to the given value
    -h | -help           to print this help message
[jenkins@test-centos74-1 bin]$ ./jinfo -J-XX:+UsePerfData -sysprops 21443
Usage:
    jinfo <option> <pid>
       (to connect to a running process)

where <option> is one of:
    -flag <name>         to print the value of the named VM flag
    -flag [+|-]<name>    to enable or disable the named VM flag
    -flag <name>=<value> to set the named VM flag to the given value
    -h | -help           to print this help message
[jenkins@test-centos74-1 bin]$ ./jinfo -J-XX:+UsePerfData 21443
Usage:
    jinfo <option> <pid>
       (to connect to a running process)

where <option> is one of:
    -flag <name>         to print the value of the named VM flag
    -flag [+|-]<name>    to enable or disable the named VM flag
    -flag <name>=<value> to set the named VM flag to the given value
    -h | -help           to print this help message

Whereas if I use one of the working options:

[jenkins@test-centos74-1 bin]$ ./jinfo -J-XX:+UsePerfData -flag -PrintGC 21443
[jenkins@test-centos74-1 bin]$ echo $?
0

@Haroon-Khel
Contributor

I think I've got it. These failing tests are not supposed to run on this machine. For example, on a passing machine, test-skytap-ubuntu2004-ppc64le-1, the if block that contains these failing tests, https://github.com/adoptium/jdk8u/blob/a18e9043fa2a0a14098e1ec25d32577aaac6c023/jdk/test/sun/tools/jinfo/Basic.sh#L62, is not entered, because the $runSA = true condition is false.

On test-skytap-ubuntu2004-ppc64le-1, the command /sbin/sysctl -n kernel.yama.ptrace_scope, whose output is assigned to ptrace_scope at https://github.com/adoptium/jdk8u/blob/a18e9043fa2a0a14098e1ec25d32577aaac6c023/jdk/test/sun/tools/jinfo/Basic.sh#L54, returns 1, while on a failing machine, test-osuosl-centos74-ppc64le-1, it returns 0.
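
The gating described above can be sketched roughly as follows. This is a simplified paraphrase for illustration, not a copy of Basic.sh; the helper name sa_tests_enabled is hypothetical, and it assumes that only a ptrace_scope value of 1 disables the block, as the observations in this comment suggest.

```shell
#!/bin/sh
# Simplified paraphrase of the gating logic; not a copy of Basic.sh.
# sa_tests_enabled is a hypothetical helper name.
sa_tests_enabled() {
    scope="$1"        # value of kernel.yama.ptrace_scope, or empty if unavailable
    if [ "$scope" = "1" ]; then
        echo false    # non-child ptrace is restricted: the if block is skipped
    else
        echo true     # 0 or unavailable: the if block is entered
    fi
}

# Read the live value where possible; fall back to empty if sysctl fails.
current=$(/sbin/sysctl -n kernel.yama.ptrace_scope 2>/dev/null || true)
echo "ptrace_scope=${current:-unavailable} runSA=$(sa_tests_enabled "$current")"
```

Running this on a passing and a failing machine should make the difference between the two visible at a glance.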

@Haroon-Khel
Contributor

Haroon-Khel commented Jul 12, 2022

Setting kernel.yama.ptrace_scope to 1 has fixed this:

[jenkins@test-centos74-1 bin]$ /sbin/sysctl -n kernel.yama.ptrace_scope
1
[jenkins@test-centos74-1 bin]$ ./java -jar ../../jtreg/lib/jtreg.jar ../../jdk8u/jdk/test/sun/tools/jinfo/Basic.sh
Test results: passed: 1
Report written to /home/jenkins/jdk8u332-b09/bin/JTreport/html/report.html
Results written to /home/jenkins/jdk8u332-b09/bin/JTwork
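
For reference, one way the setting could be applied and persisted is sketched below. The helper name write_ptrace_dropin and the drop-in file name 99-ptrace-scope.conf are assumptions made for illustration; the thread does not say how the value was actually changed. Note that, per the earlier comment, a value of 1 causes the affected block in Basic.sh to be skipped rather than exercised.

```shell
#!/bin/sh
# write_ptrace_dropin is a hypothetical helper; the file name is an assumption.
# It writes a sysctl drop-in file into the given directory (default /etc/sysctl.d)
# and prints the path of the file it wrote.
write_ptrace_dropin() {
    dir="${1:-/etc/sysctl.d}"
    file="$dir/99-ptrace-scope.conf"
    printf 'kernel.yama.ptrace_scope = 1\n' > "$file" && echo "$file"
}

# Possible use on a test machine, as root (left commented so the sketch has no side effects):
#   sysctl -w kernel.yama.ptrace_scope=1    # change the running kernel
#   write_ptrace_dropin                     # persist the value across reboots
#   sysctl --system                         # reload all sysctl configuration files
```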

@Haroon-Khel
Contributor

Ref sun/security/pkcs11/fips/TestTLS12.java

On a passing machine, test-skytap-ubuntu2004-ppc64le-1, this test is skipped with the message:
Test skipped: TLS 1.2 mechanisms not supported by current SunPKCS11 back-end

Need to find out why it isn't skipped on the failing machine, or whether it should be skipped.

@Haroon-Khel
Contributor

On a passing machine, the value of sunPKCS11NSSProvider is null, which causes the test to skip, while on our 2 failing machines this value is SunPKCS11-NSSKeyStore version 1.8.

@Haroon-Khel
Contributor

On the passing machines listed in #2625 (comment), the test is skipped.

@Haroon-Khel
Contributor

Haroon-Khel commented Jul 21, 2022

The test skips on the passing machine because https://github.com/adoptium/jdk8u/blob/3dca446d440e55cbb7dc3555392f4520ec9ff3bc/jdk/test/sun/security/pkcs11/SecmodTest.java#L45 returns false when it cannot find libsoftokn3.so (https://github.com/adoptium/jdk8u/blob/3dca446d440e55cbb7dc3555392f4520ec9ff3bc/jdk/test/sun/security/pkcs11/PKCS11Test.java#L95). The failing CentOS machines have this library installed, so the test runs on them and subsequently fails.
However, on test-skytap-ubuntu2004-ppc64le-1 I installed nsssoftoken, which did provide the missing library in /usr/lib, yet the test still could not find the library and was skipped.
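
A quick way to compare machines is to check where, if anywhere, libsoftokn3.so is present. The sketch below is only a diagnostic aid: find_softokn is a hypothetical helper, and its default directory list is an assumption for illustration, not the list that PKCS11Test.java searches. A hit here therefore does not imply that the test would find the library.

```shell
#!/bin/sh
# find_softokn is a hypothetical helper. The default directory list is an
# assumption for illustration, not the list used by PKCS11Test.java.
find_softokn() {
    [ "$#" -gt 0 ] || set -- /usr/lib64 /usr/lib /usr/lib/nss
    for dir in "$@"; do
        if [ -f "$dir/libsoftokn3.so" ]; then
            echo "$dir/libsoftokn3.so"   # print the first match and stop
            return 0
        fi
    done
    return 1
}

if path=$(find_softokn); then
    echo "found: $path"
else
    echo "libsoftokn3.so not found in the searched directories"
fi
```

Passing explicit directories, for example find_softokn /usr/lib /usr/lib64, restricts the search to those locations.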

@sxa sxa added this to the 2022-10 (October) milestone Oct 5, 2022
@sxa
Member

sxa commented Oct 5, 2022

Bumping this to October milestone as it's a dependency of #2662 which is in there too

@sophia-guo sophia-guo assigned sophia-guo and unassigned sophia-guo Apr 17, 2023
@sxa sxa modified the milestones: 2023-04 (April), 2023-06 (June) Jun 6, 2023
@steelhead31 steelhead31 self-assigned this Jun 27, 2023
@steelhead31 steelhead31 moved this from Todo to In Progress in Adoptium 2Q 2023 Plan Jun 27, 2023
@steelhead31
Contributor

I believe the root cause of this is that the version of SunPKCS11 being used on CentOS 7.4 does not support TLS 1.2, and as such the test should be skipped, as it is on Ubuntu. I'm not sure what a fix might be, but I'll keep digging.

@steelhead31
Contributor

To get the test to pass on CentOS 7.4, this parameter is required: -Djdk.tls.ephemeralDHKeySize=2048
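
When running the test directly with jtreg, one way the parameter might be supplied is through jtreg's -vmoption: flag. The sketch below only assembles the argument string. The helper name jtreg_tls12_args is hypothetical, the paths reuse the relative layout from the earlier jtreg run in this thread, and how the option was actually applied in Grinder is not stated here.

```shell
#!/bin/sh
# jtreg_tls12_args is a hypothetical helper. It prints the java arguments for a
# jtreg run of TestTLS12.java with the DH key size property passed to the test VM.
# Paths containing spaces are not handled; this is a sketch only.
jtreg_tls12_args() {
    jtreg_jar="$1"    # path to jtreg.jar
    test_root="$2"    # path to the jdk/test directory of the jdk8u checkout
    echo "-jar $jtreg_jar -vmoption:-Djdk.tls.ephemeralDHKeySize=2048 $test_root/sun/security/pkcs11/fips/TestTLS12.java"
}

# Print the arguments using the relative layout seen earlier in this thread.
jtreg_tls12_args ../../jtreg/lib/jtreg.jar ../../jdk8u/jdk/test

# Possible use from the JDK's bin directory:
#   ./java $(jtreg_tls12_args ../../jtreg/lib/jtreg.jar ../../jdk8u/jdk/test)
```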

@sxa
Member

sxa commented Jun 29, 2023

Is this specific to ppc64le? That would seem quite strange if true. Also does the same fix allow the test to pass on Ubuntu systems?

@steelhead31
Contributor

The fix is specific to CentOS; on Ubuntu & Fedora, the test gets skipped.

I've had this feedback from the OpenJDK team:

I think the permanent fix should go on the RHEL 9 build of OpenJDK, where
we need to align with RH1974274 "java.txt file contains both security
properties and system properties", now that System properties are in
the /etc/crypto-policies/back-ends/javasystem.config file.

It seems Andrew Hughes is aware of this, by the conversations from the
original RH1883312 ticket and the fedora-crypto-policies MR #104.

However, this is likely to have low priority in a tight backlog. Also,
we need to re-think this in the context of the removal/reduction of
downstream RHEL patches in OpenJDK packages with the goal shipping
near-binary-identical builds to Temurin ones (OPENJDK-1199).

RH1974274: https://bugzilla.redhat.com/show_bug.cgi?id=1974274
RH1883312: https://bugzilla.redhat.com/show_bug.cgi?id=1883312
MR #104: https://gitlab.com/redhat-crypto/fedora-crypto-policies/-/merge_requests/104
OPENJDK-1199: https://issues.redhat.com/browse/OPENJDK-1199

@sxa
Member

sxa commented Jun 29, 2023

The fix is specific to CentOS; on Ubuntu & Fedora, the test gets skipped.

Is that in aqa-tests or is it skipped in the openjdk codebase on those distros? Scratch that - we can't restrict by distro yet in aqa-tests AFAIK. Still a bit unsure why we're only hitting it on ppc64le (I think we have comparable distributions on other architectures...)

@steelhead31
Contributor

Yes. Given that it's skipped on Ubuntu & Fedora in aqa-tests, is excluding the test on Linux/ppc64le an appropriate solution?

@smlambert
Contributor

I think it is appropriate to exclude it in ProblemList_openjdk8.txt with reference to https://bugs.openjdk.org/browse/JDK-8029661 (or whichever bug notes that there is not an intention to fix in JDK8).

If only seen on ppc64le, is it because our CentOS nodes for x64 Linux are a different version of CentOS or have a different configuration?
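
To check whether a test already has an entry in a problem list, something like the sketch below could be used. It assumes the usual problem-list layout of test path, bug reference, and platform list on one line. The helper name is_excluded is hypothetical, and the generic-all platform value in the demo entry is an assumption, not the value used in the actual exclusion.

```shell
#!/bin/sh
# is_excluded is a hypothetical helper. It assumes each entry has the layout:
#   <test path> <bug reference> <platform list>
# It succeeds (exit 0) when the first field of some line equals the test path.
is_excluded() {
    test_path="$1"
    list_file="$2"
    awk -v t="$test_path" '$1 == t { found = 1 } END { exit !found }' "$list_file"
}

# Demo against a throwaway file; the platform column here is an assumption.
demo=$(mktemp)
printf '%s\n' \
    '# example problem list' \
    'sun/security/pkcs11/fips/TestTLS12.java https://bugs.openjdk.org/browse/JDK-8029661 generic-all' \
    > "$demo"

if is_excluded sun/security/pkcs11/fips/TestTLS12.java "$demo"; then
    echo "TestTLS12.java is excluded"
fi
rm -f "$demo"
```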

@steelhead31
Contributor

After further discussion with @smlambert, we concluded that adding the required option would apply it too liberally, to a number of other tests. The decision has been made to exclude this test instead, so as to have the minimum impact, as this is deemed a low priority to fix upstream.

@sxa
Member

sxa commented Jun 29, 2023

If only seen on ppc64le, is it because our CentOS nodes for x64 Linux are a different version of CentOS or have a different configuration?

I've been wondering this too, although ppc64le is the only platform where we have active CentOS7 test systems as far as I can tell (https://ci.adoptium.net/manage/computer/test%2Daws%2Drhel76%2Darmv8%2D1/ exists but is currently not labelled for test - perhaps we could force a Grinder to it anyway to verify?)

We have a few CentOS/RHEL8 systems (e.g. https://ci.adoptium.net/manage/computer/test%2Ddocker%2Dcentos8%2Darmv8%2D1/) if we wanted to compare 7 vs 8 directly.

@steelhead31
Contributor

Tested on CentOS 8 (x64 & Arm64) without error; I'm just looking into whether it skips.

@sxa
Member

sxa commented Jun 29, 2023

Suggest doing a test on the AWS aarch64 RHEL7 one too if possible.

@steelhead31
Contributor

@sxa It has run OK on the AWS RHEL 7.6 machine; however, I've managed to replicate the failure (and the fix) in Grinder on the AWS RHEL 8 (x64) machine, in Grinders 7316 & 7317.

@steelhead31
Contributor

In light of the upstream bug, this test has now been excluded in aqa-tests PR adoptium/aqa-tests#4652.

Excluding this test from JDK8 (see https://bugs.openjdk.org/browse/JDK-8029661).
