Many network tests failing on AIX in JDK20 (with UnknownHostException: Hostname and service name not provided) #3178
Comments
Transferring this issue to infrastructure repository. It continues to be an issue seen on certain machines, including test-osuosl-aix72-ppc64-5, as seen in https://ci.adoptium.net/job/Test_openjdk21_hs_extended.openjdk_ppc64_aix_testList_2/7/
|
Related: #3030 |
All hosts, or some? As most systems are a clone of I can look for differences, but comparing a known working server with a known failing server is the preferred starting point. |
I believe the issue is observed on test-osuosl-aix72-ppc64-4, test-osuosl-aix72-ppc64-5, test-osuosl-aix72-ppc64-6. Though I have not exhaustively looked at all hosts, anecdotally -1, -2, -3 look like they do not have this issue. |
To reproduce this issue, use this Rerun in Grinder link and set LABEL to be the hostname to run on, for example set LABEL to test-osuosl-aix72-ppc64-6. |
It looks like -3 and -4 were removed from the ansible inventory file back in May, so perhaps these machines can be ignored if they are pending deletion. I mention the pending deletion of these machines here. |
Here are the current records:
All systems are built and active. After the cloning, the playbook was, afaik, rerun over all the aix72 systems, and obviously the aix73 system was built/configured using the playbook. Any differences are because someone has made changes manually. There is no control or change history for manual changes. |
There are no aix71 systems remaining; those are the systems that were removed in May. |
Oh, ok. Odd that the removed machines had "aix72" in their names. |
Ok, got my facts straight now. test-osuosl-aix72-ppc64-3 and -4 were removed from the inventory file in May, but they were later replaced by other machines that now use the same names as the ones that were removed. So the current test-osuosl-aix72-ppc64-3 and -4 are not pending deletion from Jenkins. |
Correct on both points.
|
Update: This issue still appears to occur. Example.
Seen on test-osuosl-aix72-ppc64-3. |
NOTE:
A number of the errors seem likely to be caused by the hostname not being resolvable, i.e. |
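For anyone wanting to check a machine by hand, here is a minimal sketch (not taken from the failing tests) that asks the JDK to resolve the machine's own hostname. `InetAddress.getLocalHost()` throws `UnknownHostException` when the local name cannot be resolved; whether that is the exact call the failing tests make is an assumption on my part. The class name `HostnameCheck` and the helper `resolves` are hypothetical.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of the test suite.
class HostnameCheck {

    /** Returns true if the given name (or literal IP) resolves to at least one address. */
    static boolean resolves(String host) {
        try {
            return InetAddress.getAllByName(host).length > 0;
        } catch (UnknownHostException e) {
            System.err.println("Cannot resolve " + host + ": " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        try {
            // Resolves the machine's own hostname, the lookup suspected to be failing.
            InetAddress local = InetAddress.getLocalHost();
            System.out.println("Local host resolves to: " + local);
        } catch (UnknownHostException e) {
            System.out.println("Local hostname is not resolvable: " + e.getMessage());
        }
    }
}
```

Running `java HostnameCheck.java` on a known working machine and a known failing one should make the difference visible without going through a full Grinder run.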
From Deep History: Machines with the Hostname and service name not provided issue: Machines that do not have that issue: |
Looking at one of them -
|
Noting that the /etc/hosts will get replaced by the regular refreshes from AWX. I've kicked off the following after reinstating the originally deployed line in
New test jobs (If Full sanity=N it means I'm just running the same three targets as earlier):
[1] - Network tests were all good, so the problem in this issue is resolved; however, these two runs had a failure in java/lang/Thread:
|
Noting that the AWX deploy is overwriting my new line, so for now I've added |
Running 100 iterations with the failing thread test (java/lang/ThreadLocal/TestThreadId.java.TestThreadId) on two machines:
And ten instances of
This suggests it's a load issue of some sort when running the whole suite, although the tests are using concurrency:1. This probably needs to be a separate issue, as the original problem described in this issue is now resolved (although it needs an improved playbook fix since I've manually patched /etc/hosts). |
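Since the manual /etc/hosts patch can be overwritten by later deployments, a quick check that the entry is still present may be useful. The following is a rough sketch only, assuming the usual hosts-file layout (an address followed by one or more names, with `#` starting a comment). The class `HostsFileCheck` and the method `lookup` are hypothetical names, and the hostname to look for has to be passed on the command line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of the playbooks.
class HostsFileCheck {

    /** Returns the address that hosts-file lines map to the given name, if any. */
    static Optional<String> lookup(List<String> lines, String host) {
        for (String raw : lines) {
            // Strip trailing comments, then surrounding whitespace.
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            if (line.isEmpty()) {
                continue;
            }
            // First field is the address; remaining fields are names and aliases.
            String[] fields = line.split("\\s+");
            for (int i = 1; i < fields.length; i++) {
                if (fields[i].equalsIgnoreCase(host)) {
                    return Optional.of(fields[0]);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java HostsFileCheck.java <hostname>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get("/etc/hosts"));
        Optional<String> address = lookup(lines, args[0]);
        System.out.println(address.isPresent()
                ? args[0] + " -> " + address.get()
                : args[0] + " has no entry in /etc/hosts");
    }
}
```

This only inspects the file; it does not tell you what the system resolver will actually return, so it complements rather than replaces a real lookup.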
Intermittent |
As a follow-up to the proposals above, noting that for other operating systems:
On this basis, making a similar change to the UNIX playbook on the AIX machines is likely the preferred option here. However, given the proximity to the January release cycle, I suggest we pause anything more for now (although comments/discussion on the options are still welcome). We could also use |
Closing as 3344 has been split out to cover a permanent solution going forward, so we don't need to block the |
sun/security/krb5/auto/NoAddresses.java appears to fail across all available AIX machines:
Test Info
Test Name: jdk_security4_1
Test Duration: 13 min 8 sec
Machine: test-osuosl-aix72-ppc64-6
TRSS link for the test output: https://trss.adoptium.net/output/test?id=64b3f9a817052c671580a5fe
Build Info
Build Name: Test_openjdk20_hs_sanity.openjdk_ppc64_aix
Jenkins Build start time: Jul 15 2023, 09:08 pm
Jenkins Build URL: https://ci.adoptium.net/job/Test_openjdk20_hs_sanity.openjdk_ppc64_aix/110/
TRSS link for the build: https://trss.adoptium.net/allTestsInfo?buildId=64b3f8a617052c671580a04d
Java Version
openjdk version "20.0.1-beta" 2023-04-18
OpenJDK Runtime Environment Temurin-20.0.1+9-202307152344 (build 20.0.1-beta+9-202307152344)
OpenJDK 64-Bit Server VM Temurin-20.0.1+9-202307152344 (build 20.0.1-beta+9-202307152344, mixed mode)
This test has failed 17 times since Jun 10 2023, 08:34 pm
Java Version when the issue first seen
openjdk version "20.0.1" 2023-04-18
OpenJDK Runtime Environment Temurin-20.0.1+9 (build 20.0.1+9)
OpenJDK 64-Bit Server VM Temurin-20.0.1+9 (build 20.0.1+9, mixed mode)
Jenkins Build URL: https://ci.adoptium.net/job/Test_openjdk20_hs_sanity.openjdk_ppc64_aix/94/
The test failed on machine test-osuosl-aix72-ppc64-6 2 times
The test failed on machine test-osuosl-aix72-ppc64-3 3 times
The test failed on machine test-osuosl-aix72-ppc64-5 4 times
The test failed on machine test-osuosl-aix72-ppc64-4 2 times
The test failed on machine test-osuosl-aix72-ppc64-1 1 times
The test failed on machine test-osuosl-aix72-ppc64-2 4 times
The test failed on machine build-osuosl-aix72-ppc64-2 1 times
Rerun in Grinder
Other network related targets also failing on AIX with same issue, examples:
jdk_nio from https://ci.adoptium.net/job/Test_openjdk20_hs_extended.openjdk_ppc64_aix_testList_2/6/