ansible: add RHEL 8 (initially s390x) #2859

Merged
merged 1 commit into from
Mar 1, 2022

@richardlau richardlau commented Jan 28, 2022

Extend Ansible and Jenkins scripts for Red Hat Enterprise Linux 8.

TODO:

  • ccache. We have scripts to build from source, but I'm waiting for a RHEL subscription (application in progress) which should let me install it as a package from EPEL.
  • python. It looks like V8 builds search for a python (unversioned filename) binary. Need to check whether we can symlink it to python3 or whether we'll actually need to install Python 2 😞.
  • Resolve EHOSTUNREACH errors. ansible: add RHEL 8 (initially s390x) #2859 (comment)
  • libuv CI passes
  • Node.js CI passes
  • V8 CI passes
  • Test release machine set up
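
For the python item above, a minimal sketch of how one might check which interpreter names resolve on a machine's PATH before deciding between a symlink and a Python 2 install. The helper names `find_python_candidates` and `has_unversioned_python` are hypothetical and not part of this PR; only the standard-library `shutil.which` is used.

```python
import shutil


def find_python_candidates(names=("python", "python2", "python3")):
    """Return {name: absolute path or None} for each interpreter name on PATH."""
    return {name: shutil.which(name) for name in names}


def has_unversioned_python(candidates):
    """True if a plain `python` binary is resolvable on PATH."""
    return candidates.get("python") is not None


if __name__ == "__main__":
    found = find_python_candidates()
    for name, path in found.items():
        print(f"{name}: {path or 'not found'}")
    if not has_unversioned_python(found):
        print("no unversioned `python` on PATH")
```

This only reports what is installed; whether V8's build scripts accept a `python` that points at Python 3 still has to be checked by running the build.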

@richardlau
Member Author

Test build shows EHOSTUNREACH errors.
e.g.
https://ci.nodejs.org/job/richardlau-node-test-commit-linuxone/nodes=rhel8-s390x/3/console

19:12:33 not ok 1110 parallel/test-http-localaddress
19:12:33   ---
19:12:33   duration_ms: 1.268
19:12:33   severity: fail
19:12:33   exitcode: 1
19:12:33   stack: |-
19:12:33     (node:233962) internal/test/binding: These APIs are for internal testing only. Do not use them.
19:12:33     (Use `node --trace-warnings ...` to show where the warning was created)
19:12:33     node:events:500
19:12:33           throw er; // Unhandled 'error' event
19:12:33           ^
19:12:33     
19:12:33     Error: connect EHOSTUNREACH 127.0.0.1:37527
19:12:33         at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1157:16)
19:12:33     Emitted 'error' event on ClientRequest instance at:
19:12:33         at Socket.socketErrorListener (node:_http_client:442:9)
19:12:33         at Socket.emit (node:events:522:28)
19:12:33         at emitErrorNT (node:internal/streams/destroy:164:8)
19:12:33         at emitErrorCloseNT (node:internal/streams/destroy:129:3)
19:12:33         at processTicksAndRejections (node:internal/process/task_queues:83:21) {
19:12:33       errno: -113,
19:12:33       code: 'EHOSTUNREACH',
19:12:33       syscall: 'connect',
19:12:33       address: '127.0.0.1',
19:12:33       port: 37527
19:12:33     }
19:12:33     
19:12:33     Node.js v18.0.0-pre
19:12:33   ...
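
The failure above is a plain TCP connect to 127.0.0.1, and the `errno: -113` in the log is the negated Linux errno for EHOSTUNREACH. A rough sketch, assuming a Linux host, of how the same condition could be probed outside the Node.js test suite; `describe_errno` and `probe_loopback` are hypothetical helper names, not anything from the PR.

```python
import errno
import socket


def describe_errno(code):
    """Map a (possibly negated) errno value to its symbolic name."""
    return errno.errorcode.get(abs(code), "UNKNOWN")


def probe_loopback(timeout=2.0):
    """Connect to a throwaway listener on 127.0.0.1.

    Returns None on success, or the symbolic errno name on failure.
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
            server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
            server.listen(1)
            port = server.getsockname()[1]
            with socket.create_connection(("127.0.0.1", port), timeout=timeout):
                return None
    except OSError as exc:
        # exc.errno can be None (e.g. on a timeout); treat that as unknown.
        return describe_errno(exc.errno or 0)


if __name__ == "__main__":
    result = probe_loopback()
    print("loopback OK" if result is None else f"loopback connect failed: {result}")
```

If the probe reports EHOSTUNREACH for loopback traffic, the host firewall rules are a likely place to look.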

@richardlau
Member Author

richardlau commented Jan 28, 2022

I think the EHOSTUNREACH errors require iptables-services and the additional rules that we enabled for RHEL 7. Applied those and re-testing: https://ci.nodejs.org/job/richardlau-node-test-commit-linuxone/4/nodes=rhel8-s390x/console

@richardlau
Member Author

> I think the EHOSTUNREACH errors require iptables-services and the additional rules that we enabled for RHEL 7. Applied those and re-testing: https://ci.nodejs.org/job/richardlau-node-test-commit-linuxone/4/nodes=rhel8-s390x/console

Passed 🎉. Will tackle the V8 build next and also check libuv builds.

@richardlau
Member Author

V8 builds still require Python 2 😞. I've added that to the instance I'm setting up and the V8 build starts, but the Jenkins agent is being disconnected/killed during the build. I think this is a memory issue: it looks like the self-provisioned instances have half the memory (4 GB) of the existing RHEL 7 instances (8 GB). I'm looking at adding 4 GB of swap to the new RHEL 8 instance to see if that can get the V8 builds completing.
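
A minimal sketch of how the memory comparison described above could be checked on a Linux machine by reading `/proc/meminfo`. The function names and the 6 GiB default threshold are assumptions chosen for illustration, not values from the PR.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: kibibytes}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def needs_more_memory(meminfo, required_gib=6):
    """True if RAM plus swap falls below the (hypothetical) threshold."""
    total_kib = meminfo.get("MemTotal", 0) + meminfo.get("SwapTotal", 0)
    return total_kib < required_gib * 1024 * 1024


if __name__ == "__main__":
    if os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as fh:
            info = parse_meminfo(fh.read())
        print(f"MemTotal={info.get('MemTotal')} kB, SwapTotal={info.get('SwapTotal')} kB")
        print("short of memory" if needs_more_memory(info) else "memory looks sufficient")
```

The threshold is kept below 8 GiB because a machine sold as 8 GB typically reports a somewhat smaller MemTotal.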

@richardlau
Member Author

Didn't need swap in the end as there's another "flavor" of machine we can pick in the self provisioning interface that has twice the CPUs and memory and is similar in spec to the existing RHEL7 machines. I did have working changes to enable swap but I've put those aside for a future separate PR for use on other platforms.

I'm currently setting up the RHEL8 LinuxONE release machine. As part of that I've tackled the long-standing TODO of converting to Ansible the currently manual steps to configure ssh on the release machines to be able to upload to the staging server (see the new "release-builder" role). Need to clone the iojs+release job and test the new machine can build a test release build.

I've also been able to make the Ansible scripts for the new RHEL 8 machine idempotent: Ansible shows no changes when the playbook is run a second or subsequent time.
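
One way to check the idempotency claim mechanically is to capture the output of a second `ansible-playbook` run and confirm that the PLAY RECAP reports `changed=0` for every host. The sketch below parses that recap text; the helper names and the sample host name are hypothetical, and it assumes uncolored output.

```python
import re

# Matches lines such as: "host : ok=42 changed=0 unreachable=0 failed=0 ..."
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def parse_recap(output):
    """Return {host: {counter: value}} from the PLAY RECAP section."""
    results = {}
    in_recap = False
    for line in output.splitlines():
        if line.startswith("PLAY RECAP"):
            in_recap = True
            continue
        if not in_recap:
            continue
        match = RECAP_LINE.match(line.strip())
        if match:
            pairs = (pair.split("=") for pair in match.group("counters").split())
            results[match.group("host")] = {key: int(value) for key, value in pairs}
    return results


def is_idempotent(output):
    """True if every host in the recap reports changed=0 and failed=0."""
    recap = parse_recap(output)
    return bool(recap) and all(
        counters.get("changed", 0) == 0 and counters.get("failed", 0) == 0
        for counters in recap.values()
    )


if __name__ == "__main__":
    sample = (
        "PLAY RECAP *********************************************\n"
        "example-rhel8-s390x-1 : ok=42 changed=0 unreachable=0 failed=0 "
        "skipped=5 rescued=0 ignored=0\n"
    )
    print(is_idempotent(sample))
```

An empty or missing recap is treated as not idempotent, so a run that aborted early does not pass the check by accident.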

Extend Ansible and Jenkins scripts for Red Hat Enterprise Linux 8.
Also add new `release-builder` role, for setting up ssh config and
keys to upload to the staging server, and changes to make the
playbook idempotent.
@richardlau richardlau changed the title from "[WIP] ansible: add RHEL 8" to "ansible: add RHEL 8" Feb 24, 2022
@richardlau richardlau marked this pull request as ready for review February 24, 2022 21:55
@richardlau
Member Author

This is now ready for review. It adds the first set of RHEL 8 machines (s390x aka LinuxONE). I'll add the other architectures (ppc64le, arm64, x64) in follow up pull requests -- I'd expect the changes to be less for those as this PR does much of the common work for RHEL 8. Please do ask any questions if any of the Ansible changes are unclear.

CI jobs will need to be updated. I'll handle that when landing this, as the job updates will need to coincide with the updates to the VersionSelectorScript.groovy and select-compiler.sh scripts.

cc @nodejs/build

@richardlau richardlau changed the title from "ansible: add RHEL 8" to "ansible: add RHEL 8 (initially s390x)" Feb 24, 2022
@sxa sxa self-assigned this Feb 25, 2022
Release machines must be able to upload release artifacts to the nodejs.org
web server. The [release-builder](roles/release-builder) Ansible role will
write the necessary key and ssh config onto the release machine, automating
the previously manual steps.
Member

This seems to say it is automated, but the instructions lower down still say you need to copy the key over. Should those say that, for some platforms, Ansible may have already done this for you?

Member Author

I was debating whether to delete the manual steps but thought it would be useful to keep as a reference. I could rename the "Manual steps" twisty to "Previously used manual steps", or add a sentence saying "The following manual steps are now automated by the Ansible role and included for reference only."?

Member Author

I'll open a follow up to clarify the wording.

Member

@mhdawson mhdawson left a comment

LGTM

@richardlau richardlau merged commit 7c82a22 into nodejs:master Mar 1, 2022
@richardlau richardlau deleted the rhel8 branch March 1, 2022 12:33
@richardlau
Member Author

I've just spotted that @sxa had self-assigned this. @sxa I'm assuming you did this with the intention of reviewing?


4 participants