Set up new Equinix/WorksOnARM servers #2078

Closed
sxa opened this issue Mar 24, 2021 · 14 comments

@sxa
Member

sxa commented Mar 24, 2021

I need to request a new machine:

  • New machine operating system: One CentOS, one Ubuntu
  • New machine architecture: aarch64
  • Provider (leave blank if it does not matter): Equinix
  • Desired usage: test and dockerBuild
  • Any unusual specification/setup required: Machines are already reserved so just need to be installed and set up
  • How many of them are required: 3 (Although I'll start with 2 :-) ) 160 core (2 socket) 3GHz Ampere Altra

Please explain what this machine is needed for: Replacement of previous ThunderX and D05 servers.

NOTE: There is a current issue with Ubuntu 20.04 systems that may cause reboot failures. To avoid the problem, run the following sequence of commands (taken from here) after deploying the OS:

$ apt-get update
$ apt-get install grub2-common
$ grub-install --bootloader-id=ubuntu
Installing for arm64-efi platform.
Installation finished. No error reported.
@sxa
Member Author

sxa commented Mar 25, 2021

The new servers can also run the dragonwell builds, which use -march=armv8.2-a+crypto, so this is a good reason to move to them ASAP.

@sxa
Member Author

sxa commented Mar 25, 2021

Current status: Testing with a CentOS8 system and an Ubuntu 20.04 system configured with several build&&dockerBuild executors for build, and one Ubuntu 20.04 system for test. I will verify that the setup (a) works and (b) has adequate performance before I come to a conclusion about how to finally split it up :-)
sanity.openjdk
extended.openjdk

@sxa
Member Author

sxa commented Mar 31, 2021

Configured the Ubuntu 20.04 system with multiple docker images - all 8 core, 8GB (Because, why not!)

Ports 2223, 2224, and 2225 are currently being used by RISC-V qemu VMs for testing purposes, so the port numbers below are not the same as on other docker host systems for now.

CONTAINER ID        IMAGE              COMMAND               CREATED              STATUS              PORTS                  NAMES
d479dc5604bf        aqa_f33            "/usr/sbin/sshd -D"   2 seconds ago        Up 2 seconds        0.0.0.0:2227->22/tcp   f33_2227
e3ec9e799892        aqa_u2004          "/usr/sbin/sshd -D"   39 seconds ago       Up 38 seconds       0.0.0.0:2232->22/tcp   U2004_2232
3c5a5f4507cc        aqa_u2010          "/usr/sbin/sshd -D"   57 seconds ago       Up 57 seconds       0.0.0.0:2231->22/tcp   U2010_2231
a96fd12a60b1        aqa_u1804          "/usr/sbin/sshd -D"   About a minute ago   Up About a minute   0.0.0.0:2230->22/tcp   U1804_2230
4327bf0f2675        aqa_u2004          "/usr/sbin/sshd -D"   3 minutes ago        Up 3 minutes        0.0.0.0:2226->22/tcp   U2004_2226
ec8eb9ece268        aqa_u1604          "/usr/sbin/sshd -D"   9 minutes ago        Up 9 minutes        0.0.0.0:2222->22/tcp   U1604_2222
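
As a quick sanity check on the port allocation described above, the following is a minimal sketch (not the tooling actually used on this host) that extracts the host port from each `docker ps` PORTS entry and confirms none of them collide with the ports held by the RISC-V qemu VMs. The names `RESERVED`, `LISTING`, and `host_port` are illustrative assumptions; the data is copied from the listing above.

```python
import re

# Host ports reserved for the RISC-V qemu VMs, per the comment above.
RESERVED = {2223, 2224, 2225}

# NAMES -> PORTS columns copied from the `docker ps` listing above.
LISTING = {
    "f33_2227": "0.0.0.0:2227->22/tcp",
    "U2004_2232": "0.0.0.0:2232->22/tcp",
    "U2010_2231": "0.0.0.0:2231->22/tcp",
    "U1804_2230": "0.0.0.0:2230->22/tcp",
    "U2004_2226": "0.0.0.0:2226->22/tcp",
    "U1604_2222": "0.0.0.0:2222->22/tcp",
}


def host_port(mapping: str) -> int:
    """Return the host-side port from a 'HOST_IP:PORT->22/tcp' mapping."""
    match = re.search(r":(\d+)->", mapping)
    if match is None:
        raise ValueError(f"no published port in {mapping!r}")
    return int(match.group(1))


# Map each container name to its published host port.
ports = {name: host_port(mapping) for name, mapping in LISTING.items()}

# Any container whose host port overlaps the reserved range.
clashes = {name: port for name, port in ports.items() if port in RESERVED}

print("host ports in use:", sorted(ports.values()))
print("clashes with reserved ports:", clashes)
```

The container names here appear to encode the host port as a suffix, so the same check could also be done on the names alone; parsing the PORTS column is just less dependent on that naming habit.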

@vielmetti

Very interested in how this goes - let me know anything I can do to help, either directly or by introductions to folks at Ampere.

@sxa
Member Author

sxa commented Jun 9, 2021

Deleting the following static docker jenkins agents hosted on ThunderX system 74.50:

  • test-docker-centos8-armv8-1
  • test-docker-fedora33-armv8-1
  • test-docker-ubuntu1604-armv8-1
  • test-docker-ubuntu1804-armv8-1
  • test-docker-ubuntu2004-armv8-1
  • test-packet-ubuntu1604-armv8-1

Also on the ThunderX .30:

  • build-packet-centos74-armv8-1x/config.xml

And from 139.178.82.234:

  • build-docker-fedora33-armv8-2
  • build-docker-fedora33-armv8-3
  • build-docker-fedora33-armv8-4
  • build-docker-fedora33-armv8-5
  • build-docker-ubuntu1804-armv8-2
  • build-docker-ubuntu1804-armv8-3
  • build-docker-ubuntu1804-armv8-4
  • build-docker-ubuntu1804-armv8-5
  • build-docker-ubuntu1804-armv8-6
  • build-packet-ubuntu1804-armv8l-1

@karianna karianna modified the milestones: May 2021, June 2021 Jun 9, 2021
@sxa
Member Author

sxa commented Jun 13, 2021

Looks like we haven't suffered any significant problems after deactivating some of the older servers so we should be ok to hand them back now (Although I might hold onto one of the D05 systems until we fully get the third Altra up and running)

@sxa
Member Author

sxa commented Jun 13, 2021

Just as a final note, here are the build times for JDK17/HotSpot with and without ccache - all in docker except (ND). Figures are real/user time.

| System | No ccache (real/user) | Ccache (real/user) |
|---|---|---|
| Altra 160core/512Gb | 7m41/75m28 | 4m29/20m38 |
| D05 32core/128Gb | 10m54/131m43 | 5m51/28m18 |
| eMag 32core/128Gb | 13m14/138m12 | 7m29/27m18 |
| ThunderX (ND) | 19m52/246m30 | 10m40/50m24 |
| ThunderX 96core/32Gb | 24m49/351m15 | 12m44/52m8 |
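
To put those figures in perspective, here is a small sketch that converts the real (wall-clock) times to seconds and prints the speedup ccache gives on each system. The helper names `to_seconds` and `real_speedup` are made up for this illustration; only the timings come from the table above.

```python
# Real/user build times copied from the table above (JDK17/HotSpot).
# Each entry is (no ccache, with ccache).
TIMES = {
    "Altra 160core/512Gb": ("7m41/75m28", "4m29/20m38"),
    "D05 32core/128Gb": ("10m54/131m43", "5m51/28m18"),
    "eMag 32core/128Gb": ("13m14/138m12", "7m29/27m18"),
    "ThunderX (ND)": ("19m52/246m30", "10m40/50m24"),
    "ThunderX 96core/32Gb": ("24m49/351m15", "12m44/52m8"),
}


def to_seconds(duration: str) -> int:
    """Convert a string such as '7m41' into a number of seconds."""
    minutes, seconds = duration.split("m")
    return int(minutes) * 60 + int(seconds)


def real_speedup(no_ccache: str, ccache: str) -> float:
    """Ratio of wall-clock time without ccache to wall-clock time with it."""
    real_before = to_seconds(no_ccache.split("/")[0])
    real_after = to_seconds(ccache.split("/")[0])
    return real_before / real_after


for system, (before, after) in TIMES.items():
    print(f"{system}: {real_speedup(before, after):.2f}x faster with ccache")
```

Only the real time is compared here; the user time drops by a larger factor because ccache removes most of the parallel compile work.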

@sxa
Member Author

sxa commented Jun 15, 2021

Have rebooted my primary Ubuntu Ampere system today because all docker commands were hanging. The docker machines appear to have been unaffected, and I'm hoping this will be a one-off.

@sxa
Member Author

sxa commented Jun 15, 2021

Reboot has started everything ok and the docker commands are working. Purely for reference, the current list of static docker agents on the machine is as follows:

                                                          
root@test-equinix-ubuntu2004-armv8l-02:~# docker ps
CONTAINER ID   IMAGE            COMMAND               CREATED        STATUS                        PORTS                                   NAMES
df7cd16ce177   u2004.32.build   "/bin/bash"           7 weeks ago    Restarting (0) 1 second ago                                           U2004_2321.build32
1bfd2a038c8d   aqa_u2004.32     "/usr/sbin/sshd -D"   7 weeks ago    Up 22 minutes                 0.0.0.0:2324->22/tcp, :::2324->22/tcp   U2004_2324.32
db94894974b0   aqa_u2004.32     "/usr/sbin/sshd -D"   7 weeks ago    Up 22 minutes                 0.0.0.0:2323->22/tcp, :::2323->22/tcp   U2004_2323.32
412c7c344fe5   aqa_u2004.32     "/usr/sbin/sshd -D"   7 weeks ago    Up 22 minutes                 0.0.0.0:2322->22/tcp, :::2322->22/tcp   U2004_2322.32
fd3994e02f15   aqa_arm32        "/usr/sbin/sshd -D"   7 weeks ago    Up 23 minutes                 0.0.0.0:3224->22/tcp, :::3224->22/tcp   u2004.arm32.3224
d479dc5604bf   aqa_f33          "/usr/sbin/sshd -D"   2 months ago   Up 23 minutes                 0.0.0.0:2227->22/tcp, :::2227->22/tcp   f33_2227
e3ec9e799892   aqa_u2004        "/usr/sbin/sshd -D"   2 months ago   Up 23 minutes                 0.0.0.0:2232->22/tcp, :::2232->22/tcp   U2004_2232
3c5a5f4507cc   aqa_u2010        "/usr/sbin/sshd -D"   2 months ago   Up 22 minutes                 0.0.0.0:2231->22/tcp, :::2231->22/tcp   U2010_2231
a96fd12a60b1   aqa_u1804        "/usr/sbin/sshd -D"   2 months ago   Up 22 minutes                 0.0.0.0:2230->22/tcp, :::2230->22/tcp   U1804_2230
4327bf0f2675   aqa_u2004        "/usr/sbin/sshd -D"   2 months ago   Up 23 minutes                 0.0.0.0:2226->22/tcp, :::2226->22/tcp   U2004_2226
ec8eb9ece268   aqa_u1604        "/usr/sbin/sshd -D"   2 months ago   Up 22 minutes                 0.0.0.0:2222->22/tcp, :::2222->22/tcp   U1604_2222

@sxa
Member Author

sxa commented Jun 22, 2021

All old machines other than D05 139.178.82.234 have been decommissioned and returned to Equinix as per. Holding onto that one for a few more days as one of the jenkins agents on there was still live.

There are also instances of the new Ubuntu 16.04/armv7l docker container being run which will allow us to use this hardware for running the 32-bit ARM builds.

@sxa
Member Author

sxa commented Jun 29, 2021

Decommissioned an extra two that weren't properly on our records:

  • eMag 147.75.80.54
  • ThunderX 147.75.33.166

@sxa
Member Author

sxa commented Jun 29, 2021

Writing this for historic reference:

| System | Cores | Core type |
|---|---|---|
| D05 | 64 | Cortex-A72 |
| ThunderX | 96 | ThunderX |
| eMag | 32 | Custom? |
| Altra | 160 | Neoverse-N1 |

@sxa sxa modified the milestones: June 2021, July 2021 Jul 5, 2021
@Haroon-Khel Haroon-Khel removed this from the July 2021 milestone Aug 3, 2021
@Haroon-Khel Haroon-Khel added this to the August milestone Aug 3, 2021
@sxa sxa modified the milestones: August 2021, September 2021 Sep 23, 2021
@sxa
Member Author

sxa commented Sep 23, 2021

Next on the list is to get the CentOS system fully running with more capacity.

@sxa sxa modified the milestones: September 2021, December 2021 Dec 1, 2021
@sxa sxa modified the milestones: December 2021, 2022-01 (January) Jan 6, 2022
@sxa
Member Author

sxa commented Feb 23, 2022

The CentOS system will be replaced as part of #equinix now that CentOS8 is no longer supported. Closing.

@sxa sxa closed this as completed Feb 23, 2022