OCaml Build Infrastructure

mtelvers edited this page Mar 11, 2022 · 43 revisions

This page is a work in progress, by @avsm and @mtelvers.

OCaml Labs operates a hosted machine cluster that runs regular health checks against the opam package repository and also runs 'cron' jobs to keep the operational side of things running.

Users can access this information through a variety of mechanisms:

Cluster Worker Machines

  • linux-x86_64

    • apache (Dell PowerEdge R7425): 128 threads
    • asteria (Dell PowerEdge R6525): 256 threads
    • clete (Dell PowerEdge R630): 72 threads
    • cree (Dell PowerEdge R7425): 128 threads
    • doris (Dell PowerEdge R6525): 256 threads
    • hipp (Dell PowerEdge R630): 72 threads
    • iphito (Dell PowerEdge R6525): 256 threads
    • laodoke (Dell PowerEdge R630): 72 threads
    • m1-a (Supermicro SYS-2028BT-HNC0R+/X10DRT-B+): 48 threads
    • m1-b (Supermicro SYS-2028BT-HNC0R+/X10DRT-B+): 48 threads
    • m1-c (Supermicro SYS-2028BT-HNC0R+/X10DRT-B+): 48 threads
    • marpe (Dell PowerEdge R6525): 256 threads
    • phoebe (Dell PowerEdge R630): 72 threads
    • pima (Dell PowerEdge R7425): 128 threads
    • x86-bm-1 (Quanta Cloud QuantaPlex T22HF-1U/S5HF MB): 32 threads
    • x86-bm-2 (Quanta Cloud QuantaPlex T22HF-1U/S5HF MB): 32 threads
    • x86-bm-3 (Quanta Cloud QuantaPlex T22HF-1U/S5HF MB): 32 threads
    • x86-bm-4 (Quanta Cloud QuantaPlex T22HF-1U/S5HF MB): 32 threads
    • x86-bm-5 (Quanta Cloud QuantaPlex T22HF-1U/S5HF MB): 32 threads
    • x86-bm-6 (Quanta Cloud QuantaPlex T22HF-1U/S5HF MB): 32 threads
    • x86-bm-7 (Quanta Cloud QuantaPlex T22HF-1U/S5HF MB): 32 threads
    • x86-bm-8 (Quanta Cloud QuantaPlex T22HF-1U/S5HF MB): 32 threads
    • x86-bm-9 (Quanta Cloud QuantaPlex T22HF-1U/S5HF MB): 32 threads
    • x86-bm-10 (Quanta Cloud QuantaPlex T22HF-1U/S5HF MB): 32 threads
    • x86-bm-11 (Quanta Cloud QuantaPlex T22HF-1U/S5HF MB): 32 threads
    • x86-bm-12 (Quanta Cloud QuantaPlex T22HF-1U/S5HF MB): 32 threads
    • x86-bm-13 (Quanta Cloud QuantaPlex T22HF-1U/S5HF MB): 32 threads
    • x86-bm-14 (Quanta Cloud QuantaPlex T22HF-1U/S5HF MB): 32 threads
  • linux-ppc64

    • orithia (Raptor Talos II): 176 threads
    • oya (IBM PowerNV 9006-22P): 128 threads
    • pisto (IBM Power System S812LC): 80 threads
    • prothoe (IBM Power System S812LC): 80 threads
    • scyleia (Raptor Talos II): 176 threads
  • linux-arm64

    • ainia (Avantek Ampere(TM) Mt Snow): 80 threads
    • arm64-jade-1 (Avantek Ampere(TM) Mt Jade): 160 threads
    • arm64-jade-2 (Avantek Ampere(TM) Mt Jade): 160 threads
    • kydoime (Avantek Ampere(TM) Mt Snow): 80 threads
    • molpadia (Avantek Ampere(TM) Mt Snow): 80 threads
    • okypous (Avantek Ampere(TM) Mt Snow): 80 threads
  • windows-x86_64

    • odawa (Dell PowerEdge R6525): 256 threads
  • linux-s390x

    • s390x (IBM/S390): 4 threads

Other Machines

  • Cambridge Computer Laboratory:
    • aello: avsm dev
    • bremusa: multicore dev, debugging & benchmarking (tom kelly/sadiq/engil)
    • eumache: hub.ocamllabs.io (jonludlam)
    • hipp: xcp-ng test (avsm)
    • toxis: ci.ocamllabs.io
    • grumpy: benchmarking
    • dopey: off
    • roo: benchmarking
    • tigger: off (slow i/o. 48 core)
    • uriel: off
    • gabriel: hardware fault
    • michael: spare
    • raphael: hardware fault
    • simba:
    • pima: ci.ocaml.org (talex5/kate)
    • hopi: dra27 and windows
    • comanche: @samoht for irmin
    • navajo: multicore CB machine (kc/shakthi/gargi)
    • summer: developer dev machine (linux) with lots of accounts
    • autumn: magnus for irmin benchmarking
    • winter: multicore dev, debugging & benchmarking (tom kelly/sadiq/engil)
    • spring: shakthi/damien for OCaml flambda sandmark
    • caelum-512: off (previously: avsm for xen)
    • caelum-514: off (previously: openbsd for opam infra)
    • caelum-613: off (previously: v3.ocaml.org)
    • caelum-614: off (previously: F* benchmarking (bench2.ocamllabs.io) (tom kelly))

Cambridge Computer Laboratory

PXE Booting for Linux AMD64 and ARM64 via UEFI and BIOS

PXE booting is provided by Netboot.xyz, which runs via Docker on doc.ocamllabs.io.

This is the docker-compose.yml file:

---
version: "2.1"
services:
  netbootxyz:
    image: ghcr.io/netbootxyz/netbootxyz
    container_name: netbootxyz
    environment:
      - MENU_VERSION=2.0.47 # optional
    volumes:
      - /netbootxyz/config:/config
      - /netbootxyz/assets:/assets
    ports:
      - 3000:3000
      - 69:69/udp
      - 8080:80
    restart: unless-stopped

The service is started using:

docker-compose up -d netbootxyz

The Computer Lab primarily uses serial ports (via iDRAC) for console access. Therefore, it is necessary to edit /netbootxyz/config/menus/boot.cfg and configure the serial port console:

set cmdline console=tty0 console=ttyS1,115200n8

The following global parameters are included in the DHCP server configuration on the Ubiquiti EdgeRouter, which selects the correct PXE image depending on the machine architecture:

set service dhcp-server bootfile-server doc.ocamllabs.io
set service dhcp-server global-parameters "class &quot;BIOS-x86&quot; { match if option arch = 00:00; filename &quot;netboot.xyz.kpxe&quot;; }"
set service dhcp-server global-parameters "class &quot;UEFI-x64&quot; { match if option arch = 00:09; filename &quot;netboot.xyz.efi&quot;; }"
set service dhcp-server global-parameters "class &quot;UEFI-ARM64&quot; { match if option arch = 00:0b; filename &quot;netboot.xyz-arm64.efi&quot;; }"
set service dhcp-server global-parameters "class &quot;UEFI-bytecode&quot; { match if option arch = 00:07; filename &quot;netboot.xyz.efi&quot;; }"
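
As a rough illustration of what these class rules do, the lookup can be mirrored in a few lines of Python. This is only a sketch for reasoning about the mapping: the name boot_file_for is hypothetical, and it assumes that "option arch" carries the client architecture code (DHCP option 93) as a two-byte value written in the same colon-separated hex form as the rules.

```python
from typing import Optional

# Architecture codes and boot files copied from the four class rules above.
BOOT_FILES = {
    0x0000: "netboot.xyz.kpxe",       # BIOS-x86
    0x0007: "netboot.xyz.efi",        # UEFI-bytecode
    0x0009: "netboot.xyz.efi",        # UEFI-x64
    0x000B: "netboot.xyz-arm64.efi",  # UEFI-ARM64
}


def boot_file_for(arch: str) -> Optional[str]:
    """Map an arch value written as in the rules (e.g. "00:0b") to a boot file.

    Returns None when no class matches. What the DHCP server does in that
    case is outside the scope of this sketch.
    """
    code = int(arch.replace(":", ""), 16)
    return BOOT_FILES.get(code)


if __name__ == "__main__":
    for arch in ("00:00", "00:07", "00:09", "00:0b", "00:0e"):
        print(arch, "->", boot_file_for(arch))
```

The value 00:0e is deliberately absent from the table: it is not served a Netboot.xyz image by these rules.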

Petitboot

The Power9 machines run Petitboot, which removes the need to PXE boot them. First, download the Ubuntu ISO for PPC64: ubuntu-21.10-live-server-ppc64el.iso. Extract the /casper folder and stage it and the ISO file on a web server.

Next, stage a configuration file called petitboot.conf on a web server containing the following:

label Ubuntu 21.10 ppc64el
        kernel http://doc.ocamllabs.io:8080/ubuntu-21.10-live-server-ppc64el/casper/vmlinux
        initrd http://doc.ocamllabs.io:8080/ubuntu-21.10-live-server-ppc64el/casper/initrd
        append ip=dhcp url=http://doc.ocamllabs.io:8080/ubuntu-21.10-live-server-ppc64el.iso

Then add a DHCP option to provide the URL of the configuration to the Petitboot environment:

global-parameters "class &quot;Arch-Unknown&quot; { match if option arch = 00:0e; option pxelinux-configfile &quot;http://doc.ocamllabs.io:8080/caelum-petitboot.conf&quot;; }"
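
If further releases need to be staged, the Petitboot stanza shown earlier in this section can be generated rather than typed by hand. The following is a minimal sketch, not the procedure used on doc.ocamllabs.io: the function name petitboot_stanza is hypothetical, and it assumes the layout implied by the URLs above, with the casper folder staged under a directory named after the ISO without its .iso suffix.

```python
def petitboot_stanza(label: str, base_url: str, iso_name: str) -> str:
    """Render one Petitboot stanza in the layout used by petitboot.conf above."""
    if not iso_name.endswith(".iso"):
        raise ValueError("iso_name must end with .iso")
    stem = iso_name[: -len(".iso")]  # directory holding the extracted casper folder
    base = base_url.rstrip("/")
    indent = " " * 8
    lines = [
        f"label {label}",
        f"{indent}kernel {base}/{stem}/casper/vmlinux",
        f"{indent}initrd {base}/{stem}/casper/initrd",
        f"{indent}append ip=dhcp url={base}/{iso_name}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(
        petitboot_stanza(
            "Ubuntu 21.10 ppc64el",
            "http://doc.ocamllabs.io:8080",
            "ubuntu-21.10-live-server-ppc64el.iso",
        ),
        end="",
    )
```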

Microsoft Windows

Windows remote booting can be added to Netboot.xyz by creating a Windows Lite Touch installation ISO using Microsoft Deployment Toolkit and copying its contents into the Netboot.xyz assets directory:

mount -o loop,ro LiteTouchPE_x64.iso /mnt
mkdir -p /netbootxyz/assets/windows/x64
cp -r /mnt/* /netbootxyz/assets/windows/x64
umount /mnt

Then set win_base_url in the Netboot.xyz config file boot.cfg:

set win_base_url http://doc.ocamllabs.io:8080/windows

Dynamic DNS

doc.ocamllabs.io also runs BIND (apt install bind9), which is configured with these blocks in /etc/bind/named.conf.local. The reverse zone isn't used, but it is included to avoid the error messages that appear when the EdgeMax tries to update it.

zone "caelum.tarides.com" {
	type master;
	file "/var/lib/bind/db.caelum";
	allow-update { key rndc-key; };
	allow-transfer { none; };
};

zone "124.232.128.in-addr.arpa" {
        type master;
        file "/var/lib/bind/db.124.232.128";
        allow-update { key rndc-key; };
        allow-transfer { none; };
};

In /var/lib/bind/db.caelum, add the following:

;
; BIND data file for Caelum network
;
$TTL	604800
@	IN	SOA	doc.caelum.tarides.com. mark.tarides.com. (
			      5		; Serial
			 604800		; Refresh
			  86400		; Retry
			2419200		; Expire
			 604800 )	; Negative Cache TTL
;
@	IN	NS	doc.caelum.tarides.com.

doc	IN	A	128.232.124.240

In /var/lib/bind/db.124.232.128, add the following:

$TTL 604800     ; 1 week
@                       IN SOA  doc.caelum.tarides.com. mark.tarides.com. (
                                84         ; serial
                                604800     ; refresh (1 week)
                                86400      ; retry (1 day)
                                2419200    ; expire (4 weeks)
                                604800     ; minimum (1 week)
                                )
                        NS      doc.caelum.tarides.com.
240                     PTR     doc.caelum.tarides.com.
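
The reverse zone name and the PTR label follow mechanically from the server's address. As a small sketch using Python's standard ipaddress module (the helper name reverse_names is hypothetical, and the split assumes a /24 reverse zone, as configured above):

```python
import ipaddress
from typing import Tuple


def reverse_names(ip: str) -> Tuple[str, str, str]:
    """Return (zone, label, full PTR name) for an IPv4 address in a /24 reverse zone."""
    addr = ipaddress.IPv4Address(ip)
    ptr_name = addr.reverse_pointer       # octets reversed, plus ".in-addr.arpa"
    label, zone = ptr_name.split(".", 1)  # first label is the host octet
    return zone, label, ptr_name


if __name__ == "__main__":
    zone, label, ptr_name = reverse_names("128.232.124.240")
    print("zone: ", zone)
    print("label:", label)
    print("PTR:  ", ptr_name)
```

For 128.232.124.240 this yields the zone 124.232.128.in-addr.arpa and the label 240, matching the zone declaration and the PTR record above.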

These glue records are included in the tarides.com domain:

doc.caelum 1800 IN A 128.232.124.240
caelum 1800 IN NS doc.caelum.tarides.com.

On the EdgeMax, turn on dynamic-dns-update and provide the configuration via global-parameters:

set service dhcp-server dynamic-dns-update enable true
set service dhcp-server global-parameters "key rndc-key { algorithm hmac-md5; secret &quot;XXX&quot;; };"
set service dhcp-server global-parameters "zone caelum.tarides.com. { primary 128.232.124.240; key rndc-key; }"
set service dhcp-server global-parameters "ddns-domainname &quot;caelum.tarides.com.&quot;;"
set service dhcp-server global-parameters "zone 124.232.128.in-addr.arpa { primary 128.232.124.240; key rndc-key; }"

Packet filter

There is a Ubiquiti EdgeRouter at gw.ocamllabs.io that acts as a router and packet filter for the internal machines. A quick guide to using it via an SSH terminal:

$ configure
[edit]
# edit firewall name CL_IN4
[edit firewall name CL_IN4]
# show
[edit firewall name CL_IN4]
# set rule 20 destination address 128.232.124.213
[edit firewall name CL_IN4]
# set rule 20 protocol tcp
[edit firewall name CL_IN4]
# set rule 20 destination port 80,443,8100
[edit firewall name CL_IN4]
# set rule 20 action accept
[edit firewall name CL_IN4]
# set rule 20 description ci.ocamllabs.io
[edit firewall name CL_IN4]
# show
<examine diff>
[edit firewall name CL_IN4]
# commit
[edit firewall name CL_IN4]
# save
Saving configuration to '/config/config.boot'...
Done
[edit]
# exit
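
When several hosts need similar rules, the set commands from the session above can be generated and pasted into the configure session. This is a hedged sketch only: the function name firewall_rule_commands is hypothetical, it emits just the five set forms shown in the session, and it assumes the commands are entered while editing the relevant firewall name (CL_IN4 above), followed by show, commit and save as before.

```python
import ipaddress
from typing import Iterable, List


def firewall_rule_commands(
    rule: int,
    address: str,
    ports: Iterable[int],
    description: str,
    protocol: str = "tcp",
    action: str = "accept",
) -> List[str]:
    """Build the 'set rule N ...' lines for one destination-based rule."""
    ipaddress.IPv4Address(address)  # raises ValueError on a malformed address
    port_list = ",".join(str(p) for p in ports)
    prefix = f"set rule {rule}"
    return [
        f"{prefix} destination address {address}",
        f"{prefix} protocol {protocol}",
        f"{prefix} destination port {port_list}",
        f"{prefix} action {action}",
        f"{prefix} description {description}",
    ]


if __name__ == "__main__":
    for line in firewall_rule_commands(
        20, "128.232.124.213", [80, 443, 8100], "ci.ocamllabs.io"
    ):
        print(line)
```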