This repo contains instructions for configuring, deploying, and running the LIMS for the metabolomics team at Joint Genome Institute. This LIMS is hosted at metatlas.lbl.gov and is based on the community edition of LabKey. The metatlas LIMS is deployed and hosted on NERSC's SPIN platform for running containerized services using Kubernetes via Rancher (v2).
A previous configuration of the metatlas LIMS was set up with LabKey's non-embedded web server versions (up to major version 23), whereas this configuration deploys the embedded web server version (major version 24+). Use these instructions for updating, deploying, and debugging the LIMS SPIN app as of July, 2024. Please see previous commit history of this repository for deploying the old LabKey LIMS v23.
LabKey provides overall installation instructions and instructions for setting up the required components, but reading those docs is not necessary when deploying from this repo without modification.
The layout of the github repository for controlling the LIMS is:
$ tree -L 2
.
├── LICENSE
├── README.md
├── backup_restore
│   ├── Dockerfile
│   ├── Makefile
│   ├── backup.yaml.template
│   ├── bin
│   ├── build.sh
│   ├── make_command.sh
│   ├── restore-root.yaml.template
│   └── restore.yaml.template
├── db
│   ├── db-data.yaml
│   └── db.yaml
├── deploy.sh
└── labkey
    ├── Dockerfile
    ├── LICENSE
    ├── Makefile
    ├── R_smkosina01-lock.yaml
    ├── R_smkosina01.yaml
    ├── R_tidyverse-lock.yaml
    ├── R_tidyverse.yaml
    ├── VERSION
    ├── application.properties
    ├── docker-compose.yml
    ├── entrypoint.sh
    ├── labkey-files.yaml
    ├── labkey.yaml.template
    ├── labkey_server.service
    ├── lb.yaml.template
    ├── log4j2.xml
    ├── make_command.sh
    ├── python-lock.yaml
    ├── python.yaml
    ├── scripts
    ├── startup
    ├── update_lock.sh
    └── xvfb.sh
Each subdirectory corresponds to a pod in a Rancher production workload in the lims-24 SPIN namespace and contains the kubernetes .yaml file(s) used to configure that pod.
The major components of the system are the following docker images, which run in the pods:
- backup_restore: Daily cron job that performs a backup of the database and files (/usr/local/labkey/files/ in the container) to the global perlmutter filesystem at /global/cfs/cdirs/metatlas/projects/lims_backups/pg_dump/lims-24. Also used during a new deployment to restore the database from the backup archive.
- db: postgres database base docker image
- labkey: LabKey community edition web application with an embedded Apache Tomcat.
If the LIMS or an associated pod/service stops running (e.g., due to software instability or NERSC maintenance) and it is not necessary to update any software, you can redeploy the existing workloads via the Rancher2 interface:
- Go to the LabKey pod page on Rancher2 in the m2650 project and the lims-24 namespace
- Reduce the 'Scale' to 0 to spin down the pod running LIMS (labkey) by clicking next to the colored bar
- Wait for the running pod to be fully offline
- If you need to use a different version of the docker image, click the triple-dot button near the upper right corner of the Rancher2 web page and then select 'Edit Config' from the dropdown menu.
- Replace the image tag (e.g., labkey24.3.4-6_2024-06-24-11-40) in the 'Container image' field with the tag from the newer build
- Click 'Save' button
- On the pod page, dial up the 'Scale' to 1 by clicking next to the colored bar
- Wait for the LabKey pod to come up and be ready. You may want to view the pod logs while you wait to see if there are any errors -- see the triple-dot menu at the right side of the pod row.
- Go to metatlas.lbl.gov to verify the server is working.
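For those who prefer a terminal, the scale-down and scale-up steps above could be sketched with kubectl. This is a hedged sketch and not part of this repo: it assumes kubectl is already configured for the SPIN cluster and that the LabKey workload is a Deployment named labkey. Only the lims-24 namespace comes from this README; the function names are hypothetical.

```shell
# KUBECTL can be overridden (e.g. KUBECTL=echo) to print commands instead of running them.
KUBECTL="${KUBECTL:-kubectl}"
NAMESPACE="lims-24"            # namespace named in this README
WORKLOAD="deployment/labkey"   # assumption: the workload is a Deployment called "labkey"

# Set the number of LabKey pods (0 to spin down, 1 to bring back up).
scale_labkey() {
    "$KUBECTL" --namespace "$NAMESPACE" scale "$WORKLOAD" --replicas="$1"
}

# Block until the rollout has finished.
wait_for_labkey() {
    "$KUBECTL" --namespace "$NAMESPACE" rollout status "$WORKLOAD"
}
```

A restart would then be scale_labkey 0, a pause until the pod is gone, scale_labkey 1, and wait_for_labkey. Changing the image tag is still easiest through 'Edit Config' in Rancher2 as described above.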
Occasionally, LabKey, postgres, or other associated software (e.g., python/R libraries) needs to be upgraded or updated. Since this typically involves changing the underlying docker images or their dependencies, a sequence of edits, repo commits, image building/pushing, and SPIN deployment must be followed to bring the system back online from scratch.
The python.yaml and R_*.yaml files within the labkey directory define conda environments which are made available to the LabKey webserver to run user scripts. These environments also have corresponding lock files named *-lock.yaml.
If no changes are required to python/R environments for the new deployment, you can skip this section.
If you add or update an environment yaml file (e.g., to add new libraries or change existing library versions), you must run update_lock.sh example.yaml to generate an updated lock file before deploying the LIMS. To do this:
- Install the conda-lock package with
pip install conda-lock
- If running on a Mac with Apple's M1/M2 architecture, edit the update_lock.sh script to include the "-p linux-aarch64" flag in the conda-lock command (if not already there).
- Run the update lock script on all yaml files that have been added or edited
./update_lock.sh R_smkosina01.yaml
./update_lock.sh R_tidyverse.yaml
./update_lock.sh python.yaml
./update_lock.sh ...
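When several environments change at once, a small helper can enumerate the environment files so none is forgotten. The sketch below is an illustration only: list_env_files is a hypothetical name, and it assumes environment files follow the python.yaml and R_*.yaml naming described above, with lock files ending in -lock.yaml.

```shell
# Print conda environment definition files in a directory, skipping generated lock files.
list_env_files() {
    dir="$1"
    for f in "$dir"/python.yaml "$dir"/R_*.yaml; do
        # Skip patterns that matched nothing.
        [ -e "$f" ] || continue
        case "$f" in
            *-lock.yaml) continue ;;
        esac
        printf '%s\n' "$f"
    done
}
```

From the labkey directory, the output of list_env_files . can be fed to ./update_lock.sh one file at a time. Relocking an unchanged environment may still pull in newer package builds, so limiting the run to files you actually edited keeps the diff small.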
The SPIN pods that run the LIMS (labkey) and the data backup system (backup_restore) are set up from images built on your local machine and pushed to the NERSC container registry. While these images already exist in the registry, upgrading the LabKey or postgres versions used by labkey and backup_restore, respectively, requires rebuilding the images. Follow these general steps to build, tag, and push new images, then see the sections below for specific instructions for labkey and backup_restore.
- Install docker or podman on your local machine.
- Git clone (or pull) this repo to your local machine:
git clone https://github.com/biorack/labkey_deploy
- Enter the repo directory and ensure it matches the structure shown in the tree -L 2 output above.
- If building the docker images from a Macbook with Apple's M1/M2 architecture, first install the buildx kit and have it running in your local docker desktop. This will allow you to build an image that can be run on perlmutter AMD architecture. The docker build instructions described below will detect if you are running from a machine with Apple arch and use the buildx kit accordingly.
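The kind of architecture detection mentioned in the last step can be sketched as follows. This is not copied from the repo's Makefile, whose actual logic may differ; build_command is a hypothetical helper, and the sketch assumes an Apple silicon Mac reports Darwin and arm64 from uname.

```shell
# Print the docker build command suited to the given OS and CPU architecture.
# Both arguments default to the values reported by uname on the current machine.
build_command() {
    os="${1:-$(uname -s)}"
    arch="${2:-$(uname -m)}"
    if [ "$os" = "Darwin" ] && [ "$arch" = "arm64" ]; then
        # Apple silicon: cross-build an x86-64 image through buildx.
        echo "docker buildx build --platform linux/amd64"
    else
        echo "docker build"
    fi
}
```

Calling build_command with no arguments shows which form applies to your machine before you start a long build.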
- Enter the labkey subdirectory (cd labkey_deploy/labkey/)
- Edit the make_command.sh so that:
  - The LABKEY_VERSION flag matches the new version for which you're trying to create an image (e.g., LABKEY_VERSION=24.3.4-6).
  - The NEW_DOWNLOAD variable is 1 if you're downloading the LabKey software from online for the update. This will run scripts/download_lims_distribution.sh, which downloads the LIMS distribution, creates the correct directory structure, and moves the required files to their location in the repo before building the image. Set the NEW_DOWNLOAD variable to 0 to skip this process (e.g., if you're troubleshooting and already have the LabKey files downloaded/moved into the repo).
  - If you do not know it, you can find the LabKey version number by filling out the LabKey download request form and then looking at the URL for the tar.gz download.
- Run docker login registry.spin.nersc.gov and enter your credentials for the registry.
- Run ./make_command.sh
  - This command will run through the labkey dir Makefile and run a login, build, tag, push sequence to containerize the labkey docker image on the NERSC repository.
  - If NEW_DOWNLOAD is set to 1, you should see the LabKey software curl download.
  - Then you should see the docker image get built, tagged, and pushed. Navigate to the NERSC registry's metabolomics project directory and ensure your tagged image is present.
  - Note on July 26, 2024: An error occurred during the docker build phase that was solved by changing the GID and UID for mamba from 1000 to 999. If this occurs in the future, you may need to change to another value.
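After the push it helps to confirm that the tag you are about to deploy carries the version you intended. The sketch below assumes the tag layout labkey<LABKEY_VERSION>_<timestamp>, which is inferred from the example tag labkey24.3.4-6_2024-06-24-11-40 in this README rather than read from the Makefile. The function names are hypothetical; the registry path is the one used in the deploy command in this README.

```shell
# Assumed tag layout: labkey<LABKEY_VERSION>_<timestamp>

# Extract the LabKey version from a tag.
tag_version() {
    rest="${1#labkey}"        # drop the leading "labkey"
    echo "${rest%%_*}"        # keep everything before the first underscore
}

# Extract the build timestamp from a tag.
tag_timestamp() {
    echo "${1#*_}"            # keep everything after the first underscore
}

# Build the full image reference for a tag.
full_image() {
    echo "registry.nersc.gov/m2650/lims/labkey/community:$1"
}
```

Comparing tag_version against the LABKEY_VERSION set in make_command.sh is a quick way to catch a stale tag copied from an earlier build.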
- Enter the backup_restore subdirectory (cd labkey_deploy/backup_restore/)
- Optionally, edit the Dockerfile and update the base image in FROM postgres:15-alpine. If you do update this base image, you should edit the image: line of the repo's db/db.yaml file to match.
- Run ./make_command.sh
  - This command will run through the backup_restore dir Makefile and run a login, build, tag, push sequence to containerize the backup+restore docker image on the NERSC repository.
  - You should see the docker image get built, tagged, and pushed. Navigate to the NERSC registry's metabolomics project directory and ensure your tagged image is present.
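Because the base image in the backup_restore Dockerfile and the image: line in db/db.yaml have to be kept in step by hand, a quick comparison can catch a mismatch before deploying. This is a minimal sketch under stated assumptions: the helper names are hypothetical, it reads only the first FROM line and the first image: key, and it compares the two strings exactly, so a registry prefix or quoting in db.yaml would count as a difference.

```shell
# Print the image named on the first FROM line of a Dockerfile.
dockerfile_base() {
    sed -n 's/^FROM[[:space:]]\{1,\}\([^[:space:]]*\).*/\1/p' "$1" | head -n 1
}

# Print the value of the first "image:" key in a Kubernetes YAML file.
yaml_image() {
    sed -n 's/^[[:space:]-]*image:[[:space:]]*\([^[:space:]]*\).*/\1/p' "$1" | head -n 1
}

# Succeed only when both files name the same, non-empty image reference.
images_match() {
    base=$(dockerfile_base "$1")
    [ -n "$base" ] && [ "$base" = "$(yaml_image "$2")" ]
}
```

From the repo root, images_match backup_restore/Dockerfile db/db.yaml returns a non-zero status when the two have drifted apart.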
Once the software update(s), local image building, and push to the NERSC registry are complete, push your local repo changes to labkey_deploy main (or your branch, then merge with main).
Now that images are updated, deploy the LIMS in SPIN.
The LIMS, backup+restore, postgres db, and load balancer are started up in SPIN using the kubernetes-based deploy script deploy.sh in the main repo directory. The deployment must happen on a NERSC system (e.g., a perlmutter login node).
- Git clone (or pull) this repo on perlmutter:
git clone https://github.com/biorack/labkey_deploy
- In the root directory of the repo, create a .secrets file:
cd labkey_deploy
touch .secrets
chmod 600 .secrets
echo "POSTGRES_PASSWORD=MyPostgresPassWord" > .secrets
echo "MASTER_ENCRYPTION_KEY=MyLabkeyEncryptionKey" >> .secrets
- Secrets can be identified within the existing SPIN app by starting a Rancher terminal and echoing the environment variables above, or on perlmutter in the current deployment location.
- In the root directory of the repo, create the following files containing your TLS private key and certificate:
- .tls.metatlas.lbl.gov.key (if working with the dev instance, use .tls.metatlas-dev.lbl.gov.key)
- .tls.metatlas.lbl.gov.pem (if working with the dev instance, use .tls.metatlas-dev.lbl.gov.pem)
- The certificate should be PEM encoded, contain the full chain, and be in reverse order (your cert at top to root cert at bottom).
- To obtain a certificate for metatlas.lbl.gov, follow these instructions.
- Ensure the .tls.key file is only readable by you, e.g.:
chmod 600 .tls.metatlas.lbl.gov.key
- Run the deployment script from the root directory of the repo on the perlmutter login node:
deploy.sh --labkey registry.nersc.gov/m2650/lims/labkey/community:labkeyVERSION_YYYY-MM-DD-HH-SS --backup registry.nersc.gov/m2650/lims/labkey/community:backup_restore_YYYY-MM-DD-HH
  - You'll need to pass flags for the correct tags of the labkey and backup_restore docker images. Set the version and timestamps to match the tags in the registry (these are also printed locally after running make_command.sh).
  - If deploying when the system is inactive (i.e., no pods are running and/or the persistent volumes do not already contain a populated database and filesystem), pass the --new flag. The --new flag will restore backups of both the database and the filesystem where labkey stores files. By default, --new uses the most recent backups, but you can use --timestamp to select a specific backup. See deploy.sh for more details on flags, including deployment to the SPIN production cluster (default) vs. the development cluster.
- While the deploy runs, you should see useful messages printed to standard output on the login node, and you can watch the restore, db mounting, and LabKey software boot roll out in order on Rancher.
- When successfully deployed, the LIMS should be reachable at metatlas.lbl.gov and all the pods should have ready status in Rancher. Troubleshooting can be done in Rancher by clicking the three-dot button to the right of a workload and executing a shell or looking at logs.
- It is also a good idea to manually run the backup cronjob from Rancher and check that it is working properly by looking at the backup location (currently /global/cfs/cdirs/metatlas/projects/lims_backups/pg_dump/lims-24).
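Before running deploy.sh, the secrets and TLS files described in the steps above can be checked with a short preflight. This sketch is not part of the repo: preflight and is_private are hypothetical helpers, and the only facts taken from this README are the file names, the two variable names, and the 600 permission on .secrets and the key.

```shell
# Succeed if the file exists and is readable/writable by its owner only (mode 600).
is_private() {
    [ -f "$1" ] || return 1
    perms=$(ls -ld "$1" | cut -c1-10)
    [ "$perms" = "-rw-------" ]
}

# Check the files expected in the repo root and print every problem found.
# Usage: preflight REPO_ROOT [HOSTNAME]   (HOSTNAME defaults to the production host)
preflight() {
    root="$1"
    host="${2:-metatlas.lbl.gov}"
    rc=0
    is_private "$root/.secrets" || { echo "missing or not mode 600: .secrets"; rc=1; }
    for key in POSTGRES_PASSWORD MASTER_ENCRYPTION_KEY; do
        # Require a non-empty value after the equals sign.
        grep -q "^${key}=." "$root/.secrets" 2>/dev/null || { echo "not set in .secrets: $key"; rc=1; }
    done
    is_private "$root/.tls.$host.key" || { echo "missing or not mode 600: .tls.$host.key"; rc=1; }
    [ -s "$root/.tls.$host.pem" ] || { echo "missing or empty: .tls.$host.pem"; rc=1; }
    return "$rc"
}
```

Running preflight . from the repo root prints nothing and returns 0 when everything is in place. Pass metatlas-dev.lbl.gov as the second argument when preparing a dev deployment.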
SPIN has a development cluster in addition to the production cluster where a typical LIMS deployment occurs. Sometimes it is useful to test a new LabKey version in the development cluster before moving to the production cluster, for example after a major version change or when skipping versions during an update. To deploy in the development environment, use the following steps:
- Follow steps 1-4 in the deploy instructions above.
- Run the deployment script from the root directory of the repo on the perlmutter login node:
deploy.sh --dev --new --labkey registry.nersc.gov/m2650/lims/labkey/community:labkeyVERSION_YYYY-MM-DD-HH-SS --backup registry.nersc.gov/m2650/lims/labkey/community:backup_restore_YYYY-MM-DD-HH
  - Note that you'll need to use the --dev flag, which spins up workloads in the SPIN development cluster
- This LIMS deployment is reachable at metatlas-dev.lbl.gov and will not interfere with the current production LIMS. This is the advantage of the dev cluster.
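Since the production and dev deployments differ in hostname and in the TLS file names expected in the repo root, a small lookup can keep the two straight in helper scripts. The function names and the prod/dev labels below are hypothetical; the hostnames and file name pattern are the ones given in this README.

```shell
# Map a deployment target to its public hostname.
lims_host() {
    case "$1" in
        dev)  echo "metatlas-dev.lbl.gov" ;;
        prod) echo "metatlas.lbl.gov" ;;
        *)    echo "unknown target: $1" >&2; return 1 ;;
    esac
}

# Print the TLS key and certificate file names for a target, one per line.
tls_files() {
    host=$(lims_host "$1") || return 1
    printf '.tls.%s.key\n.tls.%s.pem\n' "$host" "$host"
}
```

For example, tls_files dev lists the two files that must exist before running deploy.sh with --dev.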