Merge pull request #1 from InformaticsMatters/m2ms-1486
Maintenance release
alanbchristie authored Sep 21, 2024
2 parents a285db1 + c77313a commit eca489e
Showing 9 changed files with 98 additions and 47 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -1,5 +1,5 @@
.idea/
venv/

parameters.yaml
parameters*.yaml
vault-pass.txt
67 changes: 38 additions & 29 deletions README.md
@@ -7,6 +7,13 @@
Ansible playbooks for the Kubernetes-based execution of [fragmentor]
**Playbooks**.

This repository's `site-player` play launches a _player_ Pod in your
Kubernetes cluster. The player Pod can run each stage of our fragmentation
process: it understands how to run the `standardise`, `fragment`, `inchi`, and
`extract` playbooks (in our [fragmentor] repository). The play _injects_ your
parameter, kubeconfig and nextflow files into the player, which then runs the
fragmentor playbook you name.
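
As a rough sketch (a full worked example, including the required environment
variables, appears below), a typical invocation names the play to run and the
files to inject; the file paths here are purely illustrative: -

ansible-playbook site-player.yaml \
    -e fp_play=standardise \
    -e fp_kubeconfig_file=$HOME/.kube/config \
    -e fp_parameter_file=parameters.yaml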

Before you attempt to execute any fragmentation plays...

1. You will need a Kubernetes cluster with a ReadWriteMany storage class
@@ -38,6 +45,9 @@ Before you attempt to execute any fragmentation plays...
fragmentation/graph data.
11. You will need your Kubernetes config file.
12. You will need AWS credentials (that allow for bucket access).
13. You will need to be able to run `kubectl` from the command line,
as the `site-player` play uses it to obtain the cluster host and its IP address.
So ensure that `KUBECONFIG` is set appropriately (see the example below).
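
For example, a quick check that the `kubectl` prerequisite is satisfied, using
the same query the play itself runs: -

export KUBECONFIG=~/.kube/config
kubectl config view --minify --output 'jsonpath={.clusters[0].cluster.server}'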

## Kubernetes namespace setup
You can conveniently create the required namespace and database using our
@@ -49,11 +59,11 @@ You can conveniently create the required namespace and database using our

Start from the project root of a clone of the repository: -

$ python -m venv venv
python -m venv venv

$ source venv/bin/activate
$ pip install --upgrade pip
$ pip install -r requirements.txt
source venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt

...and create the database and corresponding namespace using an Ansible
YAML-based parameter file. Here's an example that should work for 'small'
@@ -84,13 +94,13 @@ pg_mem_limit: 4Gi

You will need to set a few Kubernetes variables...

$ export K8S_AUTH_HOST=https://example.com
$ export K8S_AUTH_API_KEY=1234
$ export K8S_AUTH_VERIFY_SSL=no
export K8S_AUTH_HOST=https://example.com
export K8S_AUTH_API_KEY=1234
export K8S_AUTH_VERIFY_SSL=no

Then run the playbook...

$ ansible-playbook site.yaml -e @parameters.yaml
ansible-playbook site.yaml -e @parameters.yaml
[...]

## Running a fragmentor play
@@ -115,30 +125,30 @@ To run a play you must define a set of play-specific parameters in the local file

Start from a virtual environment: -

$ python -m venv venv
python -m venv venv

$ source venv/bin/activate
$ pip install --upgrade pip
$ pip install -r requirements.txt
source venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt

As always, set a few key environment parameters: -

$ export K8S_AUTH_HOST=https://example.com
$ export K8S_AUTH_API_KEY=?????
$ export K8S_AUTH_VERIFY_SSL=no
export K8S_AUTH_HOST=https://example.com
export K8S_AUTH_API_KEY=?????
export K8S_AUTH_VERIFY_SSL=no

$ export KUBECONFIG=~/.kube/config
export KUBECONFIG=~/.kube/config

For access to AWS S3: -

$ export AWS_ACCESS_KEY_ID=?????
$ export AWS_SECRET_ACCESS_KEY=?????
export AWS_ACCESS_KEY_ID=?????
export AWS_SECRET_ACCESS_KEY=?????

You _name_ the play to run using our playbook's `fp_play` variable.
In this example we're running the *database reset* play and setting
the storage class to `nfs`: -

$ ansible-playbook site-player.yaml \
ansible-playbook site-player.yaml \
-e fp_play=db-server-configure_create-database \
-e fp_work_volume_storageclass=nfs

@@ -179,8 +189,7 @@ extracts:
regenerate_index: yes
hardware:
production:
parallel_jobs: 8
cluster_cores: 8
parallel_jobs: 360
sort_memory: 4GB
postgres_jobs: 8
```
@@ -192,32 +201,32 @@ hardware:
with key records.

```
$ ansible-playbook site-player.yaml \
ansible-playbook site-player.yaml \
-e fp_play=db-server-configure_create-database
```
- **Standardise**
```
$ ansible-playbook site-player.yaml -e fp_play=standardise
ansible-playbook site-player.yaml -e fp_play=standardise
```
- **Fragment**
```
$ ansible-playbook site-player.yaml -e fp_play=fragment
ansible-playbook site-player.yaml -e fp_play=fragment
```
- **InChi**
```
$ ansible-playbook site-player.yaml -e fp_play=inchi
ansible-playbook site-player.yaml -e fp_play=inchi
```
- **Extract** (a dataset to graph CSV files)
```
$ ansible-playbook site-player.yaml -e fp_play=extract
ansible-playbook site-player.yaml -e fp_play=extract
```
- **Combine** (multiple datasets into graph CSV files)
@@ -258,15 +267,15 @@ hardware:
```

```
$ ansible-playbook site-player.yaml -e fp_play=combine
ansible-playbook site-player.yaml -e fp_play=combine
```

## A convenient player query playbook
If you don't have visual access to the cluster you can run
the following playbook, which summarises the phase of the currently executing
play. It will tell you if the current play is still running.

$ ansible-playbook site-player_query.yaml
ansible-playbook site-player_query.yaml

It finishes with a summary message like this: -

@@ -282,7 +291,7 @@ ok: [localhost] => {
If the player is failing and you want to kill it (along with the Job that
launched it) you can run the kill-player playbook: -

$ ansible-playbook site-player_kill-player.yaml
ansible-playbook site-player_kill-player.yaml

---

1 change: 1 addition & 0 deletions requirements.txt
@@ -1,4 +1,5 @@
ansible == 8.7.0
dnspython == 2.6.1
jmespath == 1.0.1
kubernetes == 23.6.0
openshift == 0.13.2
12 changes: 7 additions & 5 deletions roles/player/defaults/main.yaml
@@ -8,8 +8,8 @@
fp_play: SetMe

# The user's kubernetes configuration file.
# The user must set KUBECONFIG - we do not assume ~/.kube/config
fp_kubeconfig_file: "{{ lookup('env', 'KUBECONFIG') }}"
# The user must define this variable - we no longer rely on KUBECONFIG.
fp_kubeconfig_file: SetMe

# The namespace that is expected to exist.
fp_namespace: fragmentor
@@ -31,7 +31,7 @@ fp_parameter_file: parameters.yaml
# Details of the fragmentation player container image
fp_image_registry: ''
fp_image_name: informaticsmatters/fragmentor-player
fp_image_tag: '1.1.0'
fp_image_tag: '1.2.0'

# The nextflow version to run.
# The player image generally contains the 'latest' nextflow version.
@@ -46,8 +46,10 @@ fp_image_tag: '1.1.0'
#
# See https://github.com/nextflow-io/nextflow/issues/1902
fp_nextflow_version: '21.02.0-edge'
# And the Nextflow queue size
fp_nextflow_queue_size: 100
# And the Nextflow executor queue size
fp_nextflow_executor_queue_size: 100
# And Pod pull policy
fp_nextflow_pod_image_pull_policy: 'IfNotPresent'

# A pull-secret for public images pulled from DockerHub.
# If set this is the base-64 string that can be used as the value
4 changes: 2 additions & 2 deletions roles/player/tasks/deploy.yaml
@@ -11,8 +11,8 @@

- name: Assert queue size
assert:
that: fp_nextflow_queue_size|int > 0
fail_msg: You must set a sensible 'fp_nextflow_queue_size'
that: fp_nextflow_executor_queue_size|int > 0
fail_msg: You must set a sensible 'fp_nextflow_executor_queue_size'

# Assert the Kubernetes config has been named and exists

32 changes: 32 additions & 0 deletions roles/player/tasks/main.yaml
@@ -3,6 +3,10 @@
- name: Include prep
include_tasks: prep.yaml

- name: Load parameters from {{ fp_parameter_file }}
include_vars:
file: "{{ fp_parameter_file }}"

# A kubernetes host and an API key must be set.
# Either environment variables will have been set by the user
# or AWX 'kubernetes' credentials will have injected them.
@@ -14,6 +18,34 @@
- k8s_auth_host|length > 0
- k8s_auth_api_key|length > 0

- name: Assert kubeconfig file is named
assert:
that:
- fp_kubeconfig_file|length > 0
- fp_kubeconfig_file!='SetMe'

# Discover the hostname (and an IP address) of the kubernetes cluster
# control plane. We do this to set a host alias in the Player Pod
# to avoid the need for a DNS lookup (something that may be unreliable on
# the chosen cluster).

- name: Run kubectl (to get the host)
command: kubectl config view --minify --output 'jsonpath={.clusters[0].cluster.server}'
register: k8s_host
changed_when: false

- name: Extract k8s hostname
set_fact:
k8s_hostname: "{{ k8s_host.stdout_lines[0] | urlsplit('hostname') }}"

- name: Use Python's 'dig' to get the IP address
set_fact:
k8s_ip: "{{ lookup('dig', k8s_hostname) }}"

- name: Display k8s hostname and address
debug:
msg: k8s_hostname={{ k8s_hostname }} k8s_ip={{ k8s_ip }}

# Go...

# There is no 'undeploy' - fragmentation is a 'Job'
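
The host and address discovery above amounts to two small lookups. A rough
command-line equivalent (assuming `kubectl` and `dig` are available locally;
the play itself uses Ansible's `dig` lookup, which relies on `dnspython`): -

# The API server URL (the hostname is taken from this)...
kubectl config view --minify --output 'jsonpath={.clusters[0].cluster.server}'
# ...and the IP address for that hostname
dig +short <hostname>
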
10 changes: 6 additions & 4 deletions roles/player/templates/configmap-nextflow-config.yaml.j2
@@ -11,15 +11,17 @@ metadata:
data:
config: |
process {
{% if all_image_preset_pullsecret_name|string|length > 0 %}
  pod = [nodeSelector: 'informaticsmatters.com/purpose-fragmentor=yes', imagePullSecret: '{{ all_image_preset_pullsecret_name }}']
{% else %}
  pod = [nodeSelector: 'informaticsmatters.com/purpose-fragmentor=yes']
{% endif %}
  pod = [
    nodeSelector: 'informaticsmatters.com/purpose-fragmentor=yes',
{% if all_image_preset_pullsecret_name|string|length > 0 %}
    imagePullSecret: '{{ all_image_preset_pullsecret_name }}',
{% endif %}
    imagePullPolicy: '{{ fp_nextflow_pod_image_pull_policy }}'
  ]
}
executor {
name = 'k8s'
queueSize = {{ fp_nextflow_queue_size }}
queueSize = {{ fp_nextflow_executor_queue_size }}
}
k8s {
serviceAccount = 'fragmentor'
11 changes: 7 additions & 4 deletions roles/player/templates/job.yaml.j2
@@ -23,6 +23,13 @@ spec:
matchExpressions:
- key: informaticsmatters.com/purpose-fragmentor
operator: Exists
# A host alias for the Kubernetes API.
# This ensures the host (and the IP address we provide)
# go into the Pod's /etc/hosts file and permit bypassing of DNS.
hostAliases:
- ip: "{{ k8s_ip }}"
hostnames:
- "{{ k8s_hostname }}"

{% if all_image_preset_pullsecret_name|string|length > 0 %}
imagePullSecrets:
@@ -36,11 +43,7 @@ spec:
{% else %}
image: {{ fp_image_name }}:{{ fp_image_tag }}
{% endif %}
{% if fp_image_tag in ['latest', 'stable'] %}
imagePullPolicy: Always
{% else %}
imagePullPolicy: IfNotPresent
{% endif %}
# The default termination log (here for clarity)
# But also fallback to stdout logs on error
# if there is no termination log.
6 changes: 4 additions & 2 deletions roles/player/vars/main.yaml
@@ -18,8 +18,8 @@ fp_mem_limit: 4Gi
# The home directory in the fragmentor 'player' pod
fp_player_home: /root

# How long to hold-on to the player Pod
# (keep alive for post-run debug)
# How long to hold on to the player Pod.
# If set, the player Pod remains running for the defined period.
# This gives you an opportunity to shell into the Pod and inspect
# the execution, essentially an ability to look around when the play has finished.
fp_keep_alive_seconds: 0

# General variables
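
The `fp_keep_alive_seconds` variable described above can be useful for
debugging. A rough sketch (the play name and keep-alive period are only
examples, the default `fragmentor` namespace is assumed, and you should use
`sh` if `bash` is not present in the player image): -

ansible-playbook site-player.yaml \
    -e fp_play=standardise \
    -e fp_keep_alive_seconds=3600
kubectl get pods -n fragmentor
kubectl exec -it -n fragmentor <player-pod-name> -- bash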
