Merge pull request #233 from latchbio/aidan/dind+bugfix
Aidan/dind+bugfix
AidanAbd authored Feb 21, 2023
2 parents 470de08 + 2564b1c commit d8da2cc
Showing 15 changed files with 219 additions and 82 deletions.
12 changes: 12 additions & 0 deletions CHANGELOG.md
Expand Up @@ -18,6 +18,18 @@ Types of changes

### Fixed

* Internal state file should be automatically created when running `latch register` and `latch develop`

### Added

* `latch init`: Docker in Docker template workflow
* `latch init`: Docker base image
* Small, medium, and large tasks use the [Sysbox runtime](https://github.com/nestybox/sysbox) to run Docker and other system software within task containers

## 2.13.1 - 2023-02-17

### Fixed

* Add latch/latch_cli/services/init/common to pypi release

## 2.13.0 - 2023-02-17
Expand Down
10 changes: 5 additions & 5 deletions docs/source/api/latch_cli.rst
Expand Up @@ -57,18 +57,18 @@ latch\_cli.tui module
:undoc-members:
:show-inheritance:

latch\_cli.types module
latch\_cli.utils module
-----------------------

.. automodule:: latch_cli.types
.. automodule:: latch_cli.utils
:members:
:undoc-members:
:show-inheritance:

latch\_cli.utils module
-----------------------
latch\_cli.workflow\_config module
----------------------------------

.. automodule:: latch_cli.utils
.. automodule:: latch_cli.workflow_config
:members:
:undoc-members:
:show-inheritance:
Expand Down
10 changes: 3 additions & 7 deletions docs/source/basics/defining_environment.md
Expand Up @@ -2,11 +2,7 @@

Workflow code is rarely free of dependencies. It may require python or system packages or make use of environment variables. For example, a task that downloads compressed reference data from AWS S3 will need the `aws-cli` and `unzip` [APT](https://en.wikipedia.org/wiki/APT_(software)) packages, then use the `pyyaml` python package to read the included metadata.

Latch manages the execution environment of a workflow using Dockerfiles. A Dockerfile specifies the environment in which the workflow runs.

Latch has three [base images](https://docs.docker.com/build/building/base-images/), one without hardware acceleration drivers, one with CUDA drivers, and one with OPENCL drivers. To use the CUDA or OPENCL base image, use the `--cuda` or `--opencl` flags when running `latch init`.

The workflow environment is encapsulated in [a Docker container](https://en.wikipedia.org/wiki/Docker_(software)), which is created from a recipe defined in [a text document called Dockerfile.](https://docs.docker.com/engine/reference/builder/) In most cases these recipes are repetitive and unnecessarily complicated, so Latch will automatically generate one using conventional dependency lists and heuristics. To use a handwritten Dockerfile, [run the eject command](#ejecting-auto-generation).
The workflow environment is encapsulated in [a Docker container](https://en.wikipedia.org/wiki/Docker_(software)), which is created from a recipe defined in [a text document named Dockerfile](https://docs.docker.com/engine/reference/builder/). Latch provides [four baseline environments](../subcommands.md#base-image--b), and every Latch workflow inherits from one of them. In most cases, writing the `Dockerfile` by hand is unnecessary, so Latch will automatically generate one using conventional dependency lists and heuristics. To use a handwritten Dockerfile, [run the eject command](#ejecting-auto-generation).

## Automatic Dockerfile Generation

Expand Down Expand Up @@ -258,9 +254,9 @@ To exclude files from the build use [a `.dockerignore`.](https://docs.docker.com

The default `.dockerignore` includes files auto-generated by Latch.

## Docker Limitations
## GPU Task Limitations

Docker containers have limitations on some system administration functions. While each workflow runs as a genuine `root` user (UID 0), commands that require [kernel capabilities](https://man7.org/linux/man-pages/man7/capabilities.7.html) will fail with "Permission denied". This includes `mount` and `chroot` among others.
Commands that require certain [kernel capabilities](https://man7.org/linux/man-pages/man7/capabilities.7.html) will fail with "Permission denied" in GPU tasks (`small-gpu-task`, `large-gpu-task`). This includes `mount` and `chroot` among others.

---

Expand Down
4 changes: 4 additions & 0 deletions docs/source/basics/environment/docker_recipes.md
Expand Up @@ -9,3 +9,7 @@ RUN curl -L https://sourceforge.net/projects/bowtie-bio/files/bowtie2/2.4.4/bowt
unzip bowtie2-2.4.4.zip &&\
mv bowtie2-2.4.4-linux-x86_64 bowtie2
```

## Run Docker in Docker

Use [`--base-image docker` with `latch init`](../../subcommands.md#latch-init) to create a workflow whose base environment includes Docker. For an example that runs a containerized `bowtie2` aligner in a Latch workflow, run `latch init --template docker my_bowtie2_example`.
12 changes: 7 additions & 5 deletions docs/source/subcommands.md
Expand Up @@ -20,13 +20,15 @@ One of `r`, `conda`, `subprocess`, `empty`. If not provided, user will be prompt

Generate a Dockerfile for the workflow instead of relying on [auto-generation.](basics/defining_environment.md#automatic-dockerfile-generation)

#### `--cuda`
#### `--base-image`, `-b`

Make cuda drivers available to task code.
Each environment is built on one of the following base distributions:
- `default` with no additional dependencies
- `cuda` with Nvidia CUDA/cuDNN (CUDA 11.4.2, cuDNN 8) drivers
- `opencl` with OpenCL (Ubuntu 18.04) drivers
- `docker` with the Docker daemon

#### `--opencl`

Make opencl drivers available to task code.
Only one base image can be selected at a time. If the option is omitted or set to `default`, only the minimum packages needed to execute the workflow are installed.
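
The single-choice behavior described above can be sketched with an enumeration. This is a minimal illustration, not the actual implementation: `BaseImage` and `parse_base_image` are hypothetical names, and the real `BaseImageOptions` class in `latch_cli.workflow_config` is not shown in this diff and may differ. Case-insensitive matching is assumed from the `case_sensitive=False` setting on the CLI option in this commit.

```python
from enum import Enum


class BaseImage(str, Enum):
    """Hypothetical stand-in for the four documented base images."""

    default = "default"
    cuda = "cuda"
    opencl = "opencl"
    docker = "docker"


def parse_base_image(value: str = "default") -> BaseImage:
    """Return the matching base image, ignoring case; reject unknown names."""
    try:
        return BaseImage[value.lower()]
    except KeyError:
        valid = ", ".join(BaseImage.__members__)
        raise ValueError(
            f"invalid base image {value!r}; choose one of: {valid}"
        ) from None
```

Because the function returns exactly one member, passing two base images at once is not expressible, which matches the "one at a time" rule.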

## `latch register`

Expand Down
10 changes: 10 additions & 0 deletions latch/resources/tasks.py
Expand Up @@ -87,7 +87,9 @@ def _get_large_pod() -> Pod:
primary_container.resources = resources

return Pod(
annotations={"io.kubernetes.cri-o.userns-mode": "auto:size=65536"},
pod_spec=V1PodSpec(
runtime_class_name="sysbox-runc",
containers=[primary_container],
tolerations=[
V1Toleration(effect="NoSchedule", key="ng", value="cpu-96-spot")
Expand All @@ -108,7 +110,9 @@ def _get_medium_pod() -> Pod:
primary_container.resources = resources

return Pod(
annotations={"io.kubernetes.cri-o.userns-mode": "auto:size=65536"},
pod_spec=V1PodSpec(
runtime_class_name="sysbox-runc",
containers=[primary_container],
tolerations=[
V1Toleration(effect="NoSchedule", key="ng", value="cpu-32-spot")
Expand All @@ -129,7 +133,9 @@ def _get_small_pod() -> Pod:
primary_container.resources = resources

return Pod(
annotations={"io.kubernetes.cri-o.userns-mode": "auto:size=65536"},
pod_spec=V1PodSpec(
runtime_class_name="sysbox-runc",
containers=[primary_container],
),
primary_container_name="primary",
Expand Down Expand Up @@ -289,14 +295,18 @@ def custom_task(cpu: int, memory: int):
primary_container.resources = resources
if cpu < 48 and memory < 128:
task_config = Pod(
annotations={"io.kubernetes.cri-o.userns-mode": "auto:size=65536"},
pod_spec=V1PodSpec(
runtime_class_name="sysbox-runc",
containers=[primary_container],
),
primary_container_name="primary",
)
elif cpu < 96 and memory < 180:
task_config = Pod(
annotations={"io.kubernetes.cri-o.userns-mode": "auto:size=65536"},
pod_spec=V1PodSpec(
runtime_class_name="sysbox-runc",
containers=[primary_container],
tolerations=[
V1Toleration(effect="NoSchedule", key="ng", value="cpu-96-spot")
Expand Down
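
The branch structure in `custom_task` above can be summarized as a small pure function. This is a sketch under stated assumptions: `select_node_group` is a hypothetical helper, the thresholds are copied from the two visible branches, and the behavior for larger requests is not visible in this hunk, so the sketch simply raises there. The two constants repeat the annotation and runtime class that this commit adds to each CPU pod.

```python
from typing import Optional

# Values added to the CPU task pods in this commit.
SYSBOX_ANNOTATIONS = {"io.kubernetes.cri-o.userns-mode": "auto:size=65536"}
SYSBOX_RUNTIME_CLASS = "sysbox-runc"


def select_node_group(cpu: int, memory: int) -> Optional[str]:
    """Return the toleration value for a request, or None if no toleration is added."""
    if cpu < 48 and memory < 128:
        return None
    if cpu < 96 and memory < 180:
        return "cpu-96-spot"
    # Assumption: the remaining branch is elided in the diff, so this sketch refuses.
    raise ValueError(f"unsupported request: cpu={cpu}, memory={memory}")
```

Note that the comparisons are strict, so a request for exactly 48 CPUs falls through to the second branch.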
1 change: 0 additions & 1 deletion latch/types/directory.py
Expand Up @@ -148,7 +148,6 @@ def to_python_value(
"Casting from Pathlike to LatchDir is currently not supported."
)


while get_origin(expected_python_type) == Annotated:
expected_python_type = get_args(expected_python_type)[0]

Expand Down
2 changes: 1 addition & 1 deletion latch_cli/constants.py
Expand Up @@ -7,7 +7,7 @@
@dataclass(frozen=True)
class LatchConstants:
base_image: str = (
"812206152185.dkr.ecr.us-west-2.amazonaws.com/latch-base:ace9-main"
"812206152185.dkr.ecr.us-west-2.amazonaws.com/latch-base:9c8f-main"
)

mib: int = 2**20
Expand Down
14 changes: 9 additions & 5 deletions latch_cli/docker_utils/__init__.py
Expand Up @@ -6,10 +6,11 @@
from textwrap import dedent
from typing import List

import click
import yaml

from latch_cli.constants import latch_constants
from latch_cli.types import LatchWorkflowConfig
from latch_cli.workflow_config import LatchWorkflowConfig, create_and_write_config


class DockerCmdBlockOrder(str, Enum):
Expand Down Expand Up @@ -197,14 +198,17 @@ def generate_dockerfile(pkg_root: Path, outfile: Path) -> None:

print("Generating Dockerfile")
try:
with open(pkg_root / latch_constants.pkg_config) as f:
with (pkg_root / latch_constants.pkg_config).open("r") as f:
config: LatchWorkflowConfig = LatchWorkflowConfig(**json.load(f))
print(" - base image:", config.base_image)
print(" - latch version:", config.latch_version)
except FileNotFoundError as e:
raise RuntimeError(
"Could not find a .latch/config file in the supplied directory. If your workflow was created prior to release 2.13.0, you may need to run `latch init` to generate a .latch/config file."
) from e
print(
"Could not find a .latch/config file in the supplied directory. Creating configuration"
)
create_and_write_config(pkg_root)
with (pkg_root / latch_constants.pkg_config).open("r") as f:
config: LatchWorkflowConfig = LatchWorkflowConfig(**json.load(f))

with outfile.open("w") as f:
f.write("\n".join(get_prologue(config)) + "\n\n")
Expand Down
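
The fallback in `generate_dockerfile` above follows a "create the config if it is missing, then read it" pattern. A minimal standard-library sketch of that pattern follows. The names `write_default_config` and `load_config` are hypothetical, the config path is assumed from the `.latch/config` wording in the message, and the default contents are placeholders rather than what `create_and_write_config` actually writes.

```python
import json
from pathlib import Path

# Assumed location, taken from the message text in the diff.
CONFIG_RELPATH = Path(".latch") / "config"


def write_default_config(pkg_root: Path) -> None:
    """Hypothetical stand-in for create_and_write_config; contents are placeholders."""
    path = pkg_root / CONFIG_RELPATH
    path.parent.mkdir(parents=True, exist_ok=True)
    placeholder = {"base_image": "example-image", "latch_version": "0.0.0"}
    path.write_text(json.dumps(placeholder))


def load_config(pkg_root: Path) -> dict:
    """Read the config, creating it first when it does not exist."""
    path = pkg_root / CONFIG_RELPATH
    if not path.exists():
        print("Could not find a .latch/config file. Creating configuration")
        write_default_config(pkg_root)
    # Single read path: the file is opened and parsed in one place only.
    with path.open("r") as f:
        return json.load(f)
```

Checking for the file up front, instead of catching `FileNotFoundError`, keeps one read path, so anything done after loading (such as printing the base image) happens whether or not the file had to be created.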
24 changes: 11 additions & 13 deletions latch_cli/main.py
Expand Up @@ -12,6 +12,7 @@
from latch_cli.exceptions.handler import CrashHandler
from latch_cli.services.init.init import template_flag_to_option
from latch_cli.utils import get_latest_package_version, get_local_package_version
from latch_cli.workflow_config import BaseImageOptions

latch_cli.click_utils.patch()

Expand Down Expand Up @@ -161,23 +162,20 @@ def login(connection: Optional[str]):
default=False,
)
@click.option(
"--cuda",
help="Create a user editable Dockerfile for this workflow.",
is_flag=True,
default=False,
)
@click.option(
"--opencl",
help="Create a user editable Dockerfile for this workflow.",
is_flag=True,
default=False,
"--base-image",
"-b",
help="Which base image to use for the Dockerfile.",
type=click.Choice(
list(BaseImageOptions._member_names_),
case_sensitive=False,
),
default="default",
)
def init(
pkg_name: str,
template: Optional[str] = None,
dockerfile: bool = False,
cuda: bool = False,
opencl: bool = False,
base_image: str = "default",
):
"""Initialize boilerplate for local workflow code."""

Expand All @@ -186,7 +184,7 @@ def init(

from latch_cli.services.init import init

created = init(pkg_name, template, dockerfile, cuda, opencl)
created = init(pkg_name, template, dockerfile, base_image)
if created:
click.secho(f"Created a latch workflow in `{pkg_name}`", fg="green")
click.secho("Run", fg="green")
Expand Down
63 changes: 63 additions & 0 deletions latch_cli/services/init/example_docker/assemble.py
@@ -0,0 +1,63 @@
import subprocess
from pathlib import Path

from latch import small_task
from latch.functions.messages import message
from latch.types import LatchFile, LatchOutputDir


@small_task
def assembly_task(
read1: LatchFile, read2: LatchFile, output_directory: LatchOutputDir
) -> LatchFile:

outdir = Path("/root/outputs").resolve()
outdir.mkdir(exist_ok=True)

bowtie2_cmd = [
"docker",
"run",
"--user",
"root",
"--env",
"BOWTIE2_INDEXES=/reference",
"--mount",
"type=bind,source=/root/reference,target=/reference",
"--mount",
f"type=bind,source={read1.local_path},target=/r1.fq",
"--mount",
f"type=bind,source={read2.local_path},target=/r2.fq",
"--mount",
f"type=bind,source={outdir},target=/outputs",
"biocontainers/bowtie2:v2.4.1_cv1",
"bowtie2",
"--local",
"--very-sensitive-local",
"-x",
"wuhan",
"-1",
"/r1.fq",
"-2",
"/r2.fq",
"-S",
"/outputs/covid_assembly.sam",
]

try:
# When using shell=True, we pass the entire command as a single string as
# opposed to a list since the shell will parse the string into a list
# using its own rules.
subprocess.run(" ".join(bowtie2_cmd), shell=True, check=True)
except subprocess.CalledProcessError as e:
message(
"error",
{"title": "Bowtie2 Failed", "body": f"Error: {str(e)}"},
)
raise e

# intended output path of the file in Latch console, constructed from
# the user provided output directory
output_location = f"{output_directory.remote_directory}/covid_assembly.sam"
local_sam_file = outdir / "covid_assembly.sam"

return LatchFile(str(local_sam_file), output_location)
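
The task above joins the argument list with spaces and hands the result to the shell, which would split any path containing a space into separate arguments. A minimal sketch of two alternatives follows; `run_command` and `to_shell_string` are hypothetical helper names and are not part of the template.

```python
import shlex
import subprocess
from typing import List


def run_command(cmd: List[str]) -> subprocess.CompletedProcess:
    """Run an argument list directly, with no shell parsing of the arguments."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True)


def to_shell_string(cmd: List[str]) -> str:
    """Quote each argument for cases where a single shell string is required."""
    return shlex.join(cmd)
```

Passing the list without `shell=True` is usually the simpler option. `shlex.join` (Python 3.8+) is useful when the command must be logged or passed through a shell as one string.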