Multi scheduler support #17

Merged
merged 41 commits into from
Jan 27, 2023
Commits
e7eaff9
Started with multi scheduler suppport, first build schedulers from ya…
sverhoeven Nov 18, 2022
fb90268
Add exmple config file
sverhoeven Nov 18, 2022
89345d8
Added config parser for applications, schedulers and file systems.
sverhoeven Nov 21, 2022
0566939
Fix flake8 & mypy errors
sverhoeven Nov 21, 2022
9c2dad2
Read and use config.yaml
sverhoeven Nov 21, 2022
8e25880
Move stuff around
sverhoeven Nov 22, 2022
ca05fc4
Move job root dir to config + default for destination.filesystem
sverhoeven Nov 22, 2022
5e053c1
Standardize ssh config naming
sverhoeven Nov 22, 2022
9c4a741
Add destination picker to config
sverhoeven Nov 22, 2022
c785710
Use import_module
sverhoeven Nov 23, 2022
f3c7738
Added round robin picker
sverhoeven Jan 3, 2023
7c0425f
Split config into config and context
sverhoeven Jan 9, 2023
879c1b3
Dont test default job_root_dir
sverhoeven Jan 9, 2023
d77bfaf
Create tables before starting app
sverhoeven Jan 10, 2023
24fa549
Each destination needs a filesystem
sverhoeven Jan 10, 2023
e03ddf2
Add GET /api/application/{application} route
sverhoeven Jan 12, 2023
ee4160e
Add example config file to run haddock3 commands
sverhoeven Jan 12, 2023
b57ba0f
Move /bartender/tests to /tests + /bartender to /src/bartender
sverhoeven Jan 13, 2023
8e97b4f
Add missing src/bartender/db/__init__.py
sverhoeven Jan 13, 2023
f816a5b
Rename src/bartender/_ssh_utils.py -> src/bartender/ssh_utils.py
sverhoeven Jan 13, 2023
60309b9
Add Sphinx doc site
sverhoeven Jan 13, 2023
be87b44
Merge remote-tracking branch 'origin/main' into multi-scheduler
sverhoeven Jan 13, 2023
58679bd
Merge remote-tracking branch 'origin/main' into multi-scheduler
sverhoeven Jan 13, 2023
a36c4a5
Use picker module
sverhoeven Jan 13, 2023
030734e
Use localized path on remote scheduler + download when scheduler says…
sverhoeven Jan 13, 2023
6461be3
Tell poetry about src layout
sverhoeven Jan 16, 2023
8cbad0d
Added TODOs
sverhoeven Jan 16, 2023
8bc4f83
Update dependencies
sverhoeven Jan 16, 2023
4d0ee08
Picker now in own module
sverhoeven Jan 16, 2023
95db1ab
Merge branch 'slurm-scheduler' into multi-scheduler
sverhoeven Jan 24, 2023
d346fd4
Merge remote-tracking branch 'origin/main' into multi-scheduler
sverhoeven Jan 24, 2023
e5570da
Improve picker docs
sverhoeven Jan 27, 2023
df7da14
Simplify test names
sverhoeven Jan 27, 2023
f13d4bb
Prefer BaseModel over pydantic.dataclass or dataclass.dataclass
sverhoeven Jan 27, 2023
673a18e
Right angle
sverhoeven Jan 27, 2023
86e2571
Dont use BaseModel for class with abstract classes as attributes
sverhoeven Jan 27, 2023
ba1f341
Where config value must be
sverhoeven Jan 27, 2023
ebc4fd1
Update src/bartender/destinations.py
sverhoeven Jan 27, 2023
c069295
Dont set default job root dir as it is not validated
sverhoeven Jan 27, 2023
2cef6ed
Use pydantic to validate default value
sverhoeven Jan 27, 2023
e2e0cd6
Merge branch 'multi-scheduler' of github.com:i-VRESSE/bartender into …
sverhoeven Jan 27, 2023
1 change: 1 addition & 0 deletions .gitignore
@@ -74,6 +74,7 @@ instance/

# Sphinx documentation
docs/_build/
docs/autoapi/

# PyBuilder
.pybuilder/
3 changes: 2 additions & 1 deletion .pre-commit-config.yaml
@@ -59,4 +59,5 @@ repos:
types: [python]
pass_filenames: false
args:
- "bartender"
- "src"
- "tests"
122 changes: 94 additions & 28 deletions README.md
@@ -7,6 +7,8 @@
- [Project structure](#project-structure)
- [Configuration](#configuration)
- [Applications](#applications)
- [Job destinations](#job-destinations)
- [Destination picker](#destination-picker)
- [User management](#user-management)
- [GitHub login](#github-login)
- [Orcid sandbox login](#orcid-sandbox-login)
@@ -22,6 +24,8 @@
- [Reverting migrations](#reverting-migrations)
- [Migration generation](#migration-generation)
- [Running tests](#running-tests)
- [Documentation](#documentation)
- [Build](#build)

***

@@ -57,11 +61,7 @@ This project was generated using [fastapi_template](https://github.com/s3rius/Fa
poetry install
```

5. Run the application and the database

```bash
bartender serve
```
5. Run the database for storing users and jobs.

Important: **In another terminal**

@@ -74,13 +74,19 @@ This project was generated using [fastapi_template](https://github.com/s3rius/Fa
postgres:13.6-bullseye
```

6. Migrate the database
6. Create tables in the database

```bash
alembic upgrade "head"
```

7. Go to the interactive API documentation generated by FastAPI
7. Run the application

```bash
bartender serve
```

8. Go to the interactive API documentation generated by FastAPI

<http://localhost:8000/api/docs>

@@ -89,27 +95,34 @@ This project was generated using [fastapi_template](https://github.com/s3rius/Fa
## [Project structure](#project-structure)

```bash
$ tree "bartender"
bartender
├── db # module contains db configurations
│   ├── dao # Data Access Objects. Contains different classes to interact with database.
│   └── models # Package contains different models for ORMs.
├── __main__.py # Startup script. Starts uvicorn.
├── services # Package for different external services such as rabbit or redis etc.
├── settings.py # Main configuration settings for project.
├── static # Static content.
├── tests # Tests for project.
└── conftest.py # Fixtures for all tests.
└── web # Package contains web server. Handlers, startup config.
├── api # Package with all handlers.
│   └── router.py # Main router.
├── application.py # FastAPI application configuration.
└── lifetime.py # Contains actions to perform on startup and shutdown.
$ tree .
├── tests # Tests for project.
│   └── conftest.py # Fixtures for all tests.
├── docs # Documentation for project.
│   ├── index.rst # Main documentation page.
│   └── conf.py # Sphinx config file.
└── src
    └── bartender
        ├── db # module contains db configurations
        │   ├── dao # Data Access Objects. Contains different classes to interact with database.
        │   └── models # Package contains different models for ORMs.
        ├── __main__.py # Startup script. Starts uvicorn.
        ├── services # Package for different external services such as rabbit or redis etc.
        ├── settings.py # Main configuration settings for project.
        ├── static # Static content.
        └── web # Package contains web server. Handlers, startup config.
            ├── api # Package with all handlers.
            │   └── router.py # Main router.
            ├── application.py # FastAPI application configuration.
            └── lifetime.py # Contains actions to perform on startup and shutdown.
```

## [Configuration](#configuration)

This application can be configured with environment variables.
This application can be configured with environment variables and a `config.yaml` file.
The environment variables are for FastAPI settings like the HTTP port and user management.
The `config.yaml` file is for non-FastAPI configuration, such as which [applications can be submitted](#applications) and [where they should be submitted](#job-destinations).
See [config-example.yaml](config-example.yaml) for an example of a `config.yaml` file.

You can create `.env` file in the root directory and place all
environment variables here.
@@ -135,18 +148,53 @@ You can read more about BaseSettings class here: <https://pydantic-docs.helpmanu

Bartender accepts jobs for different applications.

Applications can be configured with the `BARTENDER_APPLICATIONS` environment variable.
Applications can be configured in the `config.yaml` file under `applications` key.

For example

```env
BARTENDER_APPLICATIONS='{"app1": {"command": "app1 $config", "config": "workflow.cfg"}, "app2": {"command": "app2 $config", "config": "workflow.cfg"}}'
```yaml
applications:
app1:
command: app1 $config
config: workflow.cfg
```

* The key is the name of the application
* The `config` key is the config file that must be present in the uploaded archive.
* The `command` key is the command executed in the directory of the unpacked archive that the consumer uploaded. The `$config` in the command string will be replaced with the value of the `config` key.
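
The `$config` replacement described above can be sketched in Python. This is only an illustration: it assumes the substitution behaves like `string.Template` from the standard library, and `build_command` is a hypothetical helper, not a function taken from bartender.

```python
from string import Template

# Hypothetical application entry mirroring the `applications` example above.
application = {"command": "app1 $config", "config": "workflow.cfg"}


def build_command(app: dict) -> str:
    """Replace `$config` in the command with the value of the `config` key."""
    return Template(app["command"]).substitute(config=app["config"])


print(build_command(application))
```

With the example entry, the command that would run in the unpacked archive directory is `app1 workflow.cfg`.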

### [Job destinations](#job-destinations)

Bartender can run jobs in different destinations.

A destination is a combination of a scheduler and a filesystem.
Supported schedulers
* memory: Scheduler which has a queue in memory and can run a specified number of jobs (slots) concurrently.
* slurm: Scheduler which calls commands of the [Slurm batch scheduler](https://slurm.schedmd.com/) on either the local machine or a remote machine via SSH.

Supported file systems
* local: Uploading or downloading of files does nothing
* sftp: Uploading or downloading of files is done using SFTP.

When the filesystem is on a remote system (with a non-shared file system or a different user) then
* the input files will be uploaded before submission to the scheduler and
* the output files will be downloaded after the job has completed.

Destinations can be configured in the `config.yaml` file under `destinations` key.
By default a single slot in-memory scheduler with a local filesystem is used.
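
As a rough sketch of the rules above, the following Python checks a parsed `destinations` mapping and falls back to the default when none is configured. The function name `resolve_destinations` and the key `default` for the fallback destination are hypothetical; only the supported type names and the default combination come from the text above.

```python
import copy

SUPPORTED_SCHEDULERS = {"memory", "slurm"}
SUPPORTED_FILESYSTEMS = {"local", "sftp"}

# Default described above: single-slot in-memory scheduler, local filesystem.
# The destination name "default" is an assumption made for this sketch.
DEFAULT_DESTINATIONS = {
    "default": {
        "scheduler": {"type": "memory", "slots": 1},
        "filesystem": {"type": "local"},
    },
}


def resolve_destinations(config: dict) -> dict:
    """Return the configured destinations, or the default if none are given."""
    destinations = config.get("destinations")
    if not destinations:
        return copy.deepcopy(DEFAULT_DESTINATIONS)
    for name, destination in destinations.items():
        scheduler = destination.get("scheduler", {}).get("type")
        filesystem = destination.get("filesystem", {}).get("type")
        if scheduler not in SUPPORTED_SCHEDULERS:
            raise ValueError(f"{name}: unsupported scheduler {scheduler!r}")
        if filesystem not in SUPPORTED_FILESYSTEMS:
            raise ValueError(f"{name}: unsupported filesystem {filesystem!r}")
    return destinations
```

The real project validates its configuration differently (the commit log mentions pydantic); this sketch only shows the shape of the data.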

### [Destination picker](#destination-picker)

If you have multiple applications and job destinations, you need some way to specify to which destination a job submission should go.

A Python function can be used to pick to which destination a job should go.

To use a custom picker function set `destination_picker` in `config.yaml` file.
The value should be formatted as `<module>:<function>`. For example, to rotate over each destination use `bartender.picker:pick_round` as value.
The picker function should have type `bartender.picker.DestinationPicker`.

By default jobs are submitted to the first destination.
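
A minimal sketch of how such pickers and the `<module>:<function>` lookup could work is given below. The picker signature used here (application name plus a list of destination names) is a simplification assumed for illustration; the actual `bartender.picker.DestinationPicker` type may take different arguments. `make_round_robin` and `import_picker` are hypothetical helper names.

```python
from importlib import import_module
from itertools import count
from typing import Callable, List

# Assumed, simplified picker signature (not the real DestinationPicker type).
Picker = Callable[[str, List[str]], str]


def pick_first(application_name: str, destination_names: List[str]) -> str:
    """Always pick the first configured destination."""
    return destination_names[0]


def make_round_robin() -> Picker:
    """Build a picker that rotates over the destinations on each call."""
    counter = count()

    def pick_round(application_name: str, destination_names: List[str]) -> str:
        return destination_names[next(counter) % len(destination_names)]

    return pick_round


def import_picker(value: str) -> Callable:
    """Resolve a `<module>:<function>` string to the function it names."""
    module_name, function_name = value.split(":")
    return getattr(import_module(module_name), function_name)
```

Keeping the counter inside a closure means each round robin picker has its own rotation state, so two pickers created separately do not interfere with each other.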

## [User management](#user-management)

For secure auth add `BARTENDER_SECRET=<some random string>` to `.env` file.
@@ -279,7 +327,7 @@ Use the following steps to run a job:
6. Retrieve result. The word count application (`wc`) outputs to the stdout.
1. Try out the `GET /api/job/{jobid}/stdout`
2. Use job identifier retrieved by submit request as `jobid` parameter value.
3. Should see something like `404 1556 12928 README.md`.
3. Should see something like `433 1793 14560 README.md`.
Where numbers are counts for newlines, words, bytes.
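
To make the three numbers concrete, here is a small Python approximation of what `wc` reports. It is a sketch, not how bartender produces the output: the job simply runs the `wc` command, and `wc_counts` is a hypothetical helper that treats words as runs of non-whitespace bytes.

```python
from typing import Tuple


def wc_counts(data: bytes) -> Tuple[int, int, int]:
    """Return (newlines, words, bytes) for the given content, like `wc`."""
    return data.count(b"\n"), len(data.split()), len(data)


newlines, words, size = wc_counts(b"hello world\nfoo\n")
print(newlines, words, size)
```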

### Haddock3 example
@@ -423,3 +471,21 @@ To get a PostgreSQL terminal do
```bash
docker exec -ti <id or name of docker container> psql -U bartender
```

## [Documentation](#documentation)

### Build

First install dependencies with

```shell
poetry install --with docs
```

Build with
```shell
cd docs
make html
```

Creates documentation site at [docs/_build/html](docs/_build/html/index.html).
2 changes: 1 addition & 1 deletion alembic.ini
@@ -1,5 +1,5 @@
[alembic]
script_location = bartender/db/migrations
script_location = src/bartender/db/migrations
file_template = %%(year)d-%%(month).2d-%%(day).2d-%%(hour).2d-%%(minute).2d_%%(rev)s
prepend_sys_path = .
output_encoding = utf-8
34 changes: 0 additions & 34 deletions bartender/_ssh_utils.py

This file was deleted.

12 changes: 0 additions & 12 deletions bartender/schedulers/dependencies.py

This file was deleted.

Empty file removed bartender/tests/web/__init__.py
Empty file.
30 changes: 0 additions & 30 deletions bartender/web/api/applications/submit.py

This file was deleted.

44 changes: 0 additions & 44 deletions bartender/web/api/job/sync.py

This file was deleted.

52 changes: 52 additions & 0 deletions config-example.yaml
@@ -0,0 +1,52 @@
# By default the files of jobs are stored in /tmp/jobs
# job_root_dir: /tmp/jobs
# By default jobs are submitted to the first destination
# destination_picker: bartender.picker:pick_first
# To use a custom picker set `destination_picker` to a `<module>:<function>`
# The picker should have type bartender.picker.DestinationPicker .
applications:
# The label of the application
wc:
# Command line interface command to run application.
# `$config` occurrences will be replaced with the value of the config parameter.
command: wc $config
# Name of config file that the application needs
config: README.md
destinations:
local:
scheduler:
type: memory
slots: 1
filesystem:
type: local
## Example of running jobs by current user on snellius.
# remote:
# scheduler:
# type: slurm
# partition: thin
# ssh_config:
# hostname: snellius.surf.nl
# filesystem:
# type: sftp
# ssh_config:
# hostname: snellius.surf.nl
# entry: /scratch-shared/bartender/jobs
## Example of running jobs on a slurm Docker container.
## Start a container with `docker run --detach --publish 10022:22 xenonmiddleware/slurm:20`
# slurmcontainer:
# scheduler:
# type: slurm
# partition: mypartition
# ssh_config:
# port: 10022
# hostname: localhost
# username: xenon
# password: javagat
# filesystem:
# type: sftp
# ssh_config:
# port: 10022
# hostname: localhost
# username: xenon
# password: javagat
# entry: /home/xenon