Tidy documentation. #186

Merged: 1 commit, Oct 22, 2019
15 changes: 7 additions & 8 deletions README.md
@@ -8,18 +8,17 @@ It stores queues for users/projects with pod specifications and creates these po

## Documentation

-- [Design docs](./docs/design.md)
-- [Development guide](./docs/developer.md)
-- [Installation & Usage](./docs/usage.md)
+- [Design Documentation](./docs/design.md)
+- [Development Guide](./docs/developer.md)
+- [User Guide](./docs/user.md)
+- [Installation in Production](./docs/production-install.md)

## Key features
- Armada maintains a fair resource share over time (inspired by HTCondor priority)
- It can handle a large number of queued jobs (1 million+)
- It allows adding and removing clusters from the system without disruption
- By utilizing multiple Kubernetes clusters, the system can scale beyond the node limits of a single cluster

## Key concepts

**Queue:** Represents a user or project; used to maintain fair share over time; has a priority factor
@@ -31,7 +30,7 @@ It stores queues for users/projects with pod specifications and creates these po

## Try it out locally

-Assumming you have go installed.
+Assuming you have go installed.

1. Clone repository & Build (project requires go & docker installed)
```bash
@@ -62,12 +61,12 @@ docker run -d --expose=6379 --network=host redis
```

6. Start executors for each cluster, each in a separate terminal
-```
+```bash
KUBECONFIG=$(kind get kubeconfig-path --name="demoA") ARMADA_APPLICATION_CLUSTERID=demoA ./bin/executor
KUBECONFIG=$(kind get kubeconfig-path --name="demoB") ARMADA_APPLICATION_CLUSTERID=demoB ./bin/executor
```
7. Create queue & Submit job
-```
+```bash
./bin/armadactl create-queue test 1
./bin/armadactl submit ./example/jobs.yaml
./bin/armadactl watch job-set-1
15 changes: 7 additions & 8 deletions docs/design.md
@@ -22,30 +22,29 @@ All jobs are grouped into Job Sets with user specified identifier. Job set repre
### Queue
All jobs need to be placed into queues. Resource allocation is controlled using queues.

-Queues has its own priority (lower number makes queue more important). Queue current priority is calculated from combination of resources used by jobs from the queue over time and queue priority. Current priority is used to decide which jobs to run first.
+**Queue Current Priority**: Current priority is calculated from the resource usage of jobs in the queue. This number approaches the amount of resources used by the queue at a speed configured by `priorityHalfTime`. If the queue priority is `A` and the queue is using `B` amount of resources, then after the time defined by `priorityHalfTime` the new priority will be `A + (B - A) / 2`.

-Usual setup maps users or teams one to one to queues to control resource usage.
+**Queue Priority Factor**: Each queue has a priority factor which determines how important the queue is (a lower number makes the queue more important).

-To achieve fairness between users we have implemented a HTCondor like algorithm to divide resources. Each queue has a priority. When pods from a queue use some resources over time, queue priority is reduced so other queues will get more share in the future. When queues do not use resources their priority will eventually get back to initial value.
+**Queue Effective Priority** = **Queue Priority Factor** * **Queue Current Priority**
+
+To achieve fairness between queues, when Armada schedules jobs, resources are divided based on Queue Effective Priority.
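The half-time update rule and the effective-priority product described above can be sketched as follows; the function names are illustrative, not taken from the Armada codebase.

```go
package main

import "fmt"

// updatePriority applies one priorityHalfTime step: the queue's current
// priority moves halfway toward its resource usage B, per A + (B - A) / 2.
func updatePriority(priority, usage float64) float64 {
	return priority + (usage-priority)/2
}

// effectivePriority combines the per-queue factor with the usage-driven
// current priority: Effective = Factor * Current.
func effectivePriority(factor, current float64) float64 {
	return factor * current
}

func main() {
	p := 0.0
	for i := 0; i < 5; i++ {
		p = updatePriority(p, 16.0) // constant usage of 16 units
	}
	fmt.Println(p)                         // converges toward 16: 8, 12, 14, 15, 15.5
	fmt.Println(effectivePriority(2.0, p)) // a factor of 2 doubles the effective value
}
```

With constant usage, the priority halves its distance to the usage on every step, so it converges geometrically to the queue's steady-state consumption.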

## Proposed design
![Diagram](./batch-api.svg)

### Cluster Executor
The cluster executor is a component running on each Kubernetes cluster. It keeps all pod and node information in memory and manages jobs within the cluster.
It proactively reports the current state of the cluster and asks for jobs to run.
-Queue Usage is recorded in database and used to update priorities of individual queues.
The executor can also refuse to execute an assigned job and return it.

### Armada server
The Armada server is the central component which manages the queues of jobs.
It stores all active jobs in the job database (the current implementation uses Redis).

#### Accounting
The executor periodically reports resource usage details to the Armada server.
Usage is recorded in the database and used to update the priorities of individual queues.

#### Job Leasing
-Executor periodically ask server for jobs to run reporting available resources. Armada distributes these available resources among queues according to queue current priority.
+The executor periodically asks the server for jobs to run, reporting available resources. Armada distributes these available resources among queues according to Queue Effective Priority.
Jobs are taken from the top of each queue until the available resources are filled. These jobs are then returned to the executor to be executed on the cluster and marked as Leased, with a timestamp showing when the lease began.
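The "take jobs from the top of the queue until the available resources are filled" step can be sketched like this, assuming a single CPU dimension; the `Job` type and `leaseJobs` function are illustrative stand-ins, not names from the Armada codebase.

```go
package main

import "fmt"

// Job is a minimal stand-in for an Armada job; only its resource demand matters here.
type Job struct {
	ID   string
	CPUs int
}

// leaseJobs takes jobs from the top of a queue until the reported available
// resources are filled, returning the slice of jobs to be leased.
func leaseJobs(queue []Job, availableCPUs int) []Job {
	var leased []Job
	for _, job := range queue {
		if job.CPUs > availableCPUs {
			break // the next job does not fit into what is left
		}
		availableCPUs -= job.CPUs
		leased = append(leased, job)
	}
	return leased
}

func main() {
	queue := []Job{{"a", 2}, {"b", 3}, {"c", 4}}
	leased := leaseJobs(queue, 6)
	fmt.Println(len(leased)) // jobs "a" and "b" fit into 6 CPUs; "c" does not
}
```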

The executor must regularly renew the lease of all jobs it leases; otherwise the leases expire, and the jobs are considered failed and are executed on a different cluster.
79 changes: 33 additions & 46 deletions docs/developer.md
@@ -6,7 +6,7 @@
* [gRPC](#grpc)
* [Command line tools](#command-line-tools)

-# Getting Started
+# Getting started

There are many ways to set up your local environment; this is just a basic, quick example of how to set up everything you'll need to get started running and developing Armada.

@@ -17,74 +17,61 @@ To follow this section I am assuming you have:
* Docker installed (ideally in the sudo group)
* This repository cloned. The guide will assume you are in the root directory of this repository

-#### Steps
-
-1. Set up a Kubernetes cluster, this can be a local instance such as Kind (https://github.com/kubernetes-sigs/kind)
-   * For Kind simply run `GO111MODULE="on" go get sigs.k8s.io/kind@v0.5.1 && kind create cluster`
-2. Put the Kubernetes config file where your kubectl can find it: $HOME/.kube/config, or set the env variable KUBECONFIG=/config/file/location
-   * If using Kind, you can find the config file location by running the command: `kind get kubeconfig-path`
-3. Start redis with default values: `docker run -d --expose=6379 --network=host redis`
-   * You may need to run this as sudo
-4. In separate terminals run:
-   * `go run ./cmd/armada/main.go`
-   * `go run ./cmd/executor/main.go`
-
-You now have Armada set up and can submit jobs to it, see [here](usage.md#submitting-jobs).
-
-Likely you'll want to run the last steps via an IDE to make developing easier, so you can benefit from debug features etc.
-
-#### Multi cluster Kind
+### Running Armada locally

It is possible to develop Armada locally with [kind](https://github.com/kubernetes-sigs/kind) Kubernetes clusters.

1. Get kind
```bash
# Download Kind
go get sigs.k8s.io/kind
-# create 2 clusters
```
2. Create kind clusters (you can create any number of clusters)
```bash
kind create cluster --name demoA --config ./example/kind-config.yaml
kind create cluster --name demoB --config ./example/kind-config.yaml
-# run armada
```
3. Start Redis
```bash
docker run -d --expose=6379 --network=host redis
```
4. Start server in one terminal
```bash
go run ./cmd/armada/main.go
-# run executors for each cluster
```
5. Start executors for each cluster, each in a separate terminal
```bash
KUBECONFIG=$(kind get kubeconfig-path --name="demoA") ARMADA_APPLICATION_CLUSTERID=demoA go run ./cmd/executor/main.go
KUBECONFIG=$(kind get kubeconfig-path --name="demoB") ARMADA_APPLICATION_CLUSTERID=demoB go run ./cmd/executor/main.go
```

-Depending on your docker setup you might need to load images for jobs you plan to run manually
-```
-kind load docker-image busybox:latest
-```
-
-#### Using Armada locally
-
-The most basic example would be:
-
-```
-# Create queue
-go run ./cmd/armadactl/main.go create-queue test 1
-
-# Submit example job
-go run ./cmd/armadactl/main.go submit ./example/jobs.yaml
-
-# Watch events of example job
-go run ./cmd/armadactl/main.go watch job-set-1
-```
+6. Create queue & Submit job
+```bash
+go run ./cmd/armadactl/main.go create-queue test 1
+go run ./cmd/armadactl/main.go submit ./example/jobs.yaml
+go run ./cmd/armadactl/main.go watch job-set-1
+```

For more details on submitting jobs to Armada, see [here](usage.md#submitting-jobs).

Once you submit jobs, you should be able to see pods appearing in your cluster(s), running what you submitted.


-#### Troubleshooting
**Note:** Depending on your docker setup, you might need to manually load images for the jobs you plan to run
```bash
kind load docker-image busybox:latest
```

-* If the executor component is failing to contact kubernetes
-    * Make sure your config file is placed in the correct place
-    * You can test it by checking you can use Kubectl to access your cluster. The executor should be looking in the same place as Kubectl
### Running tests
For unit tests, run:
```bash
make tests
```

For end to end tests run:
```bash
make tests-e2e
# optionally stop kubernetes cluster which was started by test
make e2e-stop-cluster
```

## Code Generation

19 changes: 0 additions & 19 deletions docs/metrics.md

This file was deleted.
