Split everything left out of the README (#1180)
If only doc changes were included in changelog...
whatyouhide authored Nov 12, 2024
1 parent 92b027d commit 899c31c
Showing 9 changed files with 379 additions and 435 deletions.
464 changes: 41 additions & 423 deletions README.md

Large diffs are not rendered by default.

21 changes: 21 additions & 0 deletions guides/clustering.md
@@ -0,0 +1,21 @@
# Clustering

Oban supports running in clusters of nodes. It supports both nodes that are connected to each
other (via *distributed Erlang*), as well as nodes that are not connected to each other but that
communicate via the database's pub/sub mechanism.

Usually, scheduled job management operates in **global mode** and notifies queues of available
jobs via pub/sub to minimize database load. However, when pub/sub isn't available, staging
switches to a **local mode** where each queue polls independently.

Local mode is less efficient and will only happen if you're running in an environment where
neither Postgres notifications (`LISTEN`/`NOTIFY`) nor PG (Erlang process group) notifications
work. That situation should be rare and limited to the following conditions:

1. Running with a connection pooler, like [pg_bouncer], in transaction mode.
2. Running without clustering, that is, without *distributed Erlang*.

If **both** of those criteria apply and pub/sub notifications won't work, then
staging will switch to polling in local mode.
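
As a hedged illustration of avoiding local mode: if only the first condition applies (a pooler in transaction mode, but the nodes *are* connected via distributed Erlang), pub/sub can run over process groups instead of the database. The sketch below assumes the `Oban.Notifiers.PG` notifier; `:my_app` and `MyApp.Repo` are placeholder names.

```elixir
# config/config.exs (sketch; :my_app and MyApp.Repo are placeholders)
import Config

config :my_app, Oban,
  repo: MyApp.Repo,
  # Use Erlang process groups for pub/sub instead of Postgres LISTEN/NOTIFY.
  # This only helps when the nodes are connected via distributed Erlang.
  notifier: Oban.Notifiers.PG,
  queues: [default: 10]
```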

[pg_bouncer]: http://www.pgbouncer.org
51 changes: 51 additions & 0 deletions guides/configuration.md
@@ -0,0 +1,51 @@
# Configuration

This page details generic configuration options.

## Configuring Queues

You can define queues as a keyword list where the key is the name of the queue and the value is
the maximum number of concurrent jobs. The following configuration would start four queues with
concurrency ranging from 5 to 50:

```elixir
config :my_app, Oban,
  queues: [default: 10, mailers: 20, events: 50, media: 5],
  repo: MyApp.Repo
```

You may also use an expanded form to configure queues with individual overrides:

```elixir
queues: [
  default: 10,
  events: [limit: 50, paused: true]
]
```

The `events` queue will now start in a paused state, which means it won't process anything until
`Oban.resume_queue/2` is called to start it.
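
A minimal sketch of resuming the paused queue at runtime. It assumes an Oban instance is already running under the default name (`Oban`) with the `events` queue configured as above, so it is a fragment to run inside your application rather than a standalone script.

```elixir
# Sketch: requires a running Oban instance registered under the default name.

# Start processing jobs in the paused :events queue.
Oban.resume_queue(queue: :events)

# Pause the queue again later if needed.
Oban.pause_queue(queue: :events)
```

By default these calls are broadcast to every node running the queue; pass `local_only: true` to affect only the current node.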

There isn't a limit to the number of queues or how many jobs may execute
concurrently in each queue. Some additional guidelines:

* Each queue will run as many jobs as possible concurrently, up to the configured limit. Make
sure your system has enough resources (such as *database connections*) to handle the concurrent
load.

* Queue limits are **local** (per-node), not global (per-cluster). For example, running a queue
with a local limit of `2` on three separate nodes is effectively a global limit of *six
concurrent jobs*. If you require a global limit, you must restrict the number of nodes running a
particular queue or consider Oban Pro's [Smart Engine][smart], which can manage global
concurrency *automatically*!

* Only jobs in the configured queues will execute. Jobs in any other queue will
stay in the database untouched.

* Pay attention to the number of concurrent jobs making expensive system calls (such as calls to
resource-intensive tools like [FFMpeg][ffmpeg] or [ImageMagick][imagemagick]). The BEAM ensures
that the system stays responsive under load, but those guarantees don't apply when using ports
or shelling out commands.
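
The arithmetic behind the local-versus-global point can be sketched in plain Elixir. The `QueueMath` module is a hypothetical helper for illustration only and is not part of Oban.

```elixir
defmodule QueueMath do
  @moduledoc "Illustration only; not part of Oban."

  # With per-node limits, the effective cluster-wide limit is the local
  # limit multiplied by the number of nodes running that queue.
  def effective_global_limit(local_limit, node_count)
      when is_integer(local_limit) and is_integer(node_count) do
    local_limit * node_count
  end
end

# A local limit of 2 on three nodes allows up to six concurrent jobs.
IO.inspect(QueueMath.effective_global_limit(2, 3))
# prints 6
```

The same calculation is worth doing against your database pool size, since every concurrently executing job may need a connection.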

[ffmpeg]: https://www.ffmpeg.org
[imagemagick]: https://imagemagick.org/index.php
[smart]: https://oban.pro/docs/pro/Oban.Pro.Engines.Smart.html
136 changes: 136 additions & 0 deletions guides/job_uniqueness.md
@@ -0,0 +1,136 @@
# Job Uniqueness

The *uniqueness* of a job is a somewhat complex topic. This guide is here to help you understand its complexities!

The unique jobs feature allows you to specify constraints to prevent *enqueuing* duplicate jobs.
These constraints only apply when jobs are inserted. Uniqueness has no bearing on whether jobs
are *executed* concurrently.

Uniqueness is determined by a combination of job attributes, configured through the following
options:

* `:period` — The number of seconds until a job is no longer considered duplicate. You should
always specify a period, otherwise Oban will default to 60 seconds. `:infinity` can be used to
indicate the job be considered a duplicate as long as jobs are retained (see
`Oban.Plugins.Pruner`).

* `:fields` — The fields to compare when evaluating uniqueness. The available fields are
`:args`, `:queue`, `:worker`, and `:meta`. `:fields` defaults to `[:worker, :queue, :args]`.
It's recommended that you leave the default `:fields`, otherwise you risk unexpected conflicts
between unrelated jobs.

* `:keys` — A specific subset of the `:args` or `:meta` to consider when comparing against
historic jobs. This allows a job with multiple key/value pairs in its arguments to be compared
using only a subset of them.

* `:states` — The job states that are checked for duplicates. The available states are
described in `t:Oban.Job.unique_state/0`. By default, Oban checks all states except for
`:discarded` and `:cancelled`, which prevents duplicates even if the previous job has been
completed.

* `:timestamp` — Which job timestamp to check the period against. The available timestamps are
`:inserted_at` or `:scheduled_at`. Defaults to `:inserted_at` for legacy reasons.

The simplest form of uniqueness will configure uniqueness for as long as a matching job exists in
the database, regardless of state:

```elixir
use Oban.Worker, unique: true
```

Here's a more complex example which uses multiple options:

```elixir
use Oban.Worker,
  unique: [
    # Jobs should be unique for 2 minutes...
    period: {2, :minutes},
    # ...after being scheduled, not inserted
    timestamp: :scheduled_at,
    # Don't consider the whole :args field, but just the :url field within :args
    keys: [:url],
    # Consider a job unique across all states, including :cancelled/:discarded
    states: Oban.Job.states(),
    # Consider a job unique across queues; only compare the :url key within
    # the :args, as per the :keys configuration above
    fields: [:worker, :args]
  ]
```
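
To make the effect of `:keys` concrete, here is a conceptual sketch in plain Elixir. It only illustrates the idea of comparing a subset of the arguments; `UniqueKeysSketch` is a hypothetical module and does not reflect how Oban implements the check internally.

```elixir
defmodule UniqueKeysSketch do
  @moduledoc "Illustration only; not Oban's implementation."

  # Two sets of args are considered equal when the selected keys match,
  # regardless of any other key/value pairs they contain.
  def same_under_keys?(args_a, args_b, keys) do
    Map.take(args_a, keys) == Map.take(args_b, keys)
  end
end

a = %{url: "https://example.com", attempt_id: 1}
b = %{url: "https://example.com", attempt_id: 2}

# Comparing only :url treats the two jobs as duplicates.
IO.inspect(UniqueKeysSketch.same_under_keys?(a, b, [:url]))
# prints true

# Comparing every key does not.
IO.inspect(UniqueKeysSketch.same_under_keys?(a, b, [:url, :attempt_id]))
# prints false
```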

## Detecting Unique Conflicts

When unique settings match an existing job, the return value of `Oban.insert/2` is still `{:ok,
job}`. However, you can detect a unique conflict by checking the job's `:conflict?` field. If
there was an existing job, the field is `true`; otherwise it is `false`.

You can use the `:conflict?` field to customize responses after insert:

```elixir
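alias Oban.Job

case Oban.insert(changeset) do
  # A conflict without an id means the unique lock could not be acquired
  {:ok, %Job{id: nil, conflict?: true}} ->
    {:error, :failed_to_acquire_lock}

  # A conflict with an id means a matching job already exists
  {:ok, %Job{conflict?: true}} ->
    {:error, :job_already_exists}

  result ->
    result
end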
```

> #### Caveat with `insert_all` {: .warning}
>
> Unless you are using Oban Pro's [Smart Engine][pro-smart-engine], Oban only detects conflicts
> for jobs enqueued through [`Oban.insert/2,3`](`Oban.insert/2`). When using the [Basic
> Engine](`Oban.Engines.Basic`), jobs enqueued through `Oban.insert_all/2` *do not* use per-job
> unique configuration.

## Replacing Values

In addition to detecting unique conflicts, passing options to `:replace` can update any job field
when there is a conflict. Any of the following fields can be replaced per *state*:

* `:args`
* `:max_attempts`
* `:meta`
* `:priority`
* `:queue`
* `:scheduled_at`
* `:tags`
* `:worker`

For example, to change the `:priority` and increase `:max_attempts` when there is a conflict with
a job in a `:scheduled` state:

```elixir
BusinessWorker.new(
  args,
  max_attempts: 5,
  priority: 0,
  replace: [scheduled: [:max_attempts, :priority]]
)
```

Another example is bumping the scheduled time on conflict. Either `:scheduled_at` or
`:schedule_in` values will work, but the replace option is always `:scheduled_at`.

```elixir
UrgentWorker.new(args, schedule_in: 1, replace: [scheduled: [:scheduled_at]])
```

> #### Jobs in the `:executing` State {: .error}
>
> If you use this feature to replace a field (such as `:args`) in the `:executing` state by doing
> something like
>
> ```elixir
> UniqueWorker.new(new_args, replace: [executing: [:args]])
> ```
>
> then Oban will update `:args`, but the job will continue executing with the original value.

## Unique Guarantees

Oban **strives** for uniqueness of jobs through transactional locks and database queries.
Uniqueness *does not* rely on unique constraints in the database, which leaves it prone to race
conditions in some circumstances.

[pro-smart-engine]: https://oban.pro/docs/pro/Oban.Pro.Engines.Smart.html
28 changes: 28 additions & 0 deletions guides/operational_maintenance.md
@@ -0,0 +1,28 @@
# Operational Maintenance

This guide walks you through *maintaining* a production Oban setup from an operational
perspective.

## Pruning Historic Jobs

Job stats and queue introspection are built on *keeping job rows in the database* after they have
completed. This allows administrators to review completed jobs and build informative aggregates,
at the expense of storage and an unbounded table size. To prevent the `oban_jobs` table from
growing indefinitely, Oban actively prunes `completed`, `cancelled`, and `discarded` jobs.

By default, the [pruner plugin](`Oban.Plugins.Pruner`) retains jobs for 60 seconds. You can
configure a longer retention period by providing a `:max_age` in seconds to the pruner plugin.

```elixir
config :my_app, Oban,
  plugins: [{Oban.Plugins.Pruner, max_age: _5_minutes_in_seconds = 300}],
  # ...
```
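
The pruning rule described above (only finished states, only beyond `:max_age`) can be sketched as a plain predicate. This is a conceptual illustration, not the pruner's implementation: `PrunerSketch` and the single `finished_at` field are hypothetical simplifications.

```elixir
defmodule PrunerSketch do
  @moduledoc "Illustration only; not how Oban.Plugins.Pruner is implemented."

  @prunable_states ~w(completed cancelled discarded)

  # A job is eligible for pruning when it is in a finished state and it
  # finished more than `max_age` seconds before `now`.
  # `finished_at` is a hypothetical field used to keep the sketch small.
  def prunable?(%{state: state, finished_at: finished_at}, max_age, now) do
    state in @prunable_states and DateTime.diff(now, finished_at, :second) > max_age
  end
end

now = ~U[2024-11-12 12:00:00Z]
old_job = %{state: "completed", finished_at: ~U[2024-11-12 11:50:00Z]}
retrying = %{state: "retryable", finished_at: ~U[2024-11-12 11:50:00Z]}

IO.inspect(PrunerSketch.prunable?(old_job, 300, now))
# prints true

IO.inspect(PrunerSketch.prunable?(retrying, 300, now))
# prints false
```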

> #### Caveats & Guidelines {: .info}
>
> * Pruning is best-effort and performed out-of-band. This means that all limits are soft; jobs
> beyond a specified age may not be pruned immediately after jobs complete.
>
> * Pruning is only applied to jobs that are `completed`, `cancelled`, or `discarded`. It'll never
> delete a new job, a scheduled job, or a job that failed and will be retried.
29 changes: 29 additions & 0 deletions guides/scheduling_jobs.md
@@ -0,0 +1,29 @@
# Scheduling Jobs

You can **schedule** jobs down to the second any time in the future:

```elixir
%{id: 1}
|> MyApp.Business.new(schedule_in: _seconds = 5)
|> Oban.insert()
```

Jobs may also be scheduled at a *specific timestamp* in the future:

```elixir
%{id: 1}
|> MyApp.Business.new(scheduled_at: ~U[2020-12-25 19:00:56.0Z])
|> Oban.insert()
```

Scheduling is *always* in UTC. You'll have to shift timestamps in other zones to UTC before
scheduling:

```elixir
%{id: 1}
|> MyApp.Business.new(scheduled_at: DateTime.shift_zone!(datetime, "Etc/UTC"))
|> Oban.insert()
```
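
Note that `DateTime.shift_zone!/2` needs a time zone database (such as the `tzdata` package) to handle zones other than `Etc/UTC`. When the timestamp arrives as an ISO 8601 string with an offset, the standard library can normalize it to UTC without one. A small sketch, using an example timestamp chosen for illustration:

```elixir
# DateTime.from_iso8601/1 returns the datetime converted to UTC, together
# with the original offset in seconds.
{:ok, utc_datetime, offset_seconds} =
  DateTime.from_iso8601("2020-12-25T14:00:56-05:00")

IO.inspect(utc_datetime)
# prints ~U[2020-12-25 19:00:56Z]

IO.inspect(offset_seconds)
# prints -18000
```

The resulting `utc_datetime` can then be passed as the `:scheduled_at` option, as in the examples above.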

Scheduling works across a cluster of nodes. See the [*Clustering* guide](clustering.html) for more
information.
5 changes: 3 additions & 2 deletions lib/oban/plugins/pruner.ex
@@ -1,9 +1,10 @@
 defmodule Oban.Plugins.Pruner do
 @moduledoc """
-Periodically delete `completed`, `cancelled` and `discarded` jobs based on their age.
+Periodically delete `completed`, `cancelled`, and `discarded` jobs based on their age.
 Pruning is critical for maintaining table size and continued responsive job processing. It
-is recommended for all production applications.
+is recommended for all production applications. See also the
+[*Operational Maintenance* guide](operational_maintenance.html).
 > #### 🌟 DynamicPruner {: .info}
 >