Split everything left out of the README (#1180)

If only doc changes were included in changelog...
oban-bg · Nov 12, 2024 · 899c31c
1 parent 92b027d
commit 899c31c
Showing 9 changed files with 379 additions and 435 deletions.
diff --git a/README.md b/README.md
diff --git a/guides/clustering.md b/guides/clustering.md
@@ -0,0 +1,21 @@
+# Clustering
+
+Oban supports running in clusters of nodes. It supports both nodes that are connected to each
+other (via *distributed Erlang*), as well as nodes that are not connected to each other but that
+communicate via the database's pub/sub mechanism.
+
+Usually, scheduled job management operates in **global mode** and notifies queues of available
+jobs via pub/sub to minimize database load. However, when pub/sub isn't available, staging
+switches to a **local mode** where each queue polls independently.
+
+Local mode is less efficient and only happens in an environment where neither Postgres
+(`LISTEN/NOTIFY`) notifications nor PG (Erlang process group) notifications work. That situation
+should be rare and limited to the following conditions:
+
+  1. Running with a connection pooler, like [PgBouncer][pg_bouncer], in transaction mode.
+  2. Running without clustering, that is, without *distributed Erlang*.
+
+If **both** of those conditions apply, pub/sub notifications won't work and staging will switch
+to polling in local mode.
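+
+As a minimal sketch of picking a notifier for a clustered deployment: the fragment below assumes
+the nodes are already connected via distributed Erlang and that the application and repo are
+named `:my_app` and `MyApp.Repo` (both illustrative). It swaps the default Postgres notifier for
+`Oban.Notifiers.PG`, which avoids relying on `LISTEN/NOTIFY` behind a transaction pooler:
+
+```elixir
+# config/config.exs (illustrative names; adjust to your application)
+config :my_app, Oban,
+  repo: MyApp.Repo,
+  # Use Erlang process groups for pub/sub instead of Postgres LISTEN/NOTIFY.
+  # Only works when nodes are connected through distributed Erlang.
+  notifier: Oban.Notifiers.PG,
+  queues: [default: 10]
+```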
+
+[pg_bouncer]: http://www.pgbouncer.org
diff --git a/guides/configuration.md b/guides/configuration.md
@@ -0,0 +1,51 @@
+# Configuration
+
+This page details generic configuration options.
+
+## Configuring Queues
+
+You can define queues as a keyword list where the key is the name of the queue and the value is
+the maximum number of concurrent jobs. The following configuration would start four queues with
+concurrency ranging from 5 to 50:
+
+```elixir
+config :my_app, Oban,
+  queues: [default: 10, mailers: 20, events: 50, media: 5],
+  repo: MyApp.Repo
+```
+
+You may also use an expanded form to configure queues with individual overrides:
+
+```elixir
+queues: [
+  default: 10,
+  events: [limit: 50, paused: true]
+]
+```
+
+The `events` queue will now start in a paused state, which means it won't process anything until
+`Oban.resume_queue/2` is called to start it.
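+
+A short sketch of resuming the paused queue at runtime, assuming the default Oban instance and
+the `events` queue from the configuration above:
+
+```elixir
+# Start processing jobs in the paused :events queue
+Oban.resume_queue(queue: :events)
+
+# Pause it again later, for example ahead of a maintenance window
+Oban.pause_queue(queue: :events)
+```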
+
+There isn't a limit to the number of queues or how many jobs may execute
+concurrently in each queue. Some additional guidelines:
+
+  * Each queue will run as many jobs as possible concurrently, up to the configured limit. Make
+  sure your system has enough resources (such as *database connections*) to handle the concurrent
+  load.
+
+  * Queue limits are **local** (per-node), not global (per-cluster). For example, running a queue
+  with a local limit of `2` on three separate nodes is effectively a global limit of *six
+  concurrent jobs*. If you require a global limit, you must restrict the number of nodes running a
+  particular queue or consider Oban Pro's [Smart Engine][smart], which can manage global
+  concurrency *automatically*!
+
+  * Only jobs in the configured queues will execute. Jobs in any other queue will
+  stay in the database untouched.
+
+  * Pay attention to the number of concurrent jobs making expensive system calls (such as calls to
+  resource-intensive tools like [FFmpeg][ffmpeg] or [ImageMagick][imagemagick]). The BEAM ensures
+  that the system stays responsive under load, but those guarantees don't apply when using ports
+  or shelling out commands.
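+
+To make the per-node arithmetic concrete, here is a rough sketch that totals the configured
+limits and multiplies by a node count. The `node_count` value is a made-up assumption, and the
+code is plain Elixir rather than part of Oban's API:
+
+```elixir
+# Queue configuration in both the short and the expanded form
+queues = [default: 10, mailers: 20, events: [limit: 50, paused: true], media: 5]
+
+# Hypothetical number of nodes that all run the same queues
+node_count = 3
+
+per_node =
+  queues
+  |> Enum.map(fn
+    {_queue, limit} when is_integer(limit) -> limit
+    {_queue, opts} when is_list(opts) -> Keyword.fetch!(opts, :limit)
+  end)
+  |> Enum.sum()
+
+# Upper bound on concurrent jobs for a single node: 85
+IO.inspect(per_node, label: "max concurrent jobs per node")
+
+# Upper bound across the whole cluster, since limits are local: 255
+IO.inspect(per_node * node_count, label: "max concurrent jobs across cluster")
+```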
+
+[ffmpeg]: https://www.ffmpeg.org
+[imagemagick]: https://imagemagick.org/index.php
+[smart]: https://oban.pro/docs/pro/Oban.Pro.Engines.Smart.html
diff --git a/guides/job_uniqueness.md b/guides/job_uniqueness.md
@@ -0,0 +1,136 @@
+# Job Uniqueness
+
+The *uniqueness* of a job is a somewhat complex topic. This guide is here to help you understand
+its complexities!
+
+The unique jobs feature allows you to specify constraints to prevent *enqueuing* duplicate jobs.
+These constraints only apply when jobs are inserted. Uniqueness has no bearing on whether jobs
+are *executed* concurrently.
+
+Uniqueness is determined by a combination of job attributes, controlled by the following options:
+
+  * `:period` — The number of seconds until a job is no longer considered duplicate. You should
+    always specify a period, otherwise Oban will default to 60 seconds. `:infinity` can be used to
+    indicate the job be considered a duplicate as long as jobs are retained (see
+    `Oban.Plugins.Pruner`).
+
+  * `:fields` — The fields to compare when evaluating uniqueness. The available fields are
+    `:args`, `:queue`, `:worker`, and `:meta`. `:fields` defaults to `[:worker, :queue, :args]`.
+    It's recommended that you leave the default `:fields`, otherwise you risk unexpected conflicts
+    between unrelated jobs.
+
+  * `:keys` — A specific subset of the `:args` or `:meta` to consider when comparing against
+    historic jobs. This allows a job with multiple key/value pairs in its arguments to be compared
+    using only a subset of them.
+
+  * `:states` — The job states that are checked for duplicates. The available states are
+    described in `t:Oban.Job.unique_state/0`. By default, Oban checks all states except for
+    `:discarded` and `:cancelled`, which prevents duplicates even if the previous job has been
+    completed.
+
+  * `:timestamp` — Which job timestamp to check the period against. The available timestamps are
+    `:inserted_at` or `:scheduled_at`. Defaults to `:inserted_at` for legacy reasons.
+
+The simplest form of uniqueness will configure uniqueness for as long as a matching job exists in
+the database, regardless of state:
+
+```elixir
+use Oban.Worker, unique: true
+```
+
+Here's a more complex example which uses multiple options:
+
+```elixir
+use Oban.Worker,
+  unique: [
+    # Jobs should be unique for 2 minutes...
+    period: {2, :minutes},
+    # ...after being scheduled, not inserted
+    timestamp: :scheduled_at,
+    # Don't consider the whole :args field, but just the :url field within :args
+    keys: [:url],
+    # Consider a job unique across all states, including :cancelled/:discarded
+    states: Oban.Job.states(),
+    # Consider a job unique across queues; only compare the :url key within
+    # the :args, as per the :keys configuration above
+    fields: [:worker, :args]
+  ]
+```
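+
+For context, a minimal sketch of how such options sit inside a complete worker. The module name,
+queue, and `perform/1` body are hypothetical, and the period is given as a plain number of
+seconds:
+
+```elixir
+defmodule MyApp.CrawlWorker do
+  # Hypothetical worker: unique per :url for two minutes
+  use Oban.Worker,
+    queue: :default,
+    unique: [period: 120, keys: [:url]]
+
+  @impl Oban.Worker
+  def perform(%Oban.Job{args: %{"url" => url}}) do
+    # Placeholder for real work; args arrive with string keys
+    IO.puts("crawling #{url}")
+    :ok
+  end
+end
+
+# Inserting the same url twice within the period is expected to enqueue only one job;
+# the second call still returns {:ok, job}
+{:ok, _job} = %{url: "https://example.com"} |> MyApp.CrawlWorker.new() |> Oban.insert()
+{:ok, _dup} = %{url: "https://example.com"} |> MyApp.CrawlWorker.new() |> Oban.insert()
+```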
+
+## Detecting Unique Conflicts
+
+When unique settings match an existing job, the return value of `Oban.insert/2` is still `{:ok,
+job}`. However, you can detect a unique conflict by checking the job's `:conflict?` field. If
+there was an existing job, the field is `true`; otherwise it is `false`.
+
+You can use the `:conflict?` field to customize responses after insert:
+
+```elixir
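+case Oban.insert(changeset) do
+  # No id was assigned and a conflict was flagged: the unique lock wasn't acquired
+  {:ok, %Oban.Job{id: nil, conflict?: true}} ->
+    {:error, :failed_to_acquire_lock}
+
+  # A matching job already exists; the returned struct is the existing job
+  {:ok, %Oban.Job{conflict?: true}} ->
+    {:error, :job_already_exists}
+
+  # Either a newly inserted job or an error tuple; pass it through unchanged
+  result ->
+    result
+end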
+```
+
+> #### Caveat with `insert_all` {: .warning}
+>
+> Unless you are using Oban Pro's [Smart Engine][pro-smart-engine], Oban only detects conflicts
+> for jobs enqueued through [`Oban.insert/2,3`](`Oban.insert/2`). When using the [Basic
+> Engine](`Oban.Engines.Basic`), jobs enqueued through `Oban.insert_all/2` *do not* use per-job
+> unique configuration.
+
+## Replacing Values
+
+In addition to detecting unique conflicts, passing options to `:replace` can update any job field
+when there is a conflict. Any of the following fields can be replaced per *state*:
+
+  * `:args`
+  * `:max_attempts`
+  * `:meta`
+  * `:priority`
+  * `:queue`
+  * `:scheduled_at`
+  * `:tags`
+  * `:worker`
+
+For example, to change the `:priority` and increase `:max_attempts` when there is a conflict with
+a job in a `:scheduled` state:
+
+```elixir
+BusinessWorker.new(
+  args,
+  max_attempts: 5,
+  priority: 0,
+  replace: [scheduled: [:max_attempts, :priority]]
+)
+```
+
+Another example is bumping the scheduled time on conflict. Either `:scheduled_at` or
+`:schedule_in` values will work, but the replace option is always `:scheduled_at`.
+
+```elixir
+UrgentWorker.new(args, schedule_in: 1, replace: [scheduled: [:scheduled_at]])
+```
+
+> #### Jobs in the `:executing` State {: .error}
+>
+> If you use this feature to replace a field (such as `:args`) in the `:executing` state by doing
+> something like
+>
+> ```elixir
+> UniqueWorker.new(new_args, replace: [executing: [:args]])
+> ```
+>
+> then Oban will update `:args`, but the job will continue executing with the original value.
+
+## Unique Guarantees
+
+Oban **strives** for uniqueness of jobs through transactional locks and database queries.
+Uniqueness *does not* rely on unique constraints in the database, which leaves it prone to race
+conditions in some circumstances.
+
+[pro-smart-engine]: https://oban.pro/docs/pro/Oban.Pro.Engines.Smart.html
diff --git a/guides/operational_maintenance.md b/guides/operational_maintenance.md
@@ -0,0 +1,28 @@
+# Operational Maintenance
+
+This guide walks you through *maintaining* a production Oban setup from an operational
+perspective.
+
+## Pruning Historic Jobs
+
+Job stats and queue introspection are built on *keeping job rows in the database* after they have
+completed. This allows administrators to review completed jobs and build informative aggregates,
+at the expense of storage and an unbounded table size. To prevent the `oban_jobs` table from
+growing indefinitely, Oban actively prunes `completed`, `cancelled`, and `discarded` jobs.
+
+By default, the [pruner plugin](`Oban.Plugins.Pruner`) retains jobs for 60 seconds. You can
+configure a longer retention period by providing a `:max_age` in seconds to the pruner plugin.
+
+```elixir
+config :my_app, Oban,
+  plugins: [{Oban.Plugins.Pruner, max_age: _5_minutes_in_seconds = 300}],
+  # ...
+```
+
+> #### Caveats & Guidelines {: .info}
+>
+> * Pruning is best-effort and performed out-of-band. This means that all limits are soft; jobs
+> beyond a specified age may not be pruned immediately after jobs complete.
+>
+> * Pruning is only applied to jobs that are `completed`, `cancelled`, or `discarded`. It'll never
+> delete a new job, a scheduled job, or a job that failed and will be retried.
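+
+As an illustration of a longer retention window, the sketch below keeps finished jobs for one
+week. The application and repo names are placeholders:
+
+```elixir
+config :my_app, Oban,
+  repo: MyApp.Repo,
+  plugins: [
+    # :max_age is in seconds: 60 * 60 * 24 * 7 is seven days
+    {Oban.Plugins.Pruner, max_age: 60 * 60 * 24 * 7}
+  ]
+```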
diff --git a/guides/scheduling_jobs.md b/guides/scheduling_jobs.md
@@ -0,0 +1,29 @@
+# Scheduling Jobs
+
+You can **schedule** jobs down to the second any time in the future:
+
+```elixir
+%{id: 1}
+|> MyApp.Business.new(schedule_in: _seconds = 5)
+|> Oban.insert()
+```
+
+Jobs may also be scheduled at a *specific timestamp* in the future:
+
+```elixir
+%{id: 1}
+|> MyApp.Business.new(scheduled_at: ~U[2020-12-25 19:00:56.0Z])
+|> Oban.insert()
+```
+
+Scheduling is *always* in UTC. You'll have to shift timestamps in other zones to UTC before
+scheduling:
+
+```elixir
+%{id: 1}
+|> MyApp.Business.new(scheduled_at: DateTime.shift_zone!(datetime, "Etc/UTC"))
+|> Oban.insert()
+```
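+
+As a rough sketch of where such a `datetime` might come from: the example below builds a zoned
+timestamp and shifts it to UTC. It assumes a time zone database (such as the `tzdata` or `tz`
+package) is configured for Elixir, since the standard library only knows `Etc/UTC` by default.
+The zone and date are arbitrary:
+
+```elixir
+# Requires a configured time zone database, otherwise DateTime.new! raises
+datetime = DateTime.new!(~D[2030-12-25], ~T[09:00:00], "America/Chicago")
+
+%{id: 1}
+|> MyApp.Business.new(scheduled_at: DateTime.shift_zone!(datetime, "Etc/UTC"))
+|> Oban.insert()
+```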
+
+Scheduling works across a cluster of nodes. See the [*Clustering* guide](clustering.html) for more
+information.
diff --git a/lib/oban/plugins/pruner.ex b/lib/oban/plugins/pruner.ex
@@ -1,9 +1,10 @@
 defmodule Oban.Plugins.Pruner do
   @moduledoc """
-  Periodically delete `completed`, `cancelled` and `discarded` jobs based on their age.
+  Periodically delete `completed`, `cancelled`, and `discarded` jobs based on their age.
 
   Pruning is critical for maintaining table size and continued responsive job processing. It
-  is recommended for all production applications.
+  is recommended for all production applications. See also the
+  [*Operational Maintenance* guide](operational_maintenance.html).
 
   > #### 🌟 DynamicPruner {: .info}
   >