docs: Update Loki components page with new bloom components (#14169)
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
(cherry picked from commit 50ad3df)
chaudum authored and grafana-delivery-bot[bot] committed Sep 19, 2024
1 parent fb9ef6c commit 5b27947
Showing 1 changed file with 37 additions and 23 deletions.
60 changes: 37 additions & 23 deletions docs/sources/get-started/components.md
@@ -22,7 +22,8 @@ For more information see [Deployment modes]({{< relref "./deployment-modes" >}})
| [Index Gateway](#index-gateway) | x | | | | x |
| [Compactor](#compactor) | x | x | | | x |
| [Ruler](#ruler) | x | x | | | x |
| [Bloom Compactor (Experimental)](#bloom-compactor) | x | | | | x |
| [Bloom Planner (Experimental)](#bloom-planner) | x | | | | x |
| [Bloom Builder (Experimental)](#bloom-builder) | x | | | | x |
| [Bloom Gateway (Experimental)](#bloom-gateway) | x | | | | x |

This page describes the responsibilities of each of these components.
@@ -56,9 +57,9 @@ Currently the only way the distributor mutates incoming data is by normalizing l

The distributor can also rate limit incoming logs based on the maximum data ingest rate per tenant. It does this by checking a per-tenant limit and dividing it by the current number of distributors. This allows the rate limit to be specified per tenant at the cluster level and enables us to scale the distributors up or down and have the per-distributor limit adjust accordingly. For instance, say we have 10 distributors and tenant A has a 10MB rate limit. Each distributor will allow up to 1MB/s before limiting. Now, say another large tenant joins the cluster and we need to spin up 10 more distributors. The now 20 distributors will adjust their rate limits for tenant A to `(10MB / 20 distributors) = 500KB/s`. This is how global limits allow much simpler and safer operation of the Loki cluster.
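
The arithmetic in the paragraph above can be sketched as follows. This is a minimal illustration, not Loki's implementation: the function name `per_distributor_limit` and the decimal unit constants are assumptions made for the example.

```python
# Illustrative sketch only: names and units below are assumptions, not Loki's API.
KB = 1_000
MB = 1_000_000


def per_distributor_limit(tenant_limit_bytes_per_sec: float, num_distributors: int) -> float:
    """Split a cluster-wide per-tenant rate limit evenly across distributors."""
    if num_distributors <= 0:
        raise ValueError("at least one distributor is required")
    return tenant_limit_bytes_per_sec / num_distributors


# Tenant A has a 10MB/s limit, first with 10 distributors, then with 20.
limit_with_10 = per_distributor_limit(10 * MB, 10)
limit_with_20 = per_distributor_limit(10 * MB, 20)
```

With these inputs, `limit_with_10` equals 1MB/s and `limit_with_20` equals 500KB/s, matching the figures in the text.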

{{% admonition type="note" %}}
{{< admonition type="note" >}}
The distributor uses the `ring` component under the hood to register itself amongst its peers and get the total number of active distributors. This is a different "key" than the ingesters use in the ring and comes from the distributor's own [ring configuration](https://grafana.com/docs/loki/<LOKI_VERSION>/configure/#distributor).
{{% /admonition %}}
{{< /admonition >}}

### Forwarding

@@ -68,9 +69,9 @@ Once the distributor has performed all of its validation duties, it forwards dat

To mitigate the chance of _losing_ data on any single ingester, the distributor forwards each write to _replication factor_ ingesters. Generally, the replication factor is `3`. Replication allows for ingester restarts and rollouts without failing writes and adds protection from data loss in some scenarios. Loosely, for each label set (called a _stream_) that is pushed to a distributor, it will hash the labels and use the resulting value to look up `replication_factor` ingesters in the `ring` (which is a subcomponent that exposes a [distributed hash table](https://en.wikipedia.org/wiki/Distributed_hash_table)). It will then try to write the same data to all of them. This will generate an error if fewer than a _quorum_ of writes succeed. A quorum is defined as `floor( replication_factor / 2 ) + 1`. So, for our `replication_factor` of `3`, we require that two writes succeed. If fewer than two writes succeed, the distributor returns an error and the write operation will be retried.
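
The quorum rule above can be written out directly. This is a small sketch of the formula only; the function names are hypothetical and are not taken from the Loki code base.

```python
# Illustrative sketch of the quorum formula; function names are hypothetical.
def quorum(replication_factor: int) -> int:
    """Return floor(replication_factor / 2) + 1."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return replication_factor // 2 + 1


def write_accepted(successful_writes: int, replication_factor: int) -> bool:
    """A write is accepted only if at least a quorum of ingesters acknowledged it."""
    return successful_writes >= quorum(replication_factor)
```

For a replication factor of `3`, `quorum(3)` is `2`, so `write_accepted(2, 3)` is true while `write_accepted(1, 3)` is false and the distributor would return an error.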

{{% admonition type="caution" %}}
{{< admonition type="caution" >}}
If a write is acknowledged by 2 out of 3 ingesters, we can tolerate the loss of one ingester but not two, as this would result in data loss.
{{% /admonition %}}
{{< /admonition >}}

The replication factor is not the only thing that prevents data loss, though, and its main purpose is to allow writes to continue uninterrupted during rollouts and restarts. The [ingester component](#ingester) now includes a [write ahead log](https://en.wikipedia.org/wiki/Write-ahead_logging) (WAL) which persists incoming writes to disk to ensure they are not lost as long as the disk isn't corrupted. The complementary nature of replication factor and WAL ensures data isn't lost unless there are significant failures in both mechanisms (that is, multiple ingesters die and lose/corrupt their disks).

@@ -129,9 +130,9 @@ the hash ring. Each ingester has a state of either `PENDING`, `JOINING`,
1. `PENDING` is an Ingester's state when it is waiting for a [handoff](#handoff) from
another ingester that is `LEAVING`. This only applies for legacy deployment modes.

{{% admonition type="note" %}}
{{< admonition type="note" >}}
Handoff is deprecated behavior mainly used in stateless deployments of ingesters, which is discouraged. Instead, it's recommended using a stateful deployment model together with the [write ahead log]({{< relref "../operations/storage/wal" >}}).
{{% /admonition %}}
{{< /admonition >}}

1. `JOINING` is an Ingester's state when it is currently inserting its tokens
into the ring and initializing itself. It may receive write requests for
@@ -205,9 +206,9 @@ nanosecond timestamps:

### Handoff

{{% admonition type="warning" %}}
{{< admonition type="warning" >}}
Handoff is deprecated behavior mainly used in stateless deployments of ingesters, which is discouraged. Instead, it's recommended using a stateful deployment model together with the [write ahead log]({{< relref "../operations/storage/wal" >}}).
{{% /admonition %}}
{{< /admonition >}}

By default, when an ingester is shutting down and tries to leave the hash ring,
it will wait to see if a new ingester tries to enter before flushing and will
@@ -337,28 +338,41 @@ from the query frontend.

When running multiple rulers, they use a consistent hash ring to distribute rule groups amongst available ruler instances.

## Bloom Compactor
{{% admonition type="warning" %}}
This feature is an [experimental feature](/docs/release-life-cycle/). Engineering and on-call support is not available. No SLA is provided.
{{% /admonition %}}
## Bloom Planner
{{< admonition type="warning" >}}
This feature is an [experimental feature](/docs/release-life-cycle/). Engineering and on-call support is not available.
No SLA is provided.
{{< /admonition >}}

The Bloom Compactor service is responsible for building blooms for chunks in the object store.
The Bloom Planner service is responsible for planning the tasks for bloom creation. It runs as a singleton and provides a queue
from which the Bloom Builders pull tasks. Planning runs periodically and takes into account which blooms have already
been built for a given day and tenant, and which series still need to be added.

This service also applies bloom retention.

## Bloom Builder
{{< admonition type="warning" >}}
This feature is an [experimental feature](/docs/release-life-cycle/). Engineering and on-call support is not available.
No SLA is provided.
{{< /admonition >}}

The Bloom Builder service is responsible for processing the tasks created by the Bloom Planner.
The Bloom Builder creates bloom blocks from structured metadata of log entries.
The resulting blooms are grouped in bloom blocks spanning multiple series and chunks from a given day.
This component also builds metadata files to track which blocks are available for each series and TSDB index file.

The service is horizontally scalable. When running multiple Bloom Compactors, they use a ring to shard tenants and
distribute series fingerprints among the available Bloom Compactor instances.
The ring is also used to decide which compactor should apply blooms retention.
The service is stateless and horizontally scalable.

## Bloom Gateway
{{% admonition type="warning" %}}
This feature is an [experimental feature](/docs/release-life-cycle/). Engineering and on-call support is not available. No SLA is provided.
{{% /admonition %}}
{{< admonition type="warning" >}}
This feature is an [experimental feature](/docs/release-life-cycle/). Engineering and on-call support is not available.
No SLA is provided.
{{< /admonition >}}

The Bloom Gateway service is responsible for handling and serving chunks filtering requests.
The index gateway queries the Bloom Gateway when computing chunk references, or when computing shards for a given query.
The gateway service takes a list of chunks and a filtering expression and matches them against the blooms,
filtering out any chunks that do not match the given filter expression.
filtering out any chunks that do not match the given label filter expression.
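
To illustrate the filtering idea described above, here is a toy Bloom filter. It is a sketch under loud assumptions: the class `TinyBloom`, the helper `filter_chunks`, the one-bloom-per-chunk layout, and the SHA-256 based hashing are all invented for the example and do not reflect how Loki stores or queries its blooms.

```python
import hashlib


class TinyBloom:
    """Toy Bloom filter (hypothetical, for illustration only)."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit set stored in a single Python integer

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def filter_chunks(chunk_blooms: dict, needle: str) -> list:
    """Keep only the chunk references whose bloom may contain the needle."""
    return [ref for ref, bloom in chunk_blooms.items() if bloom.might_contain(needle)]


bloom_a = TinyBloom()
bloom_a.add("trace_id=abc")
bloom_b = TinyBloom()  # nothing added, so it can never match

remaining = filter_chunks({"chunk-a": bloom_a, "chunk-b": bloom_b}, "trace_id=abc")
```

Because a Bloom filter has no false negatives, a chunk is only dropped when the value is certainly absent; false positives are possible, so the surviving chunks still need to be checked by the normal query path.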

The service is horizontally scalable. When running multiple instances, they use a ring to shard tenants and
distribute series fingerprints across instances.
The service is horizontally scalable. When running multiple instances, the client (Index Gateway) shards requests
across instances based on the hash of the bloom blocks that are referenced.
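
Client-side sharding by block hash could look roughly like the sketch below. The hash function (SHA-256), the modulo placement, and the names `pick_instance` and `shard_requests` are assumptions chosen for illustration; the text above does not specify which hashing scheme the Index Gateway actually uses.

```python
import hashlib
from collections import defaultdict


def pick_instance(block_ref: str, instances: list) -> str:
    """Map a bloom block reference to one instance via a stable hash (assumed scheme)."""
    if not instances:
        raise ValueError("no Bloom Gateway instances available")
    digest = hashlib.sha256(block_ref.encode("utf-8")).digest()
    return instances[int.from_bytes(digest[:8], "big") % len(instances)]


def shard_requests(block_refs: list, instances: list) -> dict:
    """Group block references by the instance that should serve them."""
    shards = defaultdict(list)
    for ref in block_refs:
        shards[pick_instance(ref, instances)].append(ref)
    return dict(shards)


# Hypothetical instance addresses and block references.
gateways = ["bloom-gateway-0", "bloom-gateway-1", "bloom-gateway-2"]
blocks = ["tenant-a/block-001", "tenant-a/block-002", "tenant-b/block-001"]
shards = shard_requests(blocks, gateways)
```

Hashing on the block reference means repeated requests for the same block land on the same instance. Plain modulo placement reshuffles most blocks when the instance count changes, which is one reason real systems often prefer consistent or jump hashing.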
