From ea4e8521ce1205edaf14b9f984f5d87c141b0dc7 Mon Sep 17 00:00:00 2001
From: Jay Clifford <45856600+Jayclifford345@users.noreply.github.com>
Date: Tue, 23 Apr 2024 11:43:47 +0100
Subject: [PATCH 1/6] feat: Updated best practises for labels

* Added Developer top tips taken from Ed's blog.
* Added reference to bloom filters
* Added reference to Alloy
---
 docs/sources/get-started/labels/bp-labels.md | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/docs/sources/get-started/labels/bp-labels.md b/docs/sources/get-started/labels/bp-labels.md
index 7800345684cee..9171806ee5fca 100644
--- a/docs/sources/get-started/labels/bp-labels.md
+++ b/docs/sources/get-started/labels/bp-labels.md
@@ -12,7 +12,7 @@ Grafana Loki is under active development, and we are constantly working to impro
 
 ## Static labels are good
 
-Things like, host, application, and environment are great labels. They will be fixed for a given system/app and have bounded values. Use static labels to make it easier to query your logs in a logical sense (e.g. show me all the logs for a given application and specific environment, or show me all the logs for all the apps on a specific host).
+Things like regions, clusters, servers, applications, namespaces, and environments. They will be fixed for a given system/app and have bounded values. Use static labels to make it easier to query your logs in a logical sense (e.g. show me all the logs for a given application and specific environment, or show me all the logs for all the apps on a specific host).
 
 ## Use dynamic labels sparingly
 
@@ -33,15 +33,28 @@ What you want to avoid is splitting a log file into streams, which result in chu
 
 It’s not critical that every chunk be full when flushed, but it will improve many aspects of operation. As such, our current guidance here is to avoid dynamic labels as much as possible and instead favor filter expressions. For example, don’t add a `level` dynamic label, just `|= "level=debug"` instead.
 
+*Developer Tip:* Below are some best practices for using dynamic labels with Loki:
+- *Ensure the labels have low cardinality, ideally limited to tens of values.*
+- *Use labels with long-lived values, such as the initial segment of an HTTP path: `/load`, `/save`, `/update`.*
+  - *Do not extract ephemeral values like a trace ID or an order ID into a label; the values should be static, not dynamic.*
+- *Only add labels that users will frequently use in their queries.*
+  - *Don’t increase the size of the index and fragment your log streams if nobody is actually using these labels. This will degrade performance.*
+
+### Bloom Filters (Experimental)
+
+As of Loki 3.0, we have also introduced [Bloom Filters]({{< relref "../../operations/query-acceleration-blooms" >}}). Loki 3.0 leverages bloom filters to speed up queries by reducing the amount of data Loki needs to load from the store and iterate through. Loki is often used to run “needle in a haystack” queries.
+
 ## Label values must always be bounded
 
 If you are dynamically setting labels, never use a label which can have unbounded or infinite values. This will always result in big problems for Loki.
 
 Try to keep values bounded to as small a set as possible. We don't have perfect guidance as to what Loki can handle, but think single digits, or maybe 10’s of values for a dynamic label. This is less critical for static labels. For example, if you have 1,000 hosts in your environment it's going to be just fine to have a host label with 1,000 values.
 
+As a general rule, you should try to keep any single tenant in Loki to less than **100,000 active streams**, and less than a million streams in a 24-hour period. These values are for HUGE tenants, sending more than **10 TB** a day. If your tenant is 10x smaller, you should have at least 10x less labels.
+
 ## Be aware of dynamic labels applied by clients
 
-Loki has several client options: [Promtail]({{< relref "../../send-data/promtail" >}}) (which also supports systemd journal ingestion and TCP-based syslog ingestion), [Fluentd]({{< relref "../../send-data/fluentd" >}}), [Fluent Bit]({{< relref "../../send-data/fluentbit" >}}), a [Docker plugin](/blog/2019/07/15/lokis-path-to-ga-docker-logging-driver-plugin-support-for-systemd/), and more!
+Loki has several client options: [Grafana Alloy](https://grafana.com/docs/alloy/latest/), [Promtail]({{< relref "../../send-data/promtail" >}}) (which also supports systemd journal ingestion and TCP-based syslog ingestion), [Fluentd]({{< relref "../../send-data/fluentd" >}}), [Fluent Bit]({{< relref "../../send-data/fluentbit" >}}), a [Docker plugin](/blog/2019/07/15/lokis-path-to-ga-docker-logging-driver-plugin-support-for-systemd/), and more!
 
 Each of these come with ways to configure what labels are applied to create log streams. But be aware of what dynamic labels might be applied. Use the Loki series API to get an idea of what your log streams look like and see if there might be ways to reduce streams and cardinality.
 
From bd64f1f9bea31ac6b1031cb3d4c459da8b3c4ca4 Mon Sep 17 00:00:00 2001
From: Jay Clifford <45856600+Jayclifford345@users.noreply.github.com>
Date: Tue, 23 Apr 2024 20:37:49 +0100
Subject: [PATCH 2/6] Update docs/sources/get-started/labels/bp-labels.md

Co-authored-by: J Stickler
---
 docs/sources/get-started/labels/bp-labels.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sources/get-started/labels/bp-labels.md b/docs/sources/get-started/labels/bp-labels.md
index 9171806ee5fca..ba67f2146c10b 100644
--- a/docs/sources/get-started/labels/bp-labels.md
+++ b/docs/sources/get-started/labels/bp-labels.md
@@ -12,7 +12,7 @@ Grafana Loki is under active development, and we are constantly working to impro
 
 ## Static labels are good
 
-Things like regions, clusters, servers, applications, namespaces, and environments. They will be fixed for a given system/app and have bounded values. Use static labels to make it easier to query your logs in a logical sense (e.g. show me all the logs for a given application and specific environment, or show me all the logs for all the apps on a specific host).
+Use labels for things like regions, clusters, servers, applications, namespaces, and environments. They will be fixed for a given system/app and have bounded values. Use static labels to make it easier to query your logs in a logical sense (for example, show me all the logs for a given application and specific environment, or show me all the logs for all the apps on a specific host).
 
 ## Use dynamic labels sparingly
 

From 31e0250ec963904e8cc08349f0cc88f9c27c6a4c Mon Sep 17 00:00:00 2001
From: Jay Clifford <45856600+Jayclifford345@users.noreply.github.com>
Date: Tue, 23 Apr 2024 20:37:55 +0100
Subject: [PATCH 3/6] Update docs/sources/get-started/labels/bp-labels.md

Co-authored-by: J Stickler
---
 docs/sources/get-started/labels/bp-labels.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/sources/get-started/labels/bp-labels.md b/docs/sources/get-started/labels/bp-labels.md
index ba67f2146c10b..403f13511a577 100644
--- a/docs/sources/get-started/labels/bp-labels.md
+++ b/docs/sources/get-started/labels/bp-labels.md
@@ -33,12 +33,12 @@ What you want to avoid is splitting a log file into streams, which result in chu
 
 It’s not critical that every chunk be full when flushed, but it will improve many aspects of operation. As such, our current guidance here is to avoid dynamic labels as much as possible and instead favor filter expressions. For example, don’t add a `level` dynamic label, just `|= "level=debug"` instead.
 
-*Developer Tip:* Below are some best practices for using dynamic labels with Loki:
-- *Ensure the labels have low cardinality, ideally limited to tens of values.*
-- *Use labels with long-lived values, such as the initial segment of an HTTP path: `/load`, `/save`, `/update`.*
-  - *Do not extract ephemeral values like a trace ID or an order ID into a label; the values should be static, not dynamic.*
-- *Only add labels that users will frequently use in their queries.*
-  - *Don’t increase the size of the index and fragment your log streams if nobody is actually using these labels. This will degrade performance.*
+Here are some best practices for using dynamic labels with Loki:
+- Ensure the labels have low cardinality, ideally limited to tens of values.
+- Use labels with long-lived values, such as the initial segment of an HTTP path: `/load`, `/save`, `/update`.
+  - Do not extract ephemeral values like a trace ID or an order ID into a label; the values should be static, not dynamic.
+- Only add labels that users will frequently use in their queries.
+  - Don’t increase the size of the index and fragment your log streams if nobody is actually using these labels. This will degrade performance.
 
 ### Bloom Filters (Experimental)
 

From 89a55915ceefeac2af5f7c19066d86da7e05979e Mon Sep 17 00:00:00 2001
From: Jay Clifford <45856600+Jayclifford345@users.noreply.github.com>
Date: Tue, 23 Apr 2024 20:38:01 +0100
Subject: [PATCH 4/6] Update docs/sources/get-started/labels/bp-labels.md

Co-authored-by: J Stickler
---
 docs/sources/get-started/labels/bp-labels.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sources/get-started/labels/bp-labels.md b/docs/sources/get-started/labels/bp-labels.md
index 403f13511a577..7a46780d8a47c 100644
--- a/docs/sources/get-started/labels/bp-labels.md
+++ b/docs/sources/get-started/labels/bp-labels.md
@@ -50,7 +50,7 @@ If you are dynamically setting labels, never use a label which can have unbounde
 
 Try to keep values bounded to as small a set as possible. We don't have perfect guidance as to what Loki can handle, but think single digits, or maybe 10’s of values for a dynamic label. This is less critical for static labels. For example, if you have 1,000 hosts in your environment it's going to be just fine to have a host label with 1,000 values.
 
-As a general rule, you should try to keep any single tenant in Loki to less than **100,000 active streams**, and less than a million streams in a 24-hour period. These values are for HUGE tenants, sending more than **10 TB** a day. If your tenant is 10x smaller, you should have at least 10x less labels.
+As a general rule, you should try to keep any single tenant in Loki to less than **100,000 active streams**, and less than a million streams in a 24-hour period. These values are for HUGE tenants, sending more than **10 TB** a day. If your tenant is 10x smaller, you should have at least 10x fewer labels.
 
 ## Be aware of dynamic labels applied by clients
 

From 3f7b1ce7a21c8f5fbc1bb3e34e03789109a4cdab Mon Sep 17 00:00:00 2001
From: Jay Clifford <45856600+Jayclifford345@users.noreply.github.com>
Date: Tue, 23 Apr 2024 20:38:32 +0100
Subject: [PATCH 5/6] Update docs/sources/get-started/labels/bp-labels.md

Co-authored-by: J Stickler
---
 docs/sources/get-started/labels/bp-labels.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sources/get-started/labels/bp-labels.md b/docs/sources/get-started/labels/bp-labels.md
index 7a46780d8a47c..dabe4bac1ba27 100644
--- a/docs/sources/get-started/labels/bp-labels.md
+++ b/docs/sources/get-started/labels/bp-labels.md
@@ -54,7 +54,7 @@ As a general rule, you should try to keep any single tenant in Loki to less than
 
 ## Be aware of dynamic labels applied by clients
 
-Loki has several client options: [Grafana Alloy](https://grafana.com/docs/alloy/latest/), [Promtail]({{< relref "../../send-data/promtail" >}}) (which also supports systemd journal ingestion and TCP-based syslog ingestion), [Fluentd]({{< relref "../../send-data/fluentd" >}}), [Fluent Bit]({{< relref "../../send-data/fluentbit" >}}), a [Docker plugin](/blog/2019/07/15/lokis-path-to-ga-docker-logging-driver-plugin-support-for-systemd/), and more!
+Loki has several client options: [Grafana Alloy](https://grafana.com/docs/alloy/latest/), [Promtail](https://grafana.com/docs/loki/<LOKI_VERSION>/send-data/promtail/) (which also supports systemd journal ingestion and TCP-based syslog ingestion), [Fluentd](https://grafana.com/docs/loki/<LOKI_VERSION>/send-data/fluentd/), [Fluent Bit](https://grafana.com/docs/loki/<LOKI_VERSION>/send-data/fluentbit/), a [Docker plugin](https://grafana.com/docs/loki/<LOKI_VERSION>/send-data/docker-driver/), and more.
 
 Each of these come with ways to configure what labels are applied to create log streams. But be aware of what dynamic labels might be applied. Use the Loki series API to get an idea of what your log streams look like and see if there might be ways to reduce streams and cardinality.
 

From 1c2dcdad4c65f93e8550d13aa347c17ef088ada9 Mon Sep 17 00:00:00 2001
From: J Stickler
Date: Wed, 24 Apr 2024 16:33:14 -0400
Subject: [PATCH 6/6] Update docs/sources/get-started/labels/bp-labels.md

---
 docs/sources/get-started/labels/bp-labels.md | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/docs/sources/get-started/labels/bp-labels.md b/docs/sources/get-started/labels/bp-labels.md
index dabe4bac1ba27..aa10867e1e61b 100644
--- a/docs/sources/get-started/labels/bp-labels.md
+++ b/docs/sources/get-started/labels/bp-labels.md
@@ -40,10 +40,6 @@ Here are some best practices for using dynamic labels with Loki:
 - Only add labels that users will frequently use in their queries.
   - Don’t increase the size of the index and fragment your log streams if nobody is actually using these labels. This will degrade performance.
 
-### Bloom Filters (Experimental)
-
-As of Loki 3.0, we have also introduced [Bloom Filters]({{< relref "../../operations/query-acceleration-blooms" >}}). Loki 3.0 leverages bloom filters to speed up queries by reducing the amount of data Loki needs to load from the store and iterate through. Loki is often used to run “needle in a haystack” queries.
-
 ## Label values must always be bounded
 
 If you are dynamically setting labels, never use a label which can have unbounded or infinite values. This will always result in big problems for Loki.
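
The cardinality guidance touched by this series ("Label values must always be bounded", plus the 100,000 active streams rule of thumb) can be illustrated with a small sketch. It is not part of the patches and is not how Loki itself counts streams. It only assumes that each distinct combination of label values forms one stream, so the product of per-label cardinalities is a worst-case upper bound. The label names and counts below are hypothetical.

```python
from math import prod

# Rule of thumb quoted in the patched docs: fewer than 100,000 active streams per tenant.
ACTIVE_STREAM_GUIDELINE = 100_000


def max_streams(label_cardinalities: dict[str, int]) -> int:
    """Worst-case stream count: the product of distinct values per label.

    This is an upper bound that assumes every combination of values occurs.
    """
    return prod(label_cardinalities.values())


# Hypothetical static labels with bounded values.
static_labels = {"cluster": 5, "namespace": 20, "app": 50}

# The same labels plus a hypothetical unbounded dynamic label (one value per request).
with_trace_id = {**static_labels, "trace_id": 1_000_000}

for name, labels in (("static only", static_labels), ("with trace_id", with_trace_id)):
    bound = max_streams(labels)
    verdict = "within" if bound <= ACTIVE_STREAM_GUIDELINE else "over"
    print(f"{name}: up to {bound:,} streams ({verdict} the guideline)")
```

In practice far fewer combinations occur than the product suggests, so this is a pessimistic bound. It still shows why a single ephemeral label such as a trace ID is better handled with a filter expression than with a label.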