Add overrides configmap to querier, update docs for new limit #1153

Merged · 8 commits · Dec 6, 2021
21 changes: 18 additions & 3 deletions .github/workflows/ci.yml
@@ -24,8 +24,8 @@ jobs:
with:
version: v1.41.1

test:
name: Test
unit-tests:
name: Test packages
runs-on: ubuntu-latest
steps:
- name: Set up Go 1.17
@@ -37,7 +37,22 @@
uses: actions/checkout@v2

- name: Test
run: make test-all
run: make test-with-cover

integration-tests:
name: Test integration e2e suite
runs-on: ubuntu-latest
steps:
- name: Set up Go 1.17
uses: actions/setup-go@v2
with:
go-version: 1.17

- name: Check out code
uses: actions/checkout@v2

- name: Test
run: make test-e2e

build:
name: Build
7 changes: 5 additions & 2 deletions Makefile
@@ -73,10 +73,13 @@ benchmark:
test-with-cover:
$(GOTEST) $(GOTEST_OPT_WITH_COVERAGE) $(ALL_PKGS)

.PHONY: test-e2e
test-e2e: docker-tempo
$(GOTEST) -v $(GOTEST_OPT) ./integration/e2e

# test-all/bench use a docker image so build it first to make sure we're up to date
.PHONY: test-all
test-all: docker-tempo test-with-cover
$(GOTEST) -v $(GOTEST_OPT) ./integration/e2e
test-all: test-with-cover test-e2e

.PHONY: test-bench
test-bench: docker-tempo
42 changes: 33 additions & 9 deletions docs/tempo/website/configuration/_index.md
@@ -717,16 +717,20 @@ overrides:

# Global ingestion limits configurations

# Specifies whether the ingestion rate limits should be applied by each instance of the distributor and ingester
# individually, or the limits are to be shared across all instances. See the "override strategies" section for an example.
[ingestion_rate_strategy: <global|local> | default = local]

# Burst size (bytes) used in ingestion.
# Results in errors like
# RATE_LIMITED: ingestion rate limit (15000000 bytes) exceeded while adding 10 bytes
# RATE_LIMITED: ingestion rate limit (20000000 bytes) exceeded while adding 10 bytes
[ingestion_burst_size_bytes: <int> | default = 20000000 (20MB) ]

# Per-user ingestion rate limit (bytes) used in ingestion.
# Results in errors like
# RATE_LIMITED: ingestion rate limit (15000000 bytes) exceeded while
# RATE_LIMITED: ingestion rate limit (15000000 bytes) exceeded while adding 10 bytes
[ingestion_rate_limit_bytes: <int> | default = 15000000 (15MB) ]

# Maximum size of a single trace in bytes. `0` to disable.
# Results in errors like
# TRACE_TOO_LARGE: max size of trace (5000000) exceeded while adding 387 bytes
@@ -735,19 +739,22 @@
# Maximum number of active traces per user, per ingester. `0` to disable.
# Results in errors like
# LIVE_TRACES_EXCEEDED: max live traces per tenant exceeded: per-user traces limit (local: 10000 global: 0 actual local: 1) exceeded
# This override limit is used by the ingester.
[max_traces_per_user: <int> | default = 10000]

# Maximum size of search data for a single trace in bytes. `0` to disable.
# From an operational perspective, the size of search data is proportional to the total size of all tags in a trace
[max_search_bytes_per_trace: <int> | default = 5000]

# Maximum size in bytes of a tag-values query. Tag-values query is used mainly to populate the autocomplete dropdown.
# Limit added to protect from tags with high cardinality or large values (like HTTP URLs or SQL queries)
# This override limit is used by the ingester and the querier.
[max_bytes_per_tag_values_query: <int> | default = 5000000 (5MB) ]

# Tenant-specific overrides

# tenant-specific overrides settings config file
# Tenant-specific overrides settings configuration file. See the "Tenant-specific overrides" section for an example.
[per_tenant_override_config: /conf/overrides.yaml]

# Ingestion strategy, default is `local`.
[ingestion_rate_strategy: <global|local>]
```
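
For reference, a minimal sketch of what the file referenced by `per_tenant_override_config` might look like; the tenant IDs and values below are illustrative only, not part of this PR (see the "Tenant-specific overrides" section for the authoritative example):

```yaml
# /conf/overrides.yaml (illustrative sketch; tenant IDs and values are made up)
overrides:
  "tenant-a":
    ingestion_rate_limit_bytes: 30000000   # raise this tenant's rate limit to 30MB
    max_traces_per_user: 50000
  "tenant-b":
    max_bytes_per_tag_values_query: 10000000  # allow larger tag-values responses for this tenant
```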


@@ -786,11 +793,28 @@ The trace limits specified by the various parameters are, by default, applied as
A limit applied at the local level ensures that each distributor can independently process traces up to the limit without affecting the limits enforced by other distributors.

However, as a cluster grows large, this can add up to a very large total volume of traces. An alternative strategy is to set a `global` trace limit that establishes a total budget for all traces across all distributors in the cluster. The global limit is averaged across all distributors by using the distributor ring.

```yaml
# /conf/tempo.yaml
overrides:
# Ingestion strategy, default is `local`.
ingestion_rate_strategy: <global|local>
[ingestion_rate_strategy: <global|local> | default = local]
```

For example, this configuration specifies that each instance of the distributor will apply a limit of `15MB/s`.

```yaml
overrides:
  ingestion_rate_strategy: local
  ingestion_rate_limit_bytes: 15000000
```

This configuration specifies that together, all distributor instances will apply a limit of `15MB/s`.
So if there are 5 instances, each instance will apply a local limit of `(15MB/s / 5) = 3MB/s`.

```yaml
overrides:
  ingestion_rate_strategy: global
  ingestion_rate_limit_bytes: 15000000
```

## Search
3 changes: 3 additions & 0 deletions integration/e2e/config-all-in-one-azurite.yaml
@@ -37,3 +37,6 @@ storage:
pool:
max_workers: 10
queue_depth: 100

overrides:
max_search_bytes_per_trace: 50_000
3 changes: 3 additions & 0 deletions integration/e2e/config-all-in-one-gcs.yaml
@@ -36,3 +36,6 @@ storage:
pool:
max_workers: 10
queue_depth: 1000

overrides:
max_search_bytes_per_trace: 50_000
3 changes: 3 additions & 0 deletions integration/e2e/config-all-in-one-s3.yaml
@@ -38,3 +38,6 @@ storage:
pool:
max_workers: 10
queue_depth: 100

overrides:
max_search_bytes_per_trace: 50_000
34 changes: 25 additions & 9 deletions integration/e2e/e2e_test.go
@@ -6,6 +6,7 @@ import (
"os"
"reflect"
"strings"
"sync"
"testing"
"time"

@@ -221,7 +222,7 @@ func TestMicroservices(t *testing.T) {
queryAndAssertTrace(t, apiClient, info)

// stop an ingester and confirm we can still write and query
err = tempoIngester2.Stop()
err = tempoIngester2.Kill()
require.NoError(t, err)

// sleep for heartbeat timeout
@@ -237,7 +238,7 @@
searchAndAssertTrace(t, apiClient, info)

// stop another ingester and confirm things fail
err = tempoIngester1.Stop()
err = tempoIngester1.Kill()
require.NoError(t, err)

require.Error(t, info.EmitBatches(c))
@@ -251,12 +252,27 @@ func TestScalableSingleBinary(t *testing.T) {
minio := cortex_e2e_db.NewMinio(9000, "tempo")
require.NotNil(t, minio)
require.NoError(t, s.StartAndWaitReady(minio))
//

// copy configuration file over to shared dir
require.NoError(t, util.CopyFileToSharedDir(s, configHA, "config.yaml"))
tempo1 := util.NewTempoScalableSingleBinary(1)
tempo2 := util.NewTempoScalableSingleBinary(2)
tempo3 := util.NewTempoScalableSingleBinary(3)

// start three scalable single binary tempos in parallel
var wg sync.WaitGroup
var tempo1, tempo2, tempo3 *cortex_e2e.HTTPService
wg.Add(3)
go func() {
tempo1 = util.NewTempoScalableSingleBinary(1)
wg.Done()
}()
go func() {
tempo2 = util.NewTempoScalableSingleBinary(2)
wg.Done()
}()
go func() {
tempo3 = util.NewTempoScalableSingleBinary(3)
wg.Done()
}()
wg.Wait()
require.NoError(t, s.StartAndWaitReady(tempo1, tempo2, tempo3))

// wait for 2 active ingesters
@@ -314,16 +330,16 @@

queryAndAssertTrace(t, apiClient1, info)

err = tempo1.Stop()
err = tempo1.Kill()
require.NoError(t, err)

// Push to one of the instances that are still running.
require.NoError(t, info.EmitBatches(c2))

err = tempo2.Stop()
err = tempo2.Kill()
require.NoError(t, err)

err = tempo3.Stop()
err = tempo3.Kill()
require.NoError(t, err)
}

3 changes: 3 additions & 0 deletions operations/jsonnet/microservices/querier.libsonnet
@@ -10,6 +10,7 @@

local target_name = 'querier',
local tempo_config_volume = 'tempo-conf',
local tempo_overrides_config_volume = 'overrides',

tempo_querier_container::
container.new(target_name, $._images.tempo) +
@@ -23,6 +24,7 @@
]) +
container.withVolumeMounts([
volumeMount.new(tempo_config_volume, '/conf'),
volumeMount.new(tempo_overrides_config_volume, '/overrides'),
]) +
$.util.withResources($._config.querier.resources) +
$.util.readinessProbe,
@@ -44,5 +46,6 @@
}) +
deployment.mixin.spec.template.spec.withVolumes([
volume.fromConfigMap(tempo_config_volume, $.tempo_querier_configmap.metadata.name),
volume.fromConfigMap(tempo_overrides_config_volume, $._config.overrides_configmap_name),
]),
}
5 changes: 5 additions & 0 deletions operations/kube-manifests/Deployment-querier.yaml
@@ -52,7 +52,12 @@ spec:
volumeMounts:
- mountPath: /conf
name: tempo-conf
- mountPath: /overrides
name: overrides
volumes:
- configMap:
name: tempo-querier
name: tempo-conf
- configMap:
name: tempo-overrides
name: overrides
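
With the `tempo-overrides` configmap mounted at `/overrides`, the querier can pick up the same per-tenant limits (including the new `max_bytes_per_tag_values_query`) that the ingester uses. A minimal sketch of the corresponding Tempo configuration, assuming the configmap key is `overrides.yaml` (the exact filename is not shown in this diff):

```yaml
# Sketch only; the overrides filename inside the mounted configmap is an assumption.
overrides:
  per_tenant_override_config: /overrides/overrides.yaml
```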
4 changes: 2 additions & 2 deletions operations/kube-manifests/util/jsonnetfile.lock.json
@@ -8,7 +8,7 @@
"subdir": "ksonnet-util"
}
},
"version": "ad31de265551c5577fed96d4f5a8b818027a2d85",
"version": "9927be87af4be9ff6b009e4503868b1b5493011b",
"sum": "fFVlCoa/N0qiqTbDhZAEdRm2Vv76Z9Clxp3/haJ+PyA="
},
{
@@ -18,7 +18,7 @@
"subdir": "memcached"
}
},
"version": "ad31de265551c5577fed96d4f5a8b818027a2d85",
"version": "9927be87af4be9ff6b009e4503868b1b5493011b",
"sum": "dTOeEux3t9bYSqP2L/uCuLo/wUDpCKH4w+4OD9fePUk="
},
{