Skip to content

Commit

Permalink
Merge branch 'main' of github.com:thanos-io/thanos into limits-per-te…
Browse files Browse the repository at this point in the history
…nant-config-file
  • Loading branch information
douglascamata committed Aug 29, 2022
2 parents e5c171d + 06e5c94 commit 246ea5c
Show file tree
Hide file tree
Showing 56 changed files with 5,487 additions and 2,191 deletions.
10 changes: 5 additions & 5 deletions .busybox-versions
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Auto generated by busybox-updater.sh. DO NOT EDIT
amd64=60ded79a99eb70aa36d57c598707c961ffa4f9f7b63237823a780eaf6d437a78
arm64=d2e2aa8b7fd3d11412c4918ed35bcabc13027ed4812932a94b29940805a87176
arm=53e81f95dc293546ed76257636bd4ec60cba74297ea637aefd37fe9446b66a59
ppc64le=d3f30bf50b2d6e7097ad02a1c3d64c5ffe535b969d80d24bb9543391122d4b8c
s390x=c12940cb7882554a5cb5c582facc2a01a9f30447c1aa067c9c1cca7c9f16c7ee
amd64=d8d3654786836cad8c09543704807c7a6d75de53b9e9cd21a1bbd8cb1a607004
arm64=a3435ee186dbf88238388c112761488ecd2c264dbff8957ab73f804be62a9080
arm=b063a2176f23a13007de5c447ab3552f8e355162ac54fc2a545b00b612d4c81e
ppc64le=203c3f97bc34c4d5df50bd61beaa397f2a4c7cbd470c84fe7ec3db12409435d3
s390x=1a6eb305bd08bd1d38cb85a097ad776a78dd72b7c1a35094bb080788a39b174c
9 changes: 9 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,12 +9,16 @@ NOTE: As semantic versioning states all 0.y.z releases can contain breaking chan
We use *breaking :warning:* to mark changes that are not backward compatible (relates only to v0.y.z releases.)

## Unreleased
- [#5607](https://github.com/thanos-io/thanos/pull/5607) Query: Support custom lookback delta from request in query api.
- [#5453](https://github.com/thanos-io/thanos/pull/5453) Compact: Skip erroneous empty non `*AggrChunk` chunks during 1h downsampling of 5m resolution blocks.

### Fixed
- [#5502](https://github.com/thanos-io/thanos/pull/5502) Receive: Handle exemplar storage errors as conflict error.
- [#5534](https://github.com/thanos-io/thanos/pull/5534) Query: Set struct return by query api alerts same as prometheus api.
- [#5554](https://github.com/thanos-io/thanos/pull/5554) Query/Receiver: Fix querying exemplars from multi-tenant receivers.
- [#5583](https://github.com/thanos-io/thanos/pull/5583) Query: fix data race between Respond() and query/queryRange functions. Fixes [#5410](https://github.com/thanos-io/thanos/pull/5410).
- [#5642](https://github.com/thanos-io/thanos/pull/5642) Receive: Log labels correctly in writer debug messages.
- [#5655](https://github.com/thanos-io/thanos/pull/5655) Receive: Fix recreating already pruned tenants.

### Added

Expand All @@ -25,12 +29,16 @@ We use *breaking :warning:* to mark changes that are not backward compatible (re
- [#5475](https://github.com/thanos-io/thanos/pull/5475) Compact/Store: Added `--block-files-concurrency` allowing to configure number of go routines for download/upload block files during compaction.
- [#5470](https://github.com/thanos-io/thanos/pull/5470) Receive: Implement exposing TSDB stats for all tenants
- [#5493](https://github.com/thanos-io/thanos/pull/5493) Compact: Added `--compact.blocks-fetch-concurrency` allowing to configure number of go routines for download blocks during compactions.
- [#5480](https://github.com/thanos-io/thanos/pull/5480) Query: Expose endpoint info timeout as a hidden flag.
- [#5527](https://github.com/thanos-io/thanos/pull/5527) Receive: Add per request limits for remote write.
- [#5520](https://github.com/thanos-io/thanos/pull/5520) Receive: Meta-monitoring based active series limiting.
- [#5555](https://github.com/thanos-io/thanos/pull/5555) Query: Added `--query.active-query-path` flag, allowing the user to configure the directory to create an active query tracking file, `queries.active`, for different resolution.
- [#5566](https://github.com/thanos-io/thanos/pull/5566) Receive: Added experimental support to enable chunk write queue via `--tsdb.write-queue-size` flag.
- [#5575](https://github.com/thanos-io/thanos/pull/5575) Receive: Add support for gRPC compression with snappy.
- [#5508](https://github.com/thanos-io/thanos/pull/5508) Receive: Validate labels in write requests.
- [#5439](https://github.com/thanos-io/thanos/pull/5439) Mixin: Add Alert ThanosQueryOverload to Mixin.
- [#5439](https://github.com/thanos-io/thanos/pull/5439) Add Alert ThanosQueryOverload to Mixin.
- [#5561](https://github.com/thanos-io/thanos/pull/5561) Query Frontend: Support instant query sharding.
- [#5565](https://github.com/thanos-io/thanos/pull/5565) Receive: Allow remote write request limits to be defined per file and tenant.

### Changed
Expand All @@ -39,6 +47,7 @@ We use *breaking :warning:* to mark changes that are not backward compatible (re
- [#5451](https://github.com/thanos-io/thanos/pull/5451) Azure: Reduce memory usage by not buffering file downloads entirely in memory.
- [#5484](https://github.com/thanos-io/thanos/pull/5484) Update Prometheus deps to v2.36.2.
- [#5511](https://github.com/thanos-io/thanos/pull/5511) Update Prometheus deps to v2.37.0.
- [#5588](https://github.com/thanos-io/thanos/pull/5588) Store: improve index header reading performance by sorting values first.

### Removed

Expand Down
84 changes: 26 additions & 58 deletions cmd/thanos/query.go
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,6 @@ import (
"fmt"
"math"
"net/http"
"path/filepath"
"strconv"
"strings"
"time"

Expand Down Expand Up @@ -147,6 +145,8 @@ func registerQuery(app *extkingpin.App) {

unhealthyStoreTimeout := extkingpin.ModelDuration(cmd.Flag("store.unhealthy-timeout", "Timeout before an unhealthy store is cleaned from the store UI page.").Default("5m"))

endpointInfoTimeout := extkingpin.ModelDuration(cmd.Flag("endpoint.info-timeout", "Timeout of gRPC Info requests.").Default("5s").Hidden())

enableAutodownsampling := cmd.Flag("query.auto-downsampling", "Enable automatic adjustment (step / 5) to what source of data should be used in store gateways if no max_source_resolution param is specified.").
Default("false").Bool()

Expand Down Expand Up @@ -279,6 +279,7 @@ func registerQuery(app *extkingpin.App) {
time.Duration(*dnsSDInterval),
*dnsSDResolver,
time.Duration(*unhealthyStoreTimeout),
time.Duration(*endpointInfoTimeout),
time.Duration(*instantDefaultMaxSourceResolution),
*defaultMetadataTimeRange,
*strictStores,
Expand Down Expand Up @@ -347,6 +348,7 @@ func runQuery(
dnsSDInterval time.Duration,
dnsSDResolver string,
unhealthyStoreTimeout time.Duration,
endpointInfoTimeout time.Duration,
instantDefaultMaxSourceResolution time.Duration,
defaultMetadataTimeRange time.Duration,
strictStores []string,
Expand Down Expand Up @@ -459,6 +461,7 @@ func runQuery(
},
dialOpts,
unhealthyStoreTimeout,
endpointInfoTimeout,
)
proxy = store.NewProxyStore(logger, reg, endpoints.GetStoreClients, component.Query, selectorLset, storeResponseTimeout)
rulesProxy = rules.NewProxy(logger, endpoints.GetRulesClients)
Expand Down Expand Up @@ -581,8 +584,14 @@ func runQuery(
grpcProbe,
prober.NewInstrumentation(comp, logger, extprom.WrapRegistererWithPrefix("thanos_", reg)),
)
engineCreator := engineFactory(promql.NewEngine, engineOpts, dynamicLookbackDelta, activeQueryDir,
maxConcurrentQueries, logger)

// An active query tracker will be added only if the user specifies a non-default path.
// Otherwise, the nil active query tracker from existing engine options will be used.
if activeQueryDir != "" {
engineOpts.ActiveQueryTracker = promql.NewActiveQueryTracker(activeQueryDir, maxConcurrentQueries, logger)
}
engine := promql.NewEngine(engineOpts)
lookbackDeltaCreator := LookbackDeltaFactory(engineOpts, dynamicLookbackDelta)

// Start query API + UI HTTP server.
{
Expand Down Expand Up @@ -612,7 +621,8 @@ func runQuery(
api := apiv1.NewQueryAPI(
logger,
endpoints.GetEndpointStatus,
engineCreator,
engine,
lookbackDeltaCreator,
queryableCreator,
// NOTE: Will share the same replica label as the query for now.
rules.NewGRPCClientWithDedup(rulesProxy, queryReplicaLabels),
Expand Down Expand Up @@ -687,7 +697,7 @@ func runQuery(
info.WithQueryAPIInfoFunc(),
)

grpcAPI := apiv1.NewGRPCAPI(time.Now, queryReplicaLabels, queryableCreator, engineCreator, instantDefaultMaxSourceResolution)
grpcAPI := apiv1.NewGRPCAPI(time.Now, queryReplicaLabels, queryableCreator, engine, lookbackDeltaCreator, instantDefaultMaxSourceResolution)
s := grpcserver.New(logger, reg, tracer, grpcLogOpts, tagOpts, comp, grpcProbe,
grpcserver.WithServer(apiv1.RegisterQueryServer(grpcAPI)),
grpcserver.WithServer(store.RegisterStoreServer(proxy)),
Expand Down Expand Up @@ -732,82 +742,40 @@ func removeDuplicateEndpointSpecs(logger log.Logger, duplicatedStores prometheus
return deduplicated
}

// engineFactory creates from 1 to 3 promql.Engines depending on
// LookbackDeltaFactory creates from 1 to 3 lookback deltas depending on
// dynamicLookbackDelta and eo.LookbackDelta and returns a function
// that returns appropriate engine for given maxSourceResolutionMillis.
//
// TODO: it seems like a good idea to tweak Prometheus itself
// instead of creating several Engines here.
func engineFactory(
newEngine func(promql.EngineOpts) *promql.Engine,
// that returns appropriate lookback delta for given maxSourceResolutionMillis.
func LookbackDeltaFactory(
eo promql.EngineOpts,
dynamicLookbackDelta bool,
activeQueryDir string,
maxConcurrentQueries int,
logger log.Logger,
) func(int64) *promql.Engine {
) func(int64) time.Duration {
resolutions := []int64{downsample.ResLevel0}
if dynamicLookbackDelta {
resolutions = []int64{downsample.ResLevel0, downsample.ResLevel1, downsample.ResLevel2}
}
var (
engines = make([]*promql.Engine, len(resolutions))
ld = eo.LookbackDelta.Milliseconds()
lds = make([]time.Duration, len(resolutions))
ld = eo.LookbackDelta.Milliseconds()
)
wrapReg := func(engineNum int) prometheus.Registerer {
return extprom.WrapRegistererWith(map[string]string{"engine": strconv.Itoa(engineNum)}, eo.Reg)
}

lookbackDelta := eo.LookbackDelta
for i, r := range resolutions {
if ld < r {
lookbackDelta = time.Duration(r) * time.Millisecond
}

newEngineOpts := promql.EngineOpts{
Logger: eo.Logger,
Reg: wrapReg(i),
MaxSamples: eo.MaxSamples,
Timeout: eo.Timeout,
LookbackDelta: lookbackDelta,
NoStepSubqueryIntervalFn: eo.NoStepSubqueryIntervalFn,
EnableAtModifier: eo.EnableAtModifier,
EnableNegativeOffset: eo.EnableNegativeOffset,
}
// An active query tracker will be added only if the user specifies a non-default path.
// Otherwise, the nil active query tracker from existing engine options will be used.
if activeQueryDir != "" {
resActiveQueryDir := filepath.Join(activeQueryDir, getActiveQueryDirBasedOnResolution(r))
newEngineOpts.ActiveQueryTracker = promql.NewActiveQueryTracker(resActiveQueryDir, maxConcurrentQueries, logger)
} else {
newEngineOpts.ActiveQueryTracker = eo.ActiveQueryTracker
}

engines[i] = newEngine(newEngineOpts)
lds[i] = lookbackDelta
}
return func(maxSourceResolutionMillis int64) *promql.Engine {
return func(maxSourceResolutionMillis int64) time.Duration {
for i := len(resolutions) - 1; i >= 1; i-- {
left := resolutions[i-1]
if resolutions[i-1] < ld {
left = ld
}
if left < maxSourceResolutionMillis {
return engines[i]
return lds[i]
}
}
return engines[0]
}
}

func getActiveQueryDirBasedOnResolution(resolution int64) string {
if resolution == downsample.ResLevel0 {
return "raw"
}
if resolution == downsample.ResLevel1 {
return "5m"
}
if resolution == downsample.ResLevel2 {
return "1h"
return lds[0]
}
return ""
}
79 changes: 31 additions & 48 deletions cmd/thanos/query_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -7,31 +7,15 @@ import (
"testing"
"time"

"github.com/go-kit/log"
"github.com/prometheus/prometheus/promql"

"github.com/thanos-io/thanos/pkg/testutil"
)

func TestEngineFactory(t *testing.T) {
var (
engineRaw = promql.NewEngine(promql.EngineOpts{})
engine5m = promql.NewEngine(promql.EngineOpts{LookbackDelta: 5 * time.Minute})
engine1h = promql.NewEngine(promql.EngineOpts{LookbackDelta: 1 * time.Hour})
)
mockNewEngine := func(opts promql.EngineOpts) *promql.Engine {
switch opts.LookbackDelta {
case 1 * time.Hour:
return engine1h
case 5 * time.Minute:
return engine5m
default:
return engineRaw
}
}
func TestLookbackDeltaFactory(t *testing.T) {
type testCase struct {
stepMillis int64
expect *promql.Engine
expect time.Duration
}
var (
minute = time.Minute.Milliseconds()
Expand All @@ -46,67 +30,66 @@ func TestEngineFactory(t *testing.T) {
lookbackDelta: 0,
dynamicLookbackDelta: false,
tcs: []testCase{
{0, engineRaw},
{5 * minute, engineRaw},
{1 * hour, engineRaw},
{0, time.Duration(0)},
{5 * minute, time.Duration(0)},
{1 * hour, time.Duration(0)},
},
},
{

lookbackDelta: 3 * time.Minute,
dynamicLookbackDelta: true,
tcs: []testCase{
{2 * minute, engineRaw},
{3 * minute, engineRaw},
{4 * minute, engine5m},
{5 * minute, engine5m},
{6 * minute, engine1h},
{1 * hour, engine1h},
{2 * hour, engine1h},
{2 * minute, time.Duration(3) * time.Minute},
{3 * minute, time.Duration(3) * time.Minute},
{4 * minute, time.Duration(5) * time.Minute},
{5 * minute, time.Duration(5) * time.Minute},
{6 * minute, time.Duration(1) * time.Hour},
{1 * hour, time.Duration(1) * time.Hour},
{2 * hour, time.Duration(1) * time.Hour},
},
},
{
lookbackDelta: 5 * time.Minute,
dynamicLookbackDelta: true,
tcs: []testCase{
{0, engine5m},
{5 * minute, engine5m},
{6 * minute, engine1h},
{59 * minute, engine1h},
{1 * hour, engine1h},
{2 * hour, engine1h},
{0, time.Duration(5) * time.Minute},
{5 * minute, time.Duration(5) * time.Minute},
{6 * minute, time.Duration(1) * time.Hour},
{59 * minute, time.Duration(1) * time.Hour},
{1 * hour, time.Duration(1) * time.Hour},
{2 * hour, time.Duration(1) * time.Hour},
},
},
{
lookbackDelta: 30 * time.Minute,
dynamicLookbackDelta: true,
tcs: []testCase{
{0, engineRaw},
{5 * minute, engineRaw},
{30 * minute, engineRaw},
{31 * minute, engine1h},
{59 * minute, engine1h},
{1 * hour, engine1h},
{2 * hour, engine1h},
{0, time.Duration(30) * time.Minute},
{5 * minute, time.Duration(30) * time.Minute},
{30 * minute, time.Duration(30) * time.Minute},
{31 * minute, time.Duration(1) * time.Hour},
{59 * minute, time.Duration(1) * time.Hour},
{1 * hour, time.Duration(1) * time.Hour},
{2 * hour, time.Duration(1) * time.Hour},
},
},
{
lookbackDelta: 1 * time.Hour,
dynamicLookbackDelta: true,
tcs: []testCase{
{0, engine1h},
{5 * minute, engine1h},
{1 * hour, engine1h},
{2 * hour, engine1h},
{0, time.Duration(1) * time.Hour},
{5 * minute, time.Duration(1) * time.Hour},
{1 * hour, time.Duration(1) * time.Hour},
{2 * hour, time.Duration(1) * time.Hour},
},
},
}
)
for _, td := range tData {
e := engineFactory(mockNewEngine, promql.EngineOpts{LookbackDelta: td.lookbackDelta}, td.dynamicLookbackDelta,
"", 1, log.NewNopLogger())
lookbackCreate := LookbackDeltaFactory(promql.EngineOpts{LookbackDelta: td.lookbackDelta}, td.dynamicLookbackDelta)
for _, tc := range td.tcs {
got := e(tc.stepMillis)
got := lookbackCreate(tc.stepMillis)
testutil.Equals(t, tc.expect, got)
}
}
Expand Down
Loading

0 comments on commit 246ea5c

Please sign in to comment.