fix(loki): allow global and per tenant sigv4 config #6358

Merged: 1 commit, Jun 13, 2022
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -77,6 +77,7 @@
* [6317](https://github.com/grafana/loki/pull/6317/files) **dannykoping**: General: add cache usage statistics

##### Fixes
* [6358](https://github.com/grafana/loki/pull/6358) **taharah**: Fixes sigv4 authentication for the Ruler's remote write configuration by allowing both a global and a per-tenant configuration.
* [6152](https://github.com/grafana/loki/pull/6152) **slim-bean**: Fixes unbounded ingester memory growth when live tailing under specific circumstances.
* [5685](https://github.com/grafana/loki/pull/5685) **chaudum**: Assert that push values tuples consist of string values
##### Changes
42 changes: 27 additions & 15 deletions docs/sources/configuration/_index.md
@@ -562,21 +562,7 @@ remote_write:
# Optionally configures AWS's Signature Version 4 signing process to
# sign requests. Cannot be set at the same time as basic_auth, authorization, or oauth2.
# To use the default credentials from the AWS SDK, use `sigv4: {}`.
- sigv4:
-   # The AWS region. If blank, the region from the default credentials chain
-   # is used.
-   [region: <string>]
-
-   # The AWS API keys. If blank, the environment variables `AWS_ACCESS_KEY_ID`
-   # and `AWS_SECRET_ACCESS_KEY` are used.
-   [access_key: <string>]
-   [secret_key: <secret>]
-
-   # Named AWS profile used to authenticate.
-   [profile: <string>]
-
-   # AWS Role ARN, an alternative to using AWS API keys.
-   [role_arn: <string>]
+ [sigv4: <sigv4_config>]

# Configures the remote write request's TLS settings.
tls_config:
@@ -2362,6 +2348,10 @@ The `limits_config` block configures global and per-tenant limits in Loki.
# This is experimental and might change in the future.
[ruler_remote_write_queue_retry_on_ratelimit: <boolean>]

# Configures AWS's Signature Version 4 signing process to
# sign every remote write request.
[ruler_remote_write_sigv4_config: <sigv4_config>]

# Limit queries that can be sharded.
# Queries within the time range of now and now minus this sharding lookback
# are not sharded. The default value of 0s disables the lookback, causing
@@ -2375,6 +2365,28 @@ The `limits_config` block configures global and per-tenant limits in Loki.
[split_queries_by_interval: <duration> | default = 30m]
```
## sigv4_config
The `sigv4_config` block configures AWS's Signature Version 4 signing process to
sign every remote write request.

```yaml
# The AWS region. If blank, the region from the default credentials chain
# is used.
[region: <string>]

# The AWS API keys. If blank, the environment variables `AWS_ACCESS_KEY_ID`
# and `AWS_SECRET_ACCESS_KEY` are used.
[access_key: <string>]
[secret_key: <secret>]

# Named AWS profile used to authenticate.
[profile: <string>]

# AWS Role ARN, an alternative to using AWS API keys.
[role_arn: <string>]
```
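
The comments above say that blank API keys fall back to the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables. The following is a minimal sketch of that fallback only, not the AWS SDK's actual credential chain, which consults more sources than this. The `Credentials` type and the `resolveKeys` helper are hypothetical names introduced for illustration, and treating the two keys as a pair is a simplifying assumption.

```go
package main

import (
	"fmt"
	"os"
)

// Credentials is a hypothetical pair of AWS API keys (not a Loki or AWS SDK type).
type Credentials struct {
	AccessKey string
	SecretKey string
}

// resolveKeys is an illustrative helper. Explicitly configured keys win.
// If both are blank, the two environment variables named in the docs are
// read through lookup (pass os.Getenv in real use).
func resolveKeys(accessKey, secretKey string, lookup func(string) string) Credentials {
	if accessKey == "" && secretKey == "" {
		return Credentials{
			AccessKey: lookup("AWS_ACCESS_KEY_ID"),
			SecretKey: lookup("AWS_SECRET_ACCESS_KEY"),
		}
	}
	return Credentials{AccessKey: accessKey, SecretKey: secretKey}
}

func main() {
	// No keys configured: fall back to the process environment.
	creds := resolveKeys("", "", os.Getenv)
	// Report only whether a key was found, never the secret itself.
	fmt.Println("access key found in environment:", creds.AccessKey != "")
}
```

Passing the lookup function as a parameter keeps the helper easy to exercise without mutating the process environment.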
### grpc_client_config
The `grpc_client_config` block configures a client connection to a gRPC service.
6 changes: 4 additions & 2 deletions go.mod
@@ -112,7 +112,10 @@ require (
k8s.io/klog v1.0.0
)

- require github.com/willf/bloom v2.0.3+incompatible
+ require (
+ 	github.com/prometheus/common/sigv4 v0.1.0
+ 	github.com/willf/bloom v2.0.3+incompatible
+ )

require (
cloud.google.com/go v0.100.2 // indirect
@@ -228,7 +231,6 @@ require (
github.com/pierrec/lz4 v2.6.1+incompatible // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/prometheus/alertmanager v0.23.1-0.20210914172521-e35efbddb66a // indirect
- github.com/prometheus/common/sigv4 v0.1.0 // indirect
github.com/prometheus/node_exporter v1.0.0-rc.0.0.20200428091818-01054558c289 // indirect
github.com/prometheus/procfs v0.7.3 // indirect
github.com/rcrowley/go-metrics v0.0.0-20201227073835-cf1acfcdf475 // indirect
2 changes: 2 additions & 0 deletions pkg/ruler/compat.go
@@ -12,6 +12,7 @@ import (
"github.com/pkg/errors"
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/common/model"
"github.com/prometheus/common/sigv4"
"github.com/prometheus/prometheus/model/labels"
"github.com/prometheus/prometheus/model/rulefmt"
"github.com/prometheus/prometheus/model/timestamp"
@@ -49,6 +50,7 @@ type RulesLimits interface {
RulerRemoteWriteQueueMinBackoff(userID string) time.Duration
RulerRemoteWriteQueueMaxBackoff(userID string) time.Duration
RulerRemoteWriteQueueRetryOnRateLimit(userID string) bool
RulerRemoteWriteSigV4Config(userID string) *sigv4.SigV4Config
}

// engineQueryFunc returns a new query function using the rules.EngineQueryFunc function
5 changes: 4 additions & 1 deletion pkg/ruler/registry.go
@@ -229,7 +229,6 @@ func (r *walRegistry) getTenantRemoteWriteConfig(tenant string, base RemoteWrite
// TODO(dannyk): configure HTTP client overrides
// metadata is only used by prometheus scrape configs
overrides.Client.MetadataConfig = config.MetadataConfig{Send: false}
- overrides.Client.SigV4Config = nil

if r.overrides.RulerRemoteWriteDisabled(tenant) {
overrides.Enabled = false
@@ -296,6 +295,10 @@ func (r *walRegistry) getTenantRemoteWriteConfig(tenant string, base RemoteWrite
overrides.Client.QueueConfig.RetryOnRateLimit = v
}

if v := r.overrides.RulerRemoteWriteSigV4Config(tenant); v != nil {
	overrides.Client.SigV4Config = v
}

return overrides, nil
}
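
The change to `getTenantRemoteWriteConfig` amounts to a simple precedence rule: the globally configured SigV4 settings are no longer reset to `nil`, and a per-tenant value replaces them only when one is set. A minimal sketch of that rule follows, assuming a stand-in `SigV4Config` struct that mirrors only the `Region` field of `sigv4.SigV4Config`, and a hypothetical `resolveSigV4` helper that does not exist in Loki.

```go
package main

import "fmt"

// SigV4Config is a hypothetical stand-in for sigv4.SigV4Config from
// github.com/prometheus/common/sigv4; only Region is mirrored here.
type SigV4Config struct {
	Region string
}

// resolveSigV4 is an illustrative helper (not part of Loki). It returns the
// per-tenant config when one is set and otherwise keeps the global one.
func resolveSigV4(global, tenant *SigV4Config) *SigV4Config {
	if tenant != nil {
		return tenant
	}
	return global
}

func main() {
	global := &SigV4Config{Region: "us-east-1"}
	tenant := &SigV4Config{Region: "us-east-2"}

	// A tenant with an override gets its own region.
	fmt.Println(resolveSigV4(global, tenant).Region)
	// A tenant without an override inherits the global region.
	fmt.Println(resolveSigV4(global, nil).Region)
	// With neither set, the result is nil and no signing is configured.
	fmt.Println(resolveSigV4(nil, nil) == nil)
}
```

The regions match the `sigV4GlobalRegion` and `sigV4TenantRegion` constants used by the tests in this pull request.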

51 changes: 51 additions & 0 deletions pkg/ruler/registry_test.go
@@ -11,6 +11,7 @@ import (
"github.com/go-kit/log"
promConfig "github.com/prometheus/common/config"
"github.com/prometheus/common/model"
"github.com/prometheus/common/sigv4"
"github.com/prometheus/prometheus/config"
"github.com/prometheus/prometheus/model/relabel"
"github.com/stretchr/testify/assert"
@@ -31,6 +32,9 @@ const customRelabelsTenant = "custom-relabels"
const badRelabelsTenant = "bad-relabels"
const nilRelabelsTenant = "nil-relabels"
const emptySliceRelabelsTenant = "empty-slice-relabels"
const sigV4ConfigTenant = "sigv4"
const sigV4GlobalRegion = "us-east-1"
const sigV4TenantRegion = "us-east-2"

const defaultCapacity = 1000

@@ -80,6 +84,11 @@ func newFakeLimits() fakeLimits {
},
},
},
sigV4ConfigTenant: {
RulerRemoteWriteSigV4Config: &sigv4.SigV4Config{
Region: sigV4TenantRegion,
},
},
},
}
}
@@ -134,6 +143,19 @@ func setupRegistry(t *testing.T) *walRegistry {
return reg.(*walRegistry)
}

func setupSigV4Registry(t *testing.T) *walRegistry {
	// Get the global config and override it
	reg := setupRegistry(t)

	// Remove the basic auth config and replace with sigv4
	reg.config.RemoteWrite.Client.HTTPClientConfig.BasicAuth = nil
	reg.config.RemoteWrite.Client.SigV4Config = &sigv4.SigV4Config{
		Region: sigV4GlobalRegion,
	}

	return reg
}

func TestTenantRemoteWriteConfigWithOverride(t *testing.T) {
reg := setupRegistry(t)

@@ -159,6 +181,35 @@ func TestTenantRemoteWriteConfigWithoutOverride(t *testing.T) {
assert.Equal(t, tenantCfg.RemoteWrite[0].QueueConfig.Capacity, defaultCapacity)
}

func TestRulerRemoteWriteSigV4ConfigWithOverrides(t *testing.T) {
	reg := setupSigV4Registry(t)

	tenantCfg, err := reg.getTenantConfig(sigV4ConfigTenant)
	require.NoError(t, err)

	// tenant has not disabled remote-write, so it will inherit the global config
	assert.Len(t, tenantCfg.RemoteWrite, 1)
	// ensure the sigv4 config is not nil and was overwritten by the tenant value
	if assert.NotNil(t, tenantCfg.RemoteWrite[0].SigV4Config) {
		assert.Equal(t, tenantCfg.RemoteWrite[0].SigV4Config.Region, sigV4TenantRegion)
	}
}

func TestRulerRemoteWriteSigV4ConfigWithoutOverrides(t *testing.T) {
	reg := setupSigV4Registry(t)

	// this tenant has no overrides, so it will get the defaults
	tenantCfg, err := reg.getTenantConfig("unknown")
	require.NoError(t, err)

	// tenant has not disabled remote-write, so it will inherit the global config
	assert.Len(t, tenantCfg.RemoteWrite, 1)
	// ensure the sigv4 config is not nil and holds the global value
	if assert.NotNil(t, tenantCfg.RemoteWrite[0].SigV4Config) {
		assert.Equal(t, tenantCfg.RemoteWrite[0].SigV4Config.Region, sigV4GlobalRegion)
	}
}

func TestTenantRemoteWriteConfigDisabled(t *testing.T) {
reg := setupRegistry(t)

6 changes: 6 additions & 0 deletions pkg/validation/limits.go
@@ -9,6 +9,7 @@ import (

"github.com/pkg/errors"
"github.com/prometheus/common/model"
"github.com/prometheus/common/sigv4"
"github.com/prometheus/prometheus/model/labels"
"golang.org/x/time/rate"
"gopkg.in/yaml.v2"
@@ -108,6 +109,7 @@ type Limits struct {
RulerRemoteWriteQueueMinBackoff time.Duration `yaml:"ruler_remote_write_queue_min_backoff" json:"ruler_remote_write_queue_min_backoff"`
RulerRemoteWriteQueueMaxBackoff time.Duration `yaml:"ruler_remote_write_queue_max_backoff" json:"ruler_remote_write_queue_max_backoff"`
RulerRemoteWriteQueueRetryOnRateLimit bool `yaml:"ruler_remote_write_queue_retry_on_ratelimit" json:"ruler_remote_write_queue_retry_on_ratelimit"`
RulerRemoteWriteSigV4Config *sigv4.SigV4Config `yaml:"ruler_remote_write_sigv4_config" json:"ruler_remote_write_sigv4_config"`

// Global and per tenant retention
RetentionPeriod model.Duration `yaml:"retention_period" json:"retention_period"`
@@ -512,6 +514,10 @@ func (o *Overrides) RulerRemoteWriteQueueRetryOnRateLimit(userID string) bool {
return o.getOverridesForUser(userID).RulerRemoteWriteQueueRetryOnRateLimit
}

func (o *Overrides) RulerRemoteWriteSigV4Config(userID string) *sigv4.SigV4Config {
	return o.getOverridesForUser(userID).RulerRemoteWriteSigV4Config
}

// RetentionPeriod returns the retention period for a given user.
func (o *Overrides) RetentionPeriod(userID string) time.Duration {
return time.Duration(o.getOverridesForUser(userID).RetentionPeriod)
5 changes: 5 additions & 0 deletions pkg/validation/limits_test.go
@@ -63,6 +63,8 @@ split_queries_by_interval: 190s
ruler_evaluation_delay_duration: 200s
ruler_max_rules_per_rule_group: 210
ruler_max_rule_groups_per_tenant: 220
ruler_remote_write_sigv4_config:
region: us-east-1
per_tenant_override_config: ""
per_tenant_override_period: 230s
`
@@ -96,6 +98,9 @@ per_tenant_override_period: 230s
"ruler_evaluation_delay_duration": "200s",
"ruler_max_rules_per_rule_group": 210,
"ruler_max_rule_groups_per_tenant":220,
"ruler_remote_write_sigv4_config": {
"region": "us-east-1"
},
"per_tenant_override_config": "",
"per_tenant_override_period": "230s"
}
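
Because the new `RulerRemoteWriteSigV4Config` limit is a pointer, a missing `ruler_remote_write_sigv4_config` key decodes to `nil`, which is what lets the registry tell "no override" apart from an empty config. The sketch below illustrates this with the standard library's JSON decoder. The `SigV4Config` and `Limits` types are trimmed, hypothetical mirrors of the real structs, the explicit `json:"region"` tag is an assumption made for clarity, and `decodeLimits` is an illustrative helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SigV4Config is a hypothetical stand-in for sigv4.SigV4Config.
// The json tag is an assumption added for this sketch.
type SigV4Config struct {
	Region string `json:"region"`
}

// Limits mirrors only the field added by this pull request.
type Limits struct {
	RulerRemoteWriteSigV4Config *SigV4Config `json:"ruler_remote_write_sigv4_config"`
}

// decodeLimits is an illustrative helper that parses a JSON limits document.
func decodeLimits(raw string) (Limits, error) {
	var l Limits
	err := json.Unmarshal([]byte(raw), &l)
	return l, err
}

func main() {
	with, err := decodeLimits(`{"ruler_remote_write_sigv4_config": {"region": "us-east-1"}}`)
	if err != nil {
		panic(err)
	}
	// The key is present, so the pointer is set and Region is populated.
	fmt.Println(with.RulerRemoteWriteSigV4Config.Region)

	without, err := decodeLimits(`{}`)
	if err != nil {
		panic(err)
	}
	// The key is absent, so the pointer stays nil.
	fmt.Println(without.RulerRemoteWriteSigV4Config == nil)
}
```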