*: support TLS and authentication for Thanos Ruler queries #1939
97b6f73 to 51cb94e
Nice!
Some comments, but generally 👍 👍 👍
Thanks!
cmd/thanos/rule.go
Outdated
}
c.Transport = tracing.HTTPTripperware(logger, c.Transport)
queryClient, err := http_util.NewFanoutClient(logger, cfg.EndpointsConfig, c, queryProvider.Clone())
Fanout? I think the logic for rule evaluation might be different. It's fanout = broadcast approach.
Yeah, the logic is OK, just the name of the client might be confusing. It's not Fanout for Querier.
Agreed.
cmd/thanos/rule.go
Outdated
// Each Alertmanager client has a different list of targets thus each needs its own DNS provider.
am, err := alert.NewAlertmanager(logger, cfg, amProvider.Clone())
amClient, err := http_util.NewFanoutClient(logger, cfg.EndpointsConfig, c, amProvider.Clone())
We can add tracing Tripperware as well I think
done
cmd/thanos/rule.go
Outdated
// Discover and resolve query addresses.
{
	for _, c := range queryClients {
		addToGroup(g, c, dnsSDInterval)
Not consistent with the Alertmanager `addToGroup` placement. Let's decide on one approach if we can (:
Reason: I spent some time trying to understand if we missed `addToGroup` for query or not (:
I've moved the call to `addToGroup` to the loop creating the query clients.
cmd/thanos/rule.go
Outdated
@@ -99,6 +97,8 @@ func registerRule(m map[string]setupFunc, app *kingpin.Application) {
queries := cmd.Flag("query", "Addresses of statically configured query API servers (repeatable). The scheme may be prefixed with 'dns+' or 'dnssrv+' to detect query API servers through respective DNS lookups.").
	PlaceHolder("<query>").Strings()

queryConfig := extflag.RegisterPathOrContent(cmd, "query.config", "YAML file that contains query API servers configuration. See format details: https://thanos.io/components/rule.md/#configuration. If defined, it takes precedence over the '--query' and '--query.sd-files' flags.", false)

fileSDFiles := cmd.Flag("query.sd-files", "Path to file that contain addresses of query peers. The path can be a glob pattern (repeatable).").
- fileSDFiles := cmd.Flag("query.sd-files", "Path to file that contain addresses of query peers. The path can be a glob pattern (repeatable).").
+ fileSDFiles := cmd.Flag("query.sd-files", "Path to file that contains addresses of query API servers. The path can be a glob pattern (repeatable).").
done
// TODO(bwplotka): Propagate those to UI, probably requires changing rule manager code ):
level.Warn(logger).Log("warnings", strings.Join(warns, ", "), "query", q)
if err != nil {
	level.Error(logger).Log("err", err, "query", q)
Let's do `continue` instead of `else` - a bit more readable.
done
cmd/thanos/rule.go
Outdated
}
return v, nil
}
}
return nil, errors.Errorf("no query peer reachable")
- return nil, errors.Errorf("no query peer reachable")
+ return nil, errors.Errorf("no query API server reachable")
The `peer` name is obsolete: it came from gossip times. (:
done
cmd/thanos/rule.go
Outdated
}
}
return nil, errors.Errorf("no query peer reachable")
}
}

func addToGroup(g *run.Group, c *http_util.FanoutClient, interval time.Duration) {
It's not only add, it adds for DNS & file discovery, so maybe:
- func addToGroup(g *run.Group, c *http_util.FanoutClient, interval time.Duration) {
+ func addDiscoveryGroups(g *run.Group, c *http_util.FanoutClient, interval time.Duration) {
done
@@ -738,36 +719,49 @@ func queryFunc(
}

return func(ctx context.Context, q string, t time.Time) (promql.Vector, error) {
I hope we can have a proper timeout set by the caller on top of this (:
pkg/http/http.go
Outdated
}

// FanoutClient represents a client that can send requests to a cluster of HTTP-based endpoints.
type FanoutClient struct {
Again, fanout might be bad wording here, why not just `Client`?
I'm bad at naming 😄 I think I dismissed `Client` because it would be confusing with `http.Client`, but it's indeed better than the incorrect `FanoutClient`.
cmd/thanos/rule.go
Outdated
span.Finish()
for _, i := range rand.Perm(len(queriers)) {
	querier := queriers[i]
	c := promclient.NewClient(logger, querier)
Why not create this at the start of the ruler? (:
done
ee16698 to 8c177a6
Thanks for doing this and addressing all comments! Super tiny nits and we can merge IMO 👍
ce17a0c to d761ef6
Rebase needed ):
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
d761ef6 to 2be9b4b
Closes #1778
Changes
Similar to #1838, this PR adds TLS and authentication support to Thanos Ruler for the query API endpoints. The YAML configuration format is very similar to the one used for configuring Alertmanager.
Verification
I have adapted the end-to-end tests to use the new parameters. The tests already exercised static addresses and file SD.