Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query Frontend: new service for queuing and retrying queries. #910

Merged
merged 10 commits into from
Aug 16, 2018

Conversation

tomwilkie
Copy link
Contributor

@tomwilkie tomwilkie commented Jul 30, 2018

A new job which accepts HTTP requests and queues them by tenant ID, retrying them on errors. Aim is to allow for the occasional large query to cause a querier OOM without failing requests, allowing us to over-provision querier parallelism. Also, prevent multiple large request from blocking a single querier by distributing them FIFO across all queriers. And finally, prevent a single tenant from DoSing other tenants by fairly scheduling queries between tenants.

The query frontend job accepts gRPC streaming requests from the queriers, which then "pull" requests from the frontend. For HA it is recommended you run multiple frontends - the queriers will connect to (and pull requests from) both. To get the benefit of the fair scheduling, it is recommended you run fewer frontends than queriers - two should suffice.

See https://docs.google.com/document/d/1lsvSkv0tiAMPQv-V8vI2LZ8f4i9JuTRsuPI_i-XcAqY for more details design discussion. In the future, query splitting, query alignment and query results caching will be added to the frontend.

Signed-off-by: Tom Wilkie tom.wilkie@gmail.com

@tomwilkie tomwilkie force-pushed the query-frontend branch 2 times, most recently from 402946c to e65cef1 Compare July 30, 2018 16:28
@tomwilkie tomwilkie changed the title [WIP] Query Frontend: new service for queuing and retrying queries. Query Frontend: new service for queuing and retrying queries. Jul 31, 2018
)

var (
queueDutation = promauto.NewHistogram(prometheus.HistogramOpts{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo? I'm assuming queueDuration, but I've seen stranger things.. :-)

Help: "Number of queries in the queue.",
})

errServerClosing = httpgrpc.Errorf(http.StatusTeapot, "server closing down")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice status code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hoping for a official "closing down" status code, but I couldn't find anything...

@csmarchbanks
Copy link
Contributor

Overall this looks good to me. Would you be willing to add a doc that describes the two options between running just a querier, and running a front-end + queriers are? Mostly what is in your PR description, but somewhere that is easier to find in the future.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
@tomwilkie
Copy link
Contributor Author

@csmarchbanks PTAL, added some documentation.


### Query Frontend

The query frontend is an optional job which accepts HTTP requests and queues them by tenant ID, retrying them on errors. This allow for the occasional large query which would otherwise cause a querier OOM, allowing us to over-provision querier parallelism. Also, it prevents multiple large from being convoyed on a single querier by distributing them FIFO across all queriers. And finally, it prevent a single tenant from DoSing other tenants by fairly scheduling queries between tenants.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are missing a word after "prevents multiple large"

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
@tomwilkie
Copy link
Contributor Author

@csmarchbanks thanks, PTAL

Copy link
Contributor

@csmarchbanks csmarchbanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like possible flaky test, otherwise :shipit:

@tomwilkie tomwilkie merged commit 57b71bb into cortexproject:master Aug 16, 2018
@tomwilkie tomwilkie deleted the query-frontend branch August 16, 2018 16:35
@@ -66,7 +66,7 @@ func (s Server) Handle(ctx context.Context, r *httpgrpc.HTTPRequest) (*httpgrpc.
if recorder.Code/100 == 5 {
return nil, httpgrpc.ErrorFromHTTPResponse(resp)
}
return resp, err
return resp, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing should be changed in vendor without going via dep.

bboreham added a commit to weaveworks/common that referenced this pull request Aug 23, 2018
* Refactor: extract functions HTTPRequest(), WriteResponse(), WriteError()

* Regenerate httpgrpc.pb.go with protoc-gen-gogo

from https://github.com/gogo/protobuf

Change taken from
cortexproject/cortex@f807690
where no explanation was given.
leth pushed a commit to weaveworks/scope that referenced this pull request Aug 28, 2018
Changes:
- Default diff-printer to hide details (#103) - da4c3ff
- Improve logging of http errors (#115) - f5a1710
- Update httpgrpc to match cortexproject/cortex#910 (#117) - 80ff076
- Expose the HTTP server from the server struct. (#118) - 1a7a6b3
- Add HTTP tracing middleware (#119) - d442d08
tomwilkie added a commit to grafana/cortex that referenced this pull request Sep 4, 2018
Was left over from the mistakes in cortexproject#910.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
charleskorn pushed a commit to grafana/dskit that referenced this pull request Aug 3, 2023
* Refactor: extract functions HTTPRequest(), WriteResponse(), WriteError()

* Regenerate httpgrpc.pb.go with protoc-gen-gogo

from https://github.com/gogo/protobuf

Change taken from
cortexproject/cortex@f807690
where no explanation was given.
charleskorn added a commit to grafana/dskit that referenced this pull request Aug 4, 2023
* Exempt 503 (Service Unavailable) from error logging (#97)

Part of cortexproject/cortex#810

* Add gRPC streaming middleware. (#95)

* Add gRPC streaming middleware.

Adds a series of gRPC stream middleware for logging, user header propagation and tracing. This is laying groundwork for using gRPC streaming between ingesters and queriers in Cortex.

* Update the vendored versions.
* Pin kuberesolver because the API has changed.

* Always put code under github.com/weaveworks/common;

Dependancies between packages won't work if the code is checked out in github.com/tomwilkie, for instance.  Also, common is a very common name for a repo, so its not uncommon for the repo to be called something like weaveworks-common.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>

* When trace reporting is enabled, don't enable trace sampling unless asked (#102)

* Don't enable trace sampling by default

* Don't export new unused member, simplify docstring

* Abstract logging for the middleware (#96)

* Move dedupe code to dedupe.go

* Add unified logging interface.

* Adapt middleware/, user/, signals/ and server/ to new logging interfaces.

Allow user to specify logging implementation in server, if non is specified default to logrus.
As part of this we need log level as server config, and for convenience we expose the server logger.

* Update vendoring

* Expose the various level types.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>

* Don't shutdown the gRPC server the minute a signal is received. (#99)

gRPC behaviour should be the same as HTTP: Run() exits when a signal is received, and Stop() stops the handling of requests.

And as we no longer call GracefulShutdown in Run, call it in Stop - one should always try to shutdown gracefully.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>

* Add user and org ID to span if known (#100)

* Propagate tracing context over httpgrpc (#105)

* Propagate tracing context over httpgrpc

* Log warnings for failed trace inject/extract

* Don't use full URL as metric label on unmatched path (#108)

Random people on the Internet can send in arbitrary paths, so collapse
them all into "other" when reporting metrics.

* Raise default server timeouts from 5s to 30s (#109)

Primary motivation is to allow a standard Go profiling request to
complete, for which I only need the write timeout, but I raised the
other 5-second timeouts too since it isn't that unusual to have
operations or data uploads take longer.

* Return errors when http/grpc servers exit. (#92)

* Return errors when http/grpc servers exit.

There was a case where the HTTP server died but the process exited. We
need to exit when the servers exit.

Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>

* Expose jaeger metrics (#114)

Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>

* Improve logging of http errors (#115)

* Improve http request logging

When logging an http request, e.g. on 500 error, put all the headers
and response on one line.

Also strip cookies and csrf tokens on security grounds.

* Add a test to generate an http error to check formatting

* Update httpgrpc to match cortexproject/cortex#910 (#117)

* Refactor: extract functions HTTPRequest(), WriteResponse(), WriteError()

* Regenerate httpgrpc.pb.go with protoc-gen-gogo

from https://github.com/gogo/protobuf

Change taken from
cortexproject/cortex@f807690
where no explanation was given.

* Expose the HTTP server from the server struct. (#118)

* Expose the HTTP server from the server struct.

This allows users to inject HTTP requests and have the logged, instrumented and traced correctly.

* Fix #120: buffer the errChan, as one of the servers could return an error before Run starts listening on errChan.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>

* Add HTTP tracing middleware (#119)

* Print traceID with request (#123)

Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>

* Also set global logging when initializing logrus (#122)

* Refactor

- s/logusEntry/logrusEntry

* Also set global logging when initializing logrus

This PR sets the global logging instance when calling `logging.Setup(level)`.
Previously, only the standard logrus logger was set up and if anyone used
the new funcs such as `logging.Info()` it was discarded.

* Don't trace http requests when using httpgrpc. (#124)

* Don't trace http requests when using httpgrpc.

* Demonstrate how the same can be acheived with middleware.Tracer.

* Print logrus output in logfmt format (#121)

* Print logrus output in logfmt format

This changes the way logrous outputs log messages in case there is no
TTY attached.

Changes from

        ERRO: 2018/08/23 20:28:10.075952 Sample log message extra=foo

to

        time="2018-08-23T20:28:10Z" level=info msg="Sample log message" extra="foo"

which follows the `logfmt` key-value format by also providing `time`,
`level`, and `msg` in adition to the extra fields.

The PR removes the custom `textFormatter` to have the default
`logrus.TextFormatter` in place which takes care of detecting whether a
terminal is attached or not and switches between text and logfmt output
automatically.
It appears the custom formatter was to prevent differences in rendering
depending on color/non-color output which I don't necessarily see as an
issue.

* dep ensure to fix tests

* Update more golang/protobuf to gogo/protobuf (#127)

* Update 'dep' to 0.5.0

* Update more golang/protobuf to gogo/protobuf

* add constraints for downstream users

* Label http request handler tracing spans with the matched route name (#126)

* Pass options to tracing middleware

Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>

* Use the route matcher instead

Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>

* Fix dropped namespace in url parsing (#128)

* Update grpc-opentracing to latest version at new home (#129)

* go-grpc-middleware has moved (#130)

* Update to latest kuberesolver (#131)

* Update to latest kuberesolver and use a load balancing policy

Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>

* Log which ports the server is listening on. (#143)

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>

* server: add PathPrefix to config for all routes

This can be used to serve an API from a prefix (e.g. /prometheus/)
without having to re-write every HTTP Handler.

* Allow server.Config to be unmarshaled from YAML.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>

* Dont flag context.Canceled as an error in tracing

Cancellations are always caused by something else, so having them show
up as errors in the tracing UI is distracting.

* Allow overriding max message sizes and concurrent streams in the server.

NB This use the gRPC defaults as the flag defaults.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>

* exclude Authorization headers from log messages.

* server: use -server.path-prefix for metrics and debug endpoints

Previously calling .Subrouter() would overwrite the instrumentation
handlers and they'd be ignored when PathPrefix was specified. This
meant one could never get back metrics or debug endpoints.

* Skip logging context cancelled warning msgs

Context cancellations are controlled by clients and logging this message can
cause more more confusion as mentioned here:
cortexproject/cortex#1279

Signed-off-by: Neeraj Poddar <neeraj@aspenmesh.io>

* Allow log levels to be marshalled to YAML.  Test it works.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>

* Fix lint warning

* Expose RegisterInstrumentation setting

Fix #156

Signed-off-by: Xiang Dai <764524258@qq.com>

* Update prometheus/client_golang and prometheus/procfs

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Allow user to specify HTTP and gRPC bind addresses (not just ports.)

This allows us to use 'localhost' in tests, and prevents an 'Allow connections from...' dialog on MacOS when running unit tests.

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>

* Change naming from host to address

Signed-off-by: Thore Kruess <thore@kruess.xyz>

* Revert changes to server_test.go

* Extend tests to check server works with flag defaults

(with exception of http port, which we override for the convenience of
people running the tests as an ordinary user who can't bind to port 80)

* limit the number of connections to server

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

* Tell Go runtime we have stopped listening to signals

If we exit the handler loop, e.g. after receiving SIGTERM, then we
don't want Go to attempt further delivery of signals.

* Fix kuberesolver URLs to have three slashes

* Add more test cases prompted by review feedback

* Trace Websocket requests

Update the OpenTracing library to a version which supports Websockets,
and remove the workaround for when it didn't.

* Update gogo proto to 1.3.0

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* add flags to configure keep alive configs (#174)

* add flags & yaml to configure keep alive configs

Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>

* Fixed gRPC server keepalive flags registration

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Refactor: extract common error decoding

* Re-do protogen for fake_server with current gRPC

* Extend server tests with canceled gRPC functions

* Extend cancelation checking to include gRPC errors

* Extend error instrumentation tests to http

* Instrument canceled gRPC calls with a unique label

* Return error in configuring jaeger

* add log format option

* Add tls support to http server

Signed-off-by: Annanay <annanayagarwal@gmail.com>

* Use require package in tests, correct TLS fields in server config

Signed-off-by: Annanay <annanayagarwal@gmail.com>

* Load certs only if config options are specified

Signed-off-by: Annanay <annanayagarwal@gmail.com>

* Add metrics namespace to server

Signed-off-by: Annanay <annanayagarwal@gmail.com>

* use same system as level for format

* Add tls support to grpc server, add dummy certs

Signed-off-by: Annanay <annanayagarwal@gmail.com>

* Addressed @tomwilkie's review comments

Signed-off-by: Annanay <annanayagarwal@gmail.com>

* Fix broken tests

Signed-off-by: Annanay <annanayagarwal@gmail.com>

* Add TODO note about using copied function

Signed-off-by: Annanay <annanayagarwal@gmail.com>

* Use ConfigToTLSConfig util function from node_exporter

Signed-off-by: Annanay <annanayagarwal@gmail.com>

* Log errors when write fails, not 200

See discussion here: cortexproject/cortex#2483 (comment)

Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>

* Address @bboreham's comments

Signed-off-by: Annanay <annanayagarwal@gmail.com>

* fix function documentation

* Move shebang to first line, add commit hash to link

Signed-off-by: Annanay <annanayagarwal@gmail.com>

* Add test and warn only if not context cancelled

Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>

* Added option to disable signal handling by Server. (#191)

* Make signal-handling by Server configurable.

* Fixed unrelated lint warnings.

* Log X-Forwarded-For for every request

Signed-off-by: Michel Hollands <michel.hollands@grafana.com>

* Address review comments and add Forwarded and X-Real-IP support

Signed-off-by: Michel Hollands <michel.hollands@grafana.com>

* Remove making standard middleware optional

Signed-off-by: Michel Hollands <michel.hollands@grafana.com>

* Remove unused fields as well

Signed-off-by: Michel Hollands <michel.hollands@grafana.com>

* Add option to turn logging of source IPs on

Signed-off-by: Michel Hollands <michel.hollands@grafana.com>

* Add sourceIPs tag in tracing

Signed-off-by: Michel Hollands <michel.hollands@grafana.com>

* Add custom regex for extracting IP from header

Signed-off-by: Michel Hollands <michel.hollands@grafana.com>

* Rename file so it's clearer what it does

Signed-off-by: Michel Hollands <michel.hollands@grafana.com>

* Use better names

Signed-off-by: Michel Hollands <michel.hollands@grafana.com>

* Do no use log prefix in config var name

Signed-off-by: Michel Hollands <michel.hollands@grafana.com>

* Address review comments

Signed-off-by: Michel Hollands <michel.hollands@grafana.com>

* Use same way of access struct

Signed-off-by: Michel Hollands <michel.hollands@grafana.com>

* Address review comments

Signed-off-by: Michel Hollands <michel.hollands@grafana.com>

* Replace interceptor with httpsnoop library.

This library supports additional interfaces other than http.Hijacker.

Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>

* Metrics for HTTP and gRPC message sizes and inflight requests.

Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>

* Added metrics test.

Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>

* s/Kuberneretes/Kubernetes/

Signed-off-by: Jack Baldry <jack.baldry@grafana.com>

* Make HTTP middleware optional (#194)

* Make HTTP middleware optional
* Allow the Router to be set by caller

Signed-off-by: Michel Hollands <michel.hollands@grafana.com>

* Add exemplar support to middleware

Signed-off-by: beorn7 <beorn@grafana.com>

* tracing: Allow specifying JAEGER_ENDPOINT instead of sampling server or local agent port

closes #203

* Expose grpc keepalive enforcement policy options.

Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>

* s/faviourite/preferred

Changed the wording to avoid US/UK English concerns.

* Set the content-length of httpgrpc request.

http.NewRequest can't set it while using a `io.NopCloser`.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Replace `ioutil.NopCloser` with a custom one.

This way we can easily get direct access to the underlaying `bytes.Buffer`. This means we don't need to copy the data anymore.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>

* Fix panic in server i13n matching NotFoundHandler

When router can't find the route but it has the NotFoundHandler set, the
RouteMatch doesn't have the Route set but it has MatchErr set to
ErrNotFound.

Added a check for RouteMatch.Route to be not nil before accessing it,
and added a special "notfound" route name to be used when
NotFoundHandler is used.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Unindent getRouteName by using early returns

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* support custom registerer

* Add metric "tcp_connections" for number of TCP connections to server.

Counts the number of accepted connections and the number of closed
connections, by intercepting the listener for each port (http/grpc).

Signed-off-by: Steve Simpson <steve.simpson@grafana.com>

* add tcpv4 support

* tracing: add options to NewFromEnv

Signed-off-by: Dave Henderson <dhenderson@gmail.com>

* Refactor: move ExtractTraceID functions to tracing package

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Add exemplars to 'instrument' package

Whenever a duration is observed and we have a trace ID, store it
as a Prometheus exemplar.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* refactor: re-use ObserveWithExemplar function

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Add exemplars to gRPC instrumentation

We need to call the tracing interceptor before the metric interceptor
so that the latter can extract the trace ID.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Allow blank network in server.New

There are a lot of tests which create a `server.Config{}` and don't
populate the network fields; allow this by setting `DefaultNetwork` if
blank.

Also pick up a bugfix where `HTTPListenNetwork` was used for gRPC.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* gRPC stream middleware: move metrics after tracing

So that we can pass the trace ID as an exemplar.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Migrating from go-kit/kit/log to the slimmer go-kit/log

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>

* Add benchmark for Debugf()

Extract function to wrap with level, caller, timestamp.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Filter by log level before anything else

This saves a lot of work walking the stack to find the caller, and a bit
more to format the timestamp.

Recommended at go-kit/log#14 (comment)

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Avoid Sprintf when log line is filtered out

Defer the Sprintf until after the level check, by using an auxilliary struct

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Revert "logging: Add ability to deduplicate log messages (#48)"

This reverts commit 0aecff0.

It has never de-duplicated a single line for us in production.

* fix master build

- run `go mod tidy`
- change logging test to use `go-kit/log`

* replace node_exporter https package with exporter-toolkit

The node_exporter/https package has been moved to exporter-toolkit as of
v1.1.0. weaveworks/common was still using this package, which was
preventing downstream importers (i.e., grafana/agent) from updating
their dependency on node_exporter.

This change also required bumping kuberesolver to v2.4.0 (the next
version after v2.1.0) which uses a newer version of the gRPC library
where type names have changed.

kuberesolver diff: sercand/kuberesolver@v2.1.0...v2.4.0

* Added tcp_connections_limit metric

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Add a tag with the user-agent to traces

This is useful to understand which kind of client is behaving in a
certain way.

Signed-off-by: Christian Simon <simon@swine.de>

* undo whitespace changes to server/server_test.go

* Merged both tag setters into a single MWSpanObserver

Signed-off-by: Christian Simon <simon@swine.de>

* middleware: Add gRPC instrumentation utilities from dskit

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* middleware: Rename client instrumentation functions

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* middleware: Move client instrumentation functions to grpc_instrumentation.go

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* middleware: Slight cleanup

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* add flag to log requests at info level

* fix comment for LogRequestsAtInfoLevel

Co-authored-by: Victor Cinaglia <victorcinaglia@gmail.com>

* add yaml tag & add LogRequestsAtInfoLevel to RegisterFlags

* add method to set log level for logging with request

* Add user and org labels to observed exemplars

Exemplars provide information about traces, and traces usually reference
to the user that served the request, however, searching the user in the
trace is usually a tedious task, and it would be nice to be able to
overwiew whether the slow requests (exemplars) from your panel
corresponds to the same user.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Revert "add method to set log level for logging with request"

This reverts commit 3422262.

* call logWithRequest once

* middleware: Simplify as suggested by @pracucci

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* fix typo in registering LogRequestAtInfoLevel flag

* Removed debug log on successfuly gRPC requests from GRPCServerLog

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Make debug logging on successful gRPC requests from GRPCServerLog optional

* Update server/server.go

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Apply changes according to code review

* server: Remove advanced TLS config parameters

Remove advanced TLS config parameters stemming from
github.com/prometheus/exporter-toolkit/web, that were introduced in commit
56b8fa7. Motivation for their removal being
that users would most likely not want to change them, and they add corresponding
configuration parameters to the Grafana Mimir project, that we don't want. We
also think they're not interesting to the Grafana Tempo and Loki projects.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

* Revert "Add user and org labels to observed exemplars"

* server: Expose `http` and `grpc` listen addresses.

The main rationale is for "choosing" random unbinded port for testing.

It's one of the Go's idiom to pass port number `0` to enforce Go's net's package to bind to some random port that is free.

```
func main() {
    httpListener, err := net.Listen("tcp", "0.0.0.0:0")
    if err != nil {
        panic(err)
    }

    grpcListener, err := net.Listen("tcp", "0.0.0.0:0")
    if err != nil {
        panic(err)
    }

    h := http.Server{}
    go func() {
        log.Println("http serving at", httpListener.Addr())
        h.Serve(httpListener)
    }()

    g := grpc.Server{}
    go func() {
        log.Println("grpc serving at", grpcListener.Addr())
        g.Serve(grpcListener)
    }()

    time.Sleep(20 * time.Second)
}
```

And that will bind random "unassigned" port.

```
2022/08/08 16:51:49 grpc serving at [::]:35359
2022/08/08 16:51:49 http serving at [::]:44671
```

The problem is, currently there is no way to know what those ports are.

This PR exposes those addresses so that, we can test server easily and pass on the listen addresses to all it's clients.
Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>

* Bugfix server tcp metrics to use the configured registry or the default Prometheus registry

Server TCP related metrics tcp_connections and tcp_connections_limit are currently always registered with the default prometheus registry: https://github.com/prometheus/client_golang/blob/main/prometheus/registry.go#L57
As a result, it can cause an attempt to register duplicate metrics collector failing with a panic.

This commit updates these metrics to use the configured Registry, if any, otherwise, the default Prometheus Registry.
log and gatherer initializations were also moved for consistency.

* Fix incorrect import of obsolete github.com/go-kit/kit/log package

Signed-off-by: Dave Henderson <dhenderson@gmail.com>

* httpgrpc/server: Update NewClient to not use WithBalancerName

On projects consuming this module as a library, gRPC might be set at
newer versions where the deprecated #WithBalancerName has been removed
already. Given that the version used by this module already offers a
forward-compatible method, this commit uses that instead of the
deprecated #WithBalancerName.

This is a simplified version of #240, without touching the gRPC versions.

Fixes #239

Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>

* Fix 'make protos'

Use a 3rd-party image `namely/protoc:1.22_1`, which generates identical
output for two of the protos in this repo.

The third one, httpgrpc, I have not managed to match exactly, but the
result is very close.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Implement `http.Flusher` interface on Log middleware (#257)

This adds `Flush()` to the `middleware.badResponseLoggingWriter`, making it implement `http.Flusher` if the wrapped `http.ResponseWriter` does.

* Allow config of TLS cipher suites and min version (#256)

* Server: allow config of TLS cipher suites and min version

Add a single parameter for each, not split across HTTP and gRPC.

Required change upstream - prometheus/exporter-toolkit#110.

Downstream projects rely on CLI parameters to generate docstrings,
so we add `--server.tls-cipher-suites` and `--server.tls-min-version`.
Both CLI and yaml require comma-separated lists of cipher suites,
which is different to the yaml array format supported by prometheus/exporter-toolkit.

The names accepted are from Go, listed here: https://pkg.go.dev/crypto/tls#pkg-constants

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Tweak TLS server option help text

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Reproducable httpgrpc.pb.go build

Use gogo protobuf options explicitly to generate the same Equal, String,
GoString etc functions in the protobug implementation.

Add tools.go to download the gogo.proto dependency.
Add vendor directory as include for protoc to load the dependency.
Update protoc version otherwise there is a diff in the generated files,
most notably
const _ = proto.GoGoProtoPackageIsVersion3
is replaced with
const _ = proto.GoGoProtoPackageIsVersion2

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Make gogo.proto import vendoring agnostic

prometheus/alertmanager imports gogo.proto as gogoproto/gogo.proto
This is because it does not use vendoring and needs to look up gogoproto
in the module cache, which results in a path like
/home/user/go/pkg/mod/github.com/gogo/protobuf@v1.3.2
To avoid having @v1.3.2 in the proto definition it then sets this
as an include path and import only gogoproto/gogo.proto.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Updated prometheus/exporter-toolkit to v0.8.1

* Fixed tests

* Use ChainInterceptor from upstream grpc

And eliminate dependency on `grpc-ecosystem/go-grpc-middleware`.
Functionality is identical; the feature was added upstream in v1.21 for
client side and v1.28 for server side.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Small testcode lint fixes (#269)

* Use built in string conversion of recoder.Body
* Defer close after error check

VS Code complains otherwise.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* server test: don't compare error for exact equality

DeepEqual is brittle against changes in implementation.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* server: remove unused fields from Client struct

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* replace deprecated 'grpc.WithInsecure'

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* middleware: fix lint complaint about buf.String

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* miscelaneous lint fixes

Checking errors, etc.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* server: check error before closing connection

lint was complaining

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Log middleware now accepts a list of headers to exclude

Previously the log middleware would exclude printing only the headers
"Cookie", "Authorization" and "X-Csrf-Token". I believe the users of
this library may want to ommit more headers than those.

To do so I've changed how the middleware is initialize to optionally add
more headers to the list.

Also I've added a unit test to make sure the behavior works as expected.

Also I've added an optional configuration parameter to the server to set
LogRequestHeaders from there. This parameter was already an option in
the log middleware but was not exposed in the server.

Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>

* Enable native histograms for the request_duration_seconds histogram

This is an opt-in via setting Config.MetricsNativeHistogramFactor so
that nobody will expose native histograms by accident. Even if native
histograms are enabled, the conventional histograms are still
maintained and presented as before to scrapers incapabale of scraping
native histograms.

This commit also includes an update of prometheus/client_golang to
v1.14.0 as that is the earliest release supporting native histograms.

Native histograms were tested previously in the sparse-histograms
branch without any problems. This commit obsoletes the
sparse-histograms branch.

The histograms request_message_bytes and response_message_bytes are
left as conventional histograms for now. Maybe they would work well as
native histograms with a bucket factor of 2, as high resolution is not
the aim here. But we can decide about that once we have gathered more
experience with the native histograms for request_duration_seconds.

The settings for limiting the bucket count
(i.e. NativeHistogramMaxBucketNumber = 100 and
NativeHistogramMinResetDuration = 1h) are both conservative and most
likely good enough for the expected use cases. We can make them
configurable later, if users feel the need. The idea here is to not
add too many knobs prematurely and thereby avoid confusion.

Signed-off-by: beorn7 <beorn@grafana.com>

* update grpc

* Add Unwrap method to http.ResponseWriter implementations. (#283)

Also update httpsnoop to version 1.0.3, which supports Unwrap method as well.

* Add description to command line flags (#287)

Add description to log-request command line flags

* Add support to route both GRPC and HTTP over the HTTP server (#288)

* add support for grpc on the http port

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Make DisableRequestSuccessLog configurable. (#284)

- NewLogMiddleware is a public method. Adding a new parameter would make this PR a breaking change one.
- However, the behavior is the same: whatever is configured for the existing DisableRequestSuccessLog,
it will be used by the log middleware.
- Test option to not log successful requests.

* Allow users to manually register metrics

* Clean up some linter warnings

* grpc: use errors.Is to check if error is Canceled

This improves behaviour when one error wraps another.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Make gRPC logging optional via a custom interface (#299)

* middleware: add OptionalLogging interface with method ShouldLog

E.g. if the error is caused by overload, then we don't want to log it
because that uses more resource.

* Add test for gRPC logging, patterned after the one for http logging.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Add provenance and license comments to files migrated from weaveworks/common, update package names

* Update imports to match new package paths

* Fix linting issues.

* Fix package paths in protobuf descriptors and regenerate with gogo/protobuf

* Fix linting issues in migrated code.

* Remove dependency on deprecated package

* Replace use of `golang.org/x/net/context` with `context`

* Replace use of `prometheus` package with `promauto` where possible

* Add changelog entry.

---------

Signed-off-by: Tom Wilkie <tom.wilkie@gmail.com>
Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
Signed-off-by: Neeraj Poddar <neeraj@aspenmesh.io>
Signed-off-by: Xiang Dai <764524258@qq.com>
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: Thore Kruess <thore@kruess.xyz>
Signed-off-by: Jacob Lisi <jacob.t.lisi@gmail.com>
Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Michel Hollands <michel.hollands@grafana.com>
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Signed-off-by: Jack Baldry <jack.baldry@grafana.com>
Signed-off-by: beorn7 <beorn@grafana.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Steve Simpson <steve.simpson@grafana.com>
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Christian Simon <simon@swine.de>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
Co-authored-by: Julius Volz <julius.volz@gmail.com>
Co-authored-by: Tom Wilkie <tomwilkie@users.noreply.github.com>
Co-authored-by: Marcus Cobden <marcus@marcuscobden.co.uk>
Co-authored-by: Bryan Boreham <bjboreham@gmail.com>
Co-authored-by: Goutham Veeramachaneni <gouthamve+github@gmail.com>
Co-authored-by: Roland Schilter <roli@schilter.me>
Co-authored-by: Adam Shannon <adamkshannon@gmail.com>
Co-authored-by: Tom Wilkie <tom.wilkie@gmail.com>
Co-authored-by: Anthony Woods <awoods@grafana.com>
Co-authored-by: Neeraj Poddar <neeraj@aspenmesh.io>
Co-authored-by: Bryan Boreham <bryan@weave.works>
Co-authored-by: Xiang Dai <764524258@qq.com>
Co-authored-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Co-authored-by: Thore <thore@kruess.xyz>
Co-authored-by: Jacob Lisi <jacob.t.lisi@gmail.com>
Co-authored-by: Cyril Tovena <cyril.tovena@gmail.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Aditya C S <aditya.gnu@gmail.com>
Co-authored-by: Sean Liao <seankhliao@gmail.com>
Co-authored-by: Annanay <annanayagarwal@gmail.com>
Co-authored-by: Goutham Veeramachaneni <gouthamve@gmail.com>
Co-authored-by: Roli Schilter <roli@weave.works>
Co-authored-by: Peter Štibraný <pstibrany@gmail.com>
Co-authored-by: Michel Hollands <michel.hollands@grafana.com>
Co-authored-by: Peter Štibraný <peter.stibrany@grafana.com>
Co-authored-by: Jack Baldry <jack.baldry@grafana.com>
Co-authored-by: Daniel Holbach <daniel@weave.works>
Co-authored-by: Michel Hollands <42814411+MichelHollands@users.noreply.github.com>
Co-authored-by: beorn7 <beorn@grafana.com>
Co-authored-by: Chance Zibolski <czibolski@rigetti.com>
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Robert Fratto <robertfratto@gmail.com>
Co-authored-by: Steve Simpson <steve.simpson@grafana.com>
Co-authored-by: 3Xpl0it3r <shouc.wang@hotmail.com>
Co-authored-by: Dave Henderson <dhenderson@gmail.com>
Co-authored-by: Danny Kopping <danny.kopping@grafana.com>
Co-authored-by: Christian Simon <simon@swine.de>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: colin-stuart <colindonstuart@gmail.com>
Co-authored-by: Victor Cinaglia <victorcinaglia@gmail.com>
Co-authored-by: Jeanette Tan <jeanette.tan@grafana.com>
Co-authored-by: zenador <zenador@users.noreply.github.com>
Co-authored-by: Kaviraj <kavirajkanagaraj@gmail.com>
Co-authored-by: Susana Ferreira <susana.ferreira@grafana.com>
Co-authored-by: Juraci Paixão Kröhling <juraci@kroehling.de>
Co-authored-by: Stefan Prodan <stefan.prodan@gmail.com>
Co-authored-by: Justin Lei <97976793+leizor@users.noreply.github.com>
Co-authored-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: Sebastian Rabenhorst <sebastian.rabenhorst@shopify.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Co-authored-by: Jesus Vazquez <jesus.vazquez@grafana.com>
Co-authored-by: Alan Protasio <alanprot@gmail.com>
Co-authored-by: Joe Elliott <number101010@gmail.com>
Co-authored-by: Dylan Guedes <djmgguedes@gmail.com>
Co-authored-by: Piotr Gwizdala <17101802+thampiotr@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants