Exemplar: Can not set partial response. #4676
If I add a flag and pass it to NewExemplarsHandler (https://github.com/thanos-io/thanos/blob/main/pkg/api/query/v1.go#L798), thanos-query runs out of memory (OOM) when queried. Maybe there are too many exemplars? Could we set a limit on their number? The pprof heap profile is here
So actually partial response in the Exemplars API works, right? @hanjm Can you help confirm this? If the flag works, then let's rename the issue to discuss the exemplars limit.
…thanos-io#4676) Signed-off-by: hanjm <hanjinming@outlook.com>
@yeya24 I found that the exemplar.partial-response flag seems to be missing, so I added it in hanjm@681608c. Partial response works in my branch.
I see. @hanjm It looks like we have this config in the struct but forgot to add a flag for it. Would you like to open a PR for it?
OK.
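A minimal sketch of wiring a config struct field to a command-line flag, using only Go's standard `flag` package. The struct, the function names, the default value, and the help text are hypothetical; only the flag name `exemplar.partial-response` comes from this thread, and Thanos registers its flags through its own CLI setup rather than through this code.

```go
package main

import (
	"flag"
	"fmt"
)

// queryConfig is a hypothetical stand-in for the struct that already
// holds the partial-response setting but has no flag bound to it.
type queryConfig struct {
	enableExemplarPartialResponse bool
}

// registerFlags binds the struct field to a flag so users can set it.
// The default of true is an assumption made for this sketch.
func registerFlags(fs *flag.FlagSet, cfg *queryConfig) {
	fs.BoolVar(&cfg.enableExemplarPartialResponse, "exemplar.partial-response", true,
		"Enable partial response for the exemplars endpoint (illustrative help text).")
}

// parseConfig parses args into a fresh config.
func parseConfig(args []string) (queryConfig, error) {
	var cfg queryConfig
	fs := flag.NewFlagSet("query", flag.ContinueOnError)
	registerFlags(fs, &cfg)
	err := fs.Parse(args)
	return cfg, err
}

func main() {
	cfg, err := parseConfig([]string{"--exemplar.partial-response=false"})
	fmt.Println(cfg.enableExemplarPartialResponse, err)
}
```

The parsed value would then be handed to whatever constructs the handler, so that the setting in the struct is no longer fixed at its zero value.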
I am investigating it. I added a debug log to (*exemplarsServer).Send https://github.com/thanos-io/thanos/blob/main/pkg/exemplars/exemplars.go#L42:
func (srv *exemplarsServer) Send(res *exemplarspb.ExemplarsResponse) error {
	if res.GetWarning() != "" {
		err := errors.New(res.GetWarning())
		// Debug: log the size of this warning and how many warnings have accumulated so far.
		log.Printf("err message size: errors:%d, srv.warnings:%d, res.GetWarning():%d",
			len(err.Error()),
			len(srv.warnings),
			len(res.GetWarning()))
		srv.warnings = append(srv.warnings, err)
		return nil
	}
With this in place, it prints a very large number of log lines.
Then I printed the first ten messages in srv.warnings once the slice reached 100 entries:
func (srv *exemplarsServer) Send(res *exemplarspb.ExemplarsResponse) error {
	if res.GetWarning() != "" {
		err := errors.New(res.GetWarning())
		// Debug: log only once, when exactly 100 warnings have accumulated.
		if len(srv.warnings) == 100 {
			log.Printf("err message size: errors:%d, srv.warnings:%d, res.GetWarning():%d, srv.warnings:%+v",
				len(err.Error()),
				len(srv.warnings),
				len(res.GetWarning()),
				srv.warnings[:10],
			)
		}
		srv.warnings = append(srv.warnings, err)
		return nil
	}
This prints a single log line containing the first ten warnings.
It seems better to keep only one.
Hi. The version of Thanos has been rolled back to 0.22. In addition, in the current version, --query-frontend.downstream-url uses a load-balancing mechanism, which is weighted round-robin. No matter how the client's IP changes, the backend upstream is always routed to the same IP. thanos-query-frontend host resource
I found the root cause: if the Store API target does not implement the Exemplars API, receiving from it fails with an Unimplemented error, and the receive loop never exits:
https://github.com/thanos-io/thanos/blob/main/pkg/exemplars/proxy.go#L195
…etsStreamStream).receive infinite loop when target response Unimplemented error (thanos-io#4676) Signed-off-by: hanjm <hanjinming@outlook.com>
…/(*targetsStreamStream).receive infinite loop when target response Unimplemented error (thanos-io#4676) Signed-off-by: hanjm <hanjinming@outlook.com>
@MrYueQ This does not seem relevant to this issue. Please feel free to open another issue.
* Sidecar: Fix process external label on promethues v2.28+ use units.Bytes config type (#4657) * Sidecar: Fix process external label when promethues v2.28+ use units.Bytes config type (#4656) Signed-off-by: hanjm <hanjinming@outlook.com> * E2E: Upgrade prometheus image version Signed-off-by: hanjm <hanjinming@outlook.com> * upgrade Prometheus dependency version to v2.30.0 (#4669) * upgrade Prometheus dependency version to v2.30.0 Signed-off-by: Ben Ye <ben.ye@bytedance.com> * fix unit test Signed-off-by: Ben Ye <ben.ye@bytedance.com> # Conflicts: # go.mod # go.sum * Query: Fix (*exemplarsStream).receive/(*metricMetadataStream).receive/(*targetsStreamStream).receive infinite loop when target response Unimplemented error (#4676) (#4681) Signed-off-by: hanjm <hanjinming@outlook.com> * Cut 0.23.0-rc.1 Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> Co-authored-by: Jimmiehan <hanjinming@outlook.com> Co-authored-by: Ben Ye <yb532204897@gmail.com>
* Cut release 0.23.0-rc.0 (#4625) Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Updated version. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Cut 0.23.0-rc.1 and cherry picked 3 critical commits from main. (#4684) * Sidecar: Fix process external label on promethues v2.28+ use units.Bytes config type (#4657) * Sidecar: Fix process external label when promethues v2.28+ use units.Bytes config type (#4656) Signed-off-by: hanjm <hanjinming@outlook.com> * E2E: Upgrade prometheus image version Signed-off-by: hanjm <hanjinming@outlook.com> * upgrade Prometheus dependency version to v2.30.0 (#4669) * upgrade Prometheus dependency version to v2.30.0 Signed-off-by: Ben Ye <ben.ye@bytedance.com> * fix unit test Signed-off-by: Ben Ye <ben.ye@bytedance.com> # Conflicts: # go.mod # go.sum * Query: Fix (*exemplarsStream).receive/(*metricMetadataStream).receive/(*targetsStreamStream).receive infinite loop when target response Unimplemented error (#4676) (#4681) Signed-off-by: hanjm <hanjinming@outlook.com> * Cut 0.23.0-rc.1 Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> Co-authored-by: Jimmiehan <hanjinming@outlook.com> Co-authored-by: Ben Ye <yb532204897@gmail.com> * Cut 0.23.0 release. (#4697) * Endpointset: Do not use info client to obtain metadata (for now) (#4714) * Do not use info client to obtain metadata Signed-off-by: Matej Gera <matejgera@gmail.com> * Update CHANGELOG. Signed-off-by: Matej Gera <matejgera@gmail.com> * Comment out client.info usage Signed-off-by: Matej Gera <matejgera@gmail.com> * Fix lint error Signed-off-by: Matej Gera <matejgera@gmail.com> * Cutting 0.23.1 (#4718) Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Moved tutorials Thanos versions to 0.23.1 Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> * Added volounteer for shepharding, fixed VERSION. 
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com> Co-authored-by: Jimmiehan <hanjinming@outlook.com> Co-authored-by: Ben Ye <yb532204897@gmail.com> Co-authored-by: Matej Gera <38492574+matej-g@users.noreply.github.com>
…/(*targetsStreamStream).receive infinite loop when target response Unimplemented error (thanos-io#4676) (thanos-io#4681) Signed-off-by: hanjm <hanjinming@outlook.com>
This cherry-picks upstream patch that fixes the bug Query: Fix (*exemplarsStream).receive/(*metricMetadataStream).receive/(*targetsStreamStream).receive infinite loop when target response Unimplemented error (thanos-io#4676) (thanos-io#4681) See: - thanos-io#4676 (comment) - thanos-io#4681 Signed-off-by: hanjm <hanjinming@outlook.com> (cherry picked from commit 2d4d140) Signed-off-by: Sunil Thaha <3005132+sthaha@users.noreply.github.com>
Thanos, Prometheus and Golang version used:
Object Storage Provider:
COS
What happened:
query_exemplar response error:
error: "retrieving exemplars: proxy Exemplars: receiving exemplars from exemplars client &{0xc000b2a000}: rpc error: code = Unimplemented desc = unknown service thanos.Exemplars"
What you expected to happen:
partial response with warning.
How to reproduce it (as minimally and precisely as possible):
Thanos Query 0.23 beta plus an old sidecar version that does not support exemplars
Full logs to relevant components:
Anything else we need to know:
A flag to control exemplar partial response seems to be missing.