-
Notifications
You must be signed in to change notification settings - Fork 532
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
QueryRange use protobuf internally instead of json to reduce latency #3745
Conversation
…of json for improved performance
@@ -438,7 +439,10 @@ func (s *queryRangeSharder) generatorRequest(searchReq tempopb.QueryRangeRequest | |||
|
|||
searchReq.QueryMode = querier.QueryModeRecent | |||
|
|||
return s.toUpstreamRequest(parent.Context(), searchReq, parent, tenantID) | |||
req := s.toUpstreamRequest(parent.Context(), searchReq, parent, tenantID) | |||
req.Header.Set(api.HeaderAccept, api.HeaderAcceptProtobuf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we add this proto header in prepareRequestForQueriers()
?
this feels like a few changes away from just making this the default relationship between the queriers and frontend.
also, fantastic find! agree with the analysis this is largely cropping up due to the streaming frontend refactor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, when we make this universal that is the ideal place to add the header. And on the querier side the writeFormattedContentForRequest(w, r, resp)
is also easy to incorporate. I'm on the fence to do it in this PR, since splitting out the remaining work would be a nice opportunity to get others get involved. But happy to do it if that's not needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my .02: just do them all for consistency and simplicity
i don't disagree with your thoughts though if you decide to just merge
…3745) * Update query_range to use protobuf between frontend->querier instead of json for improved performance * changelog
What this PR does:
Similar to #3731 but for query_range. For high cardinality metrics queriers on larger clusters, the marshaling/unmarshaling of json was becoming the bottleneck in the query-frontend. Metrics queries have larger and more numerous jobs than autocomplete, so the improvement is even better. I was thinking about why this was just now being noticed, since autocomplete and query_range have existed for some time and the request/response formats haven't changed. I believe it is due to the recently added asynchronous frontend pipeline. Two changes (1) It is more efficient at issuing jobs, so the bottleneck in the frontend from unmarshaling is now readily apparent (2) the unmarshaling was moved to a single goroutine, which will be restored in #3713 however this PR is more impactful because it greatly reduces the overall amount of work done by the frontend (versus spreading it out among more cpu cores).
Here is another graph showing the improvement for the query
{ } | rate() by (resource.service.name)
of 3h time range, from ~80s to 20s (all cluster and request parameters the same).NOTE - I only did this for the rf1 read path.
Which issue(s) this PR fixes:
Fixes #
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]