Skip to content
This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

clean up tracing #1410

Merged
merged 1 commit into from
Jul 26, 2019
Merged

clean up tracing #1410

merged 1 commit into from
Jul 26, 2019

Conversation

woodsaj
Copy link
Member

@woodsaj woodsaj commented Jul 25, 2019

  • change many of the span tags to span logs. Having so many tags
    puts a lot of strain on jaeger. The span tags weren't something
    that we would every query by, so are better of being logs.
  • pass the trace context through when proxying requests to graphite (needs Update1.1.5 and forward jaeger headers graphite-mt#13 for the trace contexts to make it back to MT again)
  • set the nodatapoints tag on the root span, not the executePlan span.

- change many of the span tags to span logs. Having so many tags
  puts a lot of strain on jaeger.  The span tags weren't something
  that we would every query by, so are better of being logs.
- pass the trace context through when proxying requests to graphite.
- set the nodatapoints tag on the root span, not the executePlan span.
@woodsaj
Copy link
Member Author

woodsaj commented Jul 25, 2019

here is a nice screenshot showing a trace from a render request that used timeshift() function (forcing the request to be proxied to graphite). From the trace you can see that the request was proxied to graphite, which then made 2 subsequent render requests to MT, with the second request needed due to the timeshift.

Screenshot_2019-07-26_04-45-00

@woodsaj woodsaj merged commit df2734b into master Jul 26, 2019
@woodsaj woodsaj deleted the tracingFixes branch July 26, 2019 09:53
@Dieterbe
Copy link
Contributor

set the nodatapoints tag on the root span, not the executePlan span

Is there a particular advantage to this change ?

span.SetTag("expressions", t.Expr)
span.SetTag("from", t.From)
span.SetTag("limit", t.Limit)
span.SetTag("orgId", t.OrgId)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's be consistent and use lowercase orgid everywhere

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with consistency, but that would be to use orgId everywhere, which we are doing except in 1 place.

@Dieterbe
Copy link
Contributor

change many of the span tags to span logs. Having so many tags
puts a lot of strain on jaeger.

Interesting. Makes sense. How did you find this out / did this cause an impact ?

The span tags weren't something
that we would every query by, so are better of being logs.

Generally true, with perhaps a few exceptions such as targets / from / until which were intended (and have been used occasionally with mild success) to make it easier to find specific traces corresponding to a render request. We need a good way to go from a specific grafana request to the right trace, as long as traceid's are not available in grafana those seem useful as tags.

@woodsaj
Copy link
Member Author

woodsaj commented Jul 29, 2019

We need a good way to go from a specific grafana request to the right trace

I find that simply setting the jaeger time range to the 1 or 2 minute period, then use other tags such as orgId or nodatapoints=true, works good enough.

We need a good way to go from a specific grafana request to the right trace, as long as traceid's are not available in grafana those seem useful as tags.

TraceIds are currently available if requests are sent directly to mt-query. They are not available of tsdb-gw sends requests directly to graphite. This can be fixed by enabling traces on tsdb-gw (it needs to be updated to support all jeager config settings, eg JAEGER_AGENT_HOST), and upgrading all hm instances to use the new Graphite1.1.5 image.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants