Skip to content
This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

Tracing support #709

Merged
merged 7 commits into from
Aug 10, 2017
Merged

Tracing support #709

merged 7 commits into from
Aug 10, 2017

Conversation

Dieterbe
Copy link
Contributor

@Dieterbe Dieterbe commented Aug 8, 2017

a
b

@woodsaj
Copy link
Member

woodsaj commented Aug 9, 2017

This is looking good @Dieterbe
one question, are we going to be able to correlate the remote calls back to the originating caller. ie
for a render call that times out in GetTargetsRemote, are we going to be able to dive in and see a trace of getData on each MT instance?

@@ -37,6 +37,9 @@ func getOrg(req *http.Request, multiTenant bool) (int, error) {
}
orgStr := req.Header.Get("x-org-id")
if orgStr == "" {
orgStr = req.Header.Get("x-grafana-org-id")
}
if orgStr == "" {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@woodsaj thoughts on this? I did it because grafana uses that header now, and to make the dev stack work where grafana talks to MT directly

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am ok with this. We can also pass this through graphite-web if you like, just need to update the "REMOTE_STORE_FORWARD_HEADERS in local_settings.py

https://github.com/raintank/deployment_tools/blob/master/metrictank/image/graphite-web/local_settings.py#L135

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x-grafana-org-id is just going to be the internal id of the org within the grafana instance, it won't have any correlation with the grafana.com org id

@Dieterbe
Copy link
Contributor Author

Dieterbe commented Aug 9, 2017

one question, are we going to be able to correlate the remote calls back to the originating caller. ie
for a render call that times out in GetTargetsRemote, are we going to be able to dive in and see a trace of getData on each MT instance?

even better I think. the way this is supposed to work is that the trace context gets sent along with the getData RPC call, so the peer MT can effectively add its own logs and trace items into the "parent" context from the caller, and it'll show up as 1 tree in the jaeger UI.

i still have to get this to work, but that's how it should be.

@Dieterbe Dieterbe force-pushed the tracing branch 2 times, most recently from 7bc2133 to e63cc7c Compare August 9, 2017 22:08
@Dieterbe
Copy link
Contributor Author

here's a screenie of the current implementation, demonstrating the above.
bonus : i added in a fake error to make it more interesting
remote-err

@Dieterbe Dieterbe changed the title WIP Tracing Tracing support Aug 10, 2017
r.TraceLog(span)
}
}

Copy link
Contributor Author

@Dieterbe Dieterbe Aug 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i suspect this may be overdoing it. for every request in a GetData RPC call, log all the request properties. this could be too much overhead. but we can leave it until we see perf trouble

@@ -27,6 +31,14 @@ func Write(w http.ResponseWriter, resp Response) {
return
}

func WriteErr(ctx context.Context, w http.ResponseWriter, resp Response) {
Copy link
Member

@woodsaj woodsaj Aug 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

every call to WriteErr is passing writeErr(ctx.Req.Context(), ctx, error)
so why not just use

func WriteErr(c *macaron.Context, resp Response) {
  ctx := c.Req.Context()
  .....
}

It may even be a better idea to just include the error logger in the Write() method, eg.

func Write(ctx *macaron.Context, resp Response) {
	defer resp.Close()
	body, err := resp.Body()
	if err != nil {
		w.WriteHeader(http.StatusInternalServerError)
		w.Write([]byte(err.Error()))
		return
	}
	for k, v := range resp.Headers() {
		w.Header().Set(k, v)
	}
	w.WriteHeader(resp.Code())
	w.Write(body)

	// log errors
	if errResp, ok := resp.(Error); ok { // or maybe "if resp.Code() >= 400"
		span := opentracing.SpanFromContext(ctx)
		body, _ := resp.Body()
		span.LogFields(log.String("error.kind", string(body)))
		tags.Error.Set(span, true)
	}
}```

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I liked that the response package was pretty standalone and not tied into macaron, and wanted to keep it that way.

@@ -0,0 +1,38 @@
package conf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this package called "conf" when all handles is tracing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

conf is not new. https://github.com/raintank/metrictank/tree/master/conf
before it was just used for extra config related things like storage-schemas and aggregation rules config, and now also for tracing config/setup. I thought about putting it in a tracing package but it would be a tiny package. though perhaps that's the better way.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants