Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't forward empty span messages to log #607

Closed
sagebind opened this issue Feb 27, 2020 · 7 comments · Fixed by #613
Closed

Don't forward empty span messages to log #607

sagebind opened this issue Feb 27, 2020 · 7 comments · Fixed by #613

Comments

@sagebind
Copy link

sagebind commented Feb 27, 2020

I am currently experimenting with migrating my library Isahc from the log crate to tracing in order to get better per-request tracing, but the migration is not as seamless as I was hoping for (though I mostly like what I see). One large problem is how the log feature on the tracing crate handles entering and exiting spans; as alibrary author, I want to maintain the ability to output quality logs via the log abstraction for users using that crate, while also delivering additional value for users of tracing.

Unfortunately, migrating to tracing reduces the "quality" of logs emitted through the log abstraction because it is filled with a lot of mostly empty messages indicating when spans are entered, exited, and closed:

[2020-02-27T04:56:48Z INFO  isahc::agent] <- dispatch
[2020-02-27T04:56:48Z INFO  isahc::agent] -- dispatch
[2020-02-27T04:56:48Z INFO  isahc::agent] poll_messages; 
[2020-02-27T04:56:48Z INFO  isahc::agent] -> poll_messages
[2020-02-27T04:56:48Z INFO  isahc::agent] <- poll_messages
[2020-02-27T04:56:48Z INFO  isahc::agent] -- poll_messages
[2020-02-27T04:56:48Z INFO  isahc::agent] dispatch; 
[2020-02-27T04:56:48Z INFO  isahc::agent] -> dispatch
[2020-02-27T04:56:48Z INFO  isahc::agent] <- dispatch
[2020-02-27T04:56:48Z INFO  isahc::agent] -- dispatch
[2020-02-27T04:56:48Z INFO  isahc::agent] poll_messages; 
[2020-02-27T04:56:48Z INFO  isahc::agent] -> poll_messages
[2020-02-27T04:56:48Z INFO  isahc::agent] <- poll_messages
[2020-02-27T04:56:48Z INFO  isahc::agent] -- poll_messages
[2020-02-27T04:56:48Z INFO  isahc::agent] dispatch; 
[2020-02-27T04:56:48Z INFO  isahc::agent] -> dispatch

(Here the logger implementation is env_logger, but a user could use any implementation. Also keep in mind that this just from an experimental branch.)

In the context of a tracing subscriber, this structured information is of course quite valuable and can be presented in very useful ways. But in the context of a log logger, these logs seem pretty spammy and useless; since the fields of the span are not included on enter/exit, the only value that might come from this is the timestamps, but even that seems like a stretch to me.

My gut reaction to this behavior was that maybe all of my non-public methods should be using Trace-level spans so that users wouldn't get this spam unless they set their log level to trace. I often recommend to users to send me snippets of their trace logs whenever they think they are encountering a bug, but even then these log lines seem like they still would be just noise and not helpful.

I do see that the #[instrument] attribute optionally takes a level, but if spans should be trace everywhere except public methods, why is info the default level?

A similar problem is the logs when spans are created. Emitting a log when a span is created can be quite useful, except when it has no fields; in that case maybe spans with no fields should not be logged.

Ultimately I'm looking to benefit from tracing's more structured approach for both me when debugging concurrent behaviors and for users making use of tracing in their apps, but without giving existing log users the short shrift. I might be using tracing wrong or have the wrong expectations here and I just need to do some more learning first. Maybe there's a good reason for this behavior that I'm not grasping?

@hawkw
Copy link
Member

hawkw commented Feb 27, 2020

Unfortunately, migrating to tracing reduces the "quality" of logs emitted through the log abstraction because it is filled with a lot of mostly empty messages indicating when spans are entered, exited, and closed:

The problem is that it's really hard to present the contextual data that spans represent to log in a way that works well. Since the log feature supports using tracing without a Subscriber to actually track span contexts, we can't really annotate the log records we emit for events with span data at all. Thus, I thought the enter/exit logs would be useful since they provide some context.

If that's not the case, I'm certainly open to revisiting or removing them, although we should probably make sure that there aren't other users of the log feature who do find these useful.

Ideally, I think we would want to have a better way of tracking span contexts & presenting that information to log users. However, I'd also like to avoid making that system too complex — it runs the risk of essentially reimplementing everything a tracing subscriber does. If you have any suggestions, I'd be happy to hear them!

Emitting a log when a span is created can be quite useful, except when it has no fields; in that case maybe spans with no fields should not be logged.

I'm definitely open to changing that so that span creation is only logged when there are fields, especially if we remove some of the other span-related logs!

@sagebind
Copy link
Author

sagebind commented Feb 27, 2020

The problem is that it's really hard to present the contextual data that spans represent to log in a way that works well.

I can see where you're coming from and that makes some sense, but I'd argue that if a user wants that contextual data, they should switch their output to a Subscriber instead of trying to pass it somehow to a log logger. Mainly because that data simply didn't exist when the library was using just log before, so it wasn't a problem.

@hawkw
Copy link
Member

hawkw commented Feb 27, 2020

if a user wants that contextual data, they should switch their output to a Subscriber instead of trying to pass it somehow to a log logger.

Sure, this is probably true...at the time, when I was writing the log integration, it seemed like a shame to not emit this information when we could do so.

Mainly because that data simply didn't exist when the library was using just log before, so it wasn't a problem.

This is true in the case of a library that previously used log adding tracing instrumentation (e.g., the situation you're in with isahc). But, in the case of a library which used tracing from day one, there could potentially be a lot of vital information in spans which would otherwise not be recorded at all.


Here's a thought: what if the span enter/exit/close log records had a different log target; maybe "tracing::span_events" or something? That way, they could be filtered out separately from the log records emitted by event, but you could enable that target to opt in. The span creation log records should probably have the default target for the module path they occurred in, at least when the span has fields, since users are more likely to want to see that information.

What do you think?

@sagebind
Copy link
Author

Interesting idea, I like the sound of that, it would work well for my own use-case at least.

@hawkw
Copy link
Member

hawkw commented Feb 28, 2020

Okay, I'll see if anyone else using the log integration has an opinion, and then we can make that change pretty easily. As a side note, @sagebind, thanks for the thoughtful discussion on this! I appreciate it!

@davidbarsky
Copy link
Member

Here's a thought: what if the span enter/exit/close log records had a different log target; maybe "tracing::span_events" or something? That way, they could be filtered out separately from the log records emitted by event, but you could enable that target to opt in. The span creation log records should probably have the default target for the module path they occurred in, at least when the span has fields, since users are more likely to want to see that information.

The more I think about it, the more I like this approach. It can also be a way of tackling/validating two independent issues—this proposed behavioral change and moving Hyper/h2 to tracing. By moving span-related events/data to a dedicated target, log users who are using tracing-instrumented libraries don't need to aware of tracing-specific data unless they opt into a specific target. Similarly, tracing users aren't aware that anything changed.


I think this also has the nice side effect of (maybe!) driving more attention to the customizability of targets in tracing and log, which is a highly underrated feature.

@najamelan
Copy link
Contributor

I second that. log backends like flexilogger let you set different log levels per target. Also it can even just be grepped out.

hawkw added a commit that referenced this issue Feb 28, 2020
## Motivation

Currently, the `tracing` crate's "log" feature always logs span
lifecycle (creation/enter/exit/close) events, even when the span has no
fields. This can be quite noisy. Since the enter, exit, and close logs
don't include the span's fields, it doesn't add that much additional
context to the surrounding log records.

## Solution

In order to continue recording this data while improving the
signal-to-noise ratio of `tracing`-generated logs, this commit changes
the log integration to log span lifecycle events under a separate
target, "tracing::span". This way, log users can opt in to this data
when it is desired, but logs are generally much less noisy. Span
creation is still logged to the default target when the span has fields,
as this often provides useful information, but is sent to the
"tracing-span" target when there are no fields.

Fixes #607
hawkw added a commit that referenced this issue Feb 28, 2020
## Motivation

Currently, the `tracing` crate's "log" feature always logs span
lifecycle (creation/enter/exit/close) events, even when the span has no
fields. This can be quite noisy. Since the enter, exit, and close logs
don't include the span's fields, it doesn't add that much additional
context to the surrounding log records.

Additionally, when using `tracing` with the `log-always` feature and a
subscriber is active, the IDs of enabled spans are available, but not
included in the span lifecycle `log` records. These could be used to
enable tracking individual span's lifecycles via these log records, but
this is not currently possible.

## Solution

In order to continue recording this data while improving the
signal-to-noise ratio of `tracing`-generated logs, this branch changes
the log integration to log span lifecycle events under a separate
target, "tracing::span". This way, log users can opt in to this data
when it is desired, but logs are generally much less noisy. Span
creation is still logged to the default target when the span has fields,
as this often provides useful information, but is sent to the
"tracing-span" target when there are no fields.

This branch also updates the `log` integration so that when a span is
enabled by a subscriber, and `log` integration is enabled, that span's
`log` records will include its ID. Now, when multiple spans with the
same name are in the logs, we have a rudimentary ability to correlate
those span's lifecycle events, without introducing so much complexity
that just using a `tracing` subscriber would be a better solution.

For example, when a subscriber is assigning IDs to spans, the logs look
like this:

```
ERROR log_with_trace foo=5
ERROR log_with_trace foo=42
ERROR log_with_trace foo=39
WARN  log_with_trace hello world;
INFO  log_with_trace hello world; thingy=42 other_thingy=666
TRACE tracing::span ++ foo; span=1
TRACE tracing::span -> foo; span=1
TRACE log_with_trace hello san francisco; foo=3 bar=4
TRACE tracing::span <- foo; span=1
TRACE tracing::span -- foo; span=1
TRACE log_with_trace ++ foo; bar=3 baz=false; span=2
TRACE tracing::span -- foo; span=2
TRACE log_with_trace hello world foo=1 bar=2
```

Finally, I've added a new section to the `tracing` crate's documentation
to provide details on using the `log` integration features. 

Fixes #607

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants