-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
util/tracing: simplify log messages in trace recordings #70502
Conversation
Please don't merge until #70562
is either discarded or merged, thanks! That PR will need backported if it
does get merged to master, and there will be time pressure.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! Overall , just a couple minor nits and questions.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @abarganier and @andreimatei)
pkg/util/tracing/recording.go, line 93 at r1 (raw file):
""
is empty string needed here?
pkg/util/tracing/recording.go, line 203 at r1 (raw file):
for _, k := range tags { if !first { sb.WriteRune(' ')
nit: do we need this if we already wrote whitespace above on line 190?
if len(sp.Tags) > 0 {
sb.WriteRune(' ')
}
If so, instead of tracking first
here, can we just write the whitespace prior to the loop if len(tags) > 0
? Could further simplify the loop.
pkg/util/tracing/recording.go, line 363 at r3 (raw file):
Quoted 5 lines of code…
Fields: []jaegerjson.KeyValue{{ Value: l.Msg(), Type: "STRING", }},
I'm not experienced with Jaeger, so wanted to confirm - is it okay to omit a key here? Will it still display nicely in a jaeger UI without a key?
8560bf8
to
f05cd8a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't merge until #70562 is either discarded or merged, thanks! That PR will need backported if it does get merged to master, and there will be time pressure.
Well, is #70562 discarded? I don't think this patch would get in your way.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @abarganier)
pkg/util/tracing/recording.go, line 93 at r1 (raw file):
Previously, abarganier (Alex Barganier) wrote…
""
is empty string needed here?
removed
pkg/util/tracing/recording.go, line 203 at r1 (raw file):
Previously, abarganier (Alex Barganier) wrote…
nit: do we need this if we already wrote whitespace above on line 190?
if len(sp.Tags) > 0 { sb.WriteRune(' ') }
If so, instead of tracking
first
here, can we just write the whitespace prior to the loopif len(tags) > 0
? Could further simplify the loop.
This was writing the spaces before every tag except the first. But you're right that this code was needlessly complicated. Simplified.
pkg/util/tracing/recording.go, line 363 at r3 (raw file):
Previously, abarganier (Alex Barganier) wrote…
Fields: []jaegerjson.KeyValue{{ Value: l.Msg(), Type: "STRING", }},
I'm not experienced with Jaeger, so wanted to confirm - is it okay to omit a key here? Will it still display nicely in a jaeger UI without a key?
It rendered fine. But on second thought, you're right; the log messages do show up with an "event" key when the spans are created and collected by Jaeger through OpenTelemetry. So I've given them the same key on this code path (this is the code path used by explain analyze (debug)
and friends), for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about #70562, but everything else !
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @abarganier)
f05cd8a
to
545ed0a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 21 of 21 files at r7, 12 of 12 files at r8, 12 of 12 files at r9, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @abarganier and @andreimatei)
pkg/server/node_tenant.go, line 33 at r9 (raw file):
for j := range sp.Logs { record := &sp.Logs[j] record.Message = record.Message.Redact()
what's going on here?
- did you intend to edit Message in-place inside the span? Don't you want to take a copy?
- why does this commit need to add a Redact call? I thought Tobias' PR was already taking care of doing the right thing? Did we forget this call in the previous PR? (If so, we'll need to create a separate PR with just this fix and issue an emergency backport)
pkg/util/tracing/recording.go, line 81 at r7 (raw file):
// TODO(andrei): this should be unified with // SessionTracing.generateSessionTraceVTable(). func (r Recording) String() string {
You could have Recording implement SafeFormatter
and propagate the markers in the individual log entries.
pkg/util/tracing/recording.go, line 139 at r7 (raw file):
} // FindLogMessage returns the first log message in the recording that matches
Can you explain this function is used only in testing
(maybe even rename it to TestingFindLogMessage
)
and mention that it's relatively slow and heavy on allocations.
pkg/util/tracing/recording.go, line 212 at r7 (raw file):
for _, l := range sp.Logs { lastLog := ownLogs[len(ownLogs)-1] ownLogs = append(ownLogs, conv("event:"+l.Msg(), l.Time, lastLog.Timestamp))
ugh, is there a way to prepend that "event:"
sometime later, e.g. when logRecord
instances are formatted?
pkg/util/tracing/recording.go, line 226 at r7 (raw file):
lastLog := ownLogs[len(ownLogs)-1] var sb redact.StringBuilder sb.SafeString("structured:")
that looks like a regression. Prior to your change, in Tobias's PR we were assuming that structured payloads were safe. Now you're making this code consider the entire JSON thing as unsafe. Was that intnetional?
545ed0a
to
6828a2e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @abarganier, @knz, and @tbg)
pkg/server/node_tenant.go, line 33 at r9 (raw file):
did you intend to edit Message in-place inside the span? Don't you want to take a copy?
I did. I'm doing what the function was already doing - modifying the recording (this function doesn't return anything). I've added a comment to the function.
why does this commit need to add a Redact call?
The record.Message
field is new; that's what this commit does. It's a replacement for the old Fields
collection.
pkg/util/tracing/recording.go, line 81 at r7 (raw file):
Previously, knz (kena) wrote…
You could have Recording implement
SafeFormatter
and propagate the markers in the individual log entries.
ack. I'll leave for next time though.
pkg/util/tracing/recording.go, line 139 at r7 (raw file):
Previously, knz (kena) wrote…
Can you explain this function is used only in testing
(maybe even rename it toTestingFindLogMessage
)and mention that it's relatively slow and heavy on allocations.
Well the function is not particularly test specific... It's fairly natural for only tests to be interested in doing any sort of comprehension (or rather, sloppy reg-ex based comprehension) of trace recordings, but as far as the function itself is concerned, I don't see any big reason to keep it away from production code.
As to the performance of the thing, well, it is what it is. I don't really thinks it warrants a comment...
FWIW, this patch isn't really doing much to this function. To the extent that it does anything, it just makes the fact that there's a StripMarkers
in there more visible by lifting it out of the LogRecord.Msg()
method.
pkg/util/tracing/recording.go, line 212 at r7 (raw file):
Previously, knz (kena) wrote…
ugh, is there a way to prepend that
"event:"
sometime later, e.g. whenlogRecord
instances are formatted?
Not without making things more complicated. All of the span start message, the log message, and the structured messages are represented as the same logRecord
type, which doesn't encode the difference between them. It's not worth doing too much for this function, it's only used by Recording.String()
.
Perhaps you were taking objection to using +
between a literal and a RedactableString
, since it wasn't obvious what will happen to the literal? See now, I've turned it into a sb.Printf
.
pkg/util/tracing/recording.go, line 226 at r7 (raw file):
Previously, knz (kena) wrote…
that looks like a regression. Prior to your change, in Tobias's PR we were assuming that structured payloads were safe. Now you're making this code consider the entire JSON thing as unsafe. Was that intnetional?
Well before my change, safe versus unsafe considerations did not apply to this function. This function is only used by Recording.String()
, and it returned opentracing.LogRecord
, which doesn't understand such things. All markers were removed from all messages; structured on unstructured. Now, this function returns redactable messages. It doesn't matter too much, since the only caller, Recording.String()
, strips all the markers anyway.
As far as the safety of structured messages, I'm not sure it has been determined that they're indeed safe. We can't strip them at the boundary between KV and a tenant, because SQL needs them. I'm not sure they're all safe since, for example, a ContentionEvent
has a key in it, and I think that key can come from a meta range. I think I raised this on Slack a few days ago and I don't think I heard back. Anyway as far as I can tell, I don't think we "show" these events to users (e.g. I don't think they show up for show trace for session
). Perhaps @tbg can chime on whether there's been thinking around them.
Anyway, for the purposes of this code here, I think there's no harm in considering them unsafe, and also it doesn't matter because again the caller strips all the markers anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 9 of 12 files at r11, 13 of 14 files at r12, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @abarganier, @andreimatei, and @knz)
pkg/util/tracing/recording.go, line 139 at r7 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
Well the function is not particularly test specific... It's fairly natural for only tests to be interested in doing any sort of comprehension (or rather, sloppy reg-ex based comprehension) of trace recordings, but as far as the function itself is concerned, I don't see any big reason to keep it away from production code.
As to the performance of the thing, well, it is what it is. I don't really thinks it warrants a comment...FWIW, this patch isn't really doing much to this function. To the extent that it does anything, it just makes the fact that there's a
StripMarkers
in there more visible by lifting it out of theLogRecord.Msg()
method.
The reason why I pinged you on this is that I find it sad to re-call StripMarker
on this every time the logs are being traversed. That re-processing (and the allocs that go with it) is ok for test code, but not for production code.
pkg/util/tracing/recording.go, line 212 at r7 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
Not without making things more complicated. All of the span start message, the log message, and the structured messages are represented as the same
logRecord
type, which doesn't encode the difference between them. It's not worth doing too much for this function, it's only used byRecording.String()
.Perhaps you were taking objection to using
+
between a literal and aRedactableString
, since it wasn't obvious what will happen to the literal? See now, I've turned it into asb.Printf
.
your previous code was fine. I was just annoyed by the extra heap alloc and string copy implied here.
pkg/util/tracing/recording.go, line 226 at r7 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
Well before my change, safe versus unsafe considerations did not apply to this function. This function is only used by
Recording.String()
, and it returnedopentracing.LogRecord
, which doesn't understand such things. All markers were removed from all messages; structured on unstructured. Now, this function returns redactable messages. It doesn't matter too much, since the only caller,Recording.String()
, strips all the markers anyway.As far as the safety of structured messages, I'm not sure it has been determined that they're indeed safe. We can't strip them at the boundary between KV and a tenant, because SQL needs them. I'm not sure they're all safe since, for example, a
ContentionEvent
has a key in it, and I think that key can come from a meta range. I think I raised this on Slack a few days ago and I don't think I heard back. Anyway as far as I can tell, I don't think we "show" these events to users (e.g. I don't think they show up forshow trace for session
). Perhaps @tbg can chime on whether there's been thinking around them.
Anyway, for the purposes of this code here, I think there's no harm in considering them unsafe, and also it doesn't matter because again the caller strips all the markers anyway.
Ok I'm 100% on board with the idea that these structured events are generally unsafe until proven safe. As such, your current implementation is OK.
You can add a TODO here (and in the docstring of MessageToJSONString
) that we should update the json serializer to do the same as the structured events in eventpb
, where the JSON serializer ((EventPayload).AppendJSONFields
) does redactability separately per-field using field tags.
6828a2e
to
90611f8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @abarganier and @knz)
pkg/util/tracing/recording.go, line 139 at r7 (raw file):
Previously, knz (kena) wrote…
The reason why I pinged you on this is that I find it sad to re-call
StripMarker
on this every time the logs are being traversed. That re-processing (and the allocs that go with it) is ok for test code, but not for production code.
ok, added an efficiency note
pkg/util/tracing/recording.go, line 212 at r7 (raw file):
Previously, knz (kena) wrote…
your previous code was fine. I was just annoyed by the extra heap alloc and string copy implied here.
well, once I looked at it, I didn't like the old code because to me it wasn't clear at all if the "event"
part was going to be considered safe or not. I think it was going to be considered safe, but I think that's more a bug/limitation of the type system rather than a good thing, no? What I have now is more explicit and better, isn't it?
pkg/util/tracing/recording.go, line 226 at r7 (raw file):
Previously, knz (kena) wrote…
Ok I'm 100% on board with the idea that these structured events are generally unsafe until proven safe. As such, your current implementation is OK.
You can add a TODO here (and in the docstring of
MessageToJSONString
) that we should update the json serializer to do the same as the structured events ineventpb
, where the JSON serializer ((EventPayload).AppendJSONFields
) does redactability separately per-field using field tags.
Added the TODO to MessageToJSONString
.
90611f8
to
f900598
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 15 files at r15.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @abarganier, @andreimatei, and @knz)
pkg/util/tracing/recording.go, line 212 at r7 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
well, once I looked at it, I didn't like the old code because to me it wasn't clear at all if the
"event"
part was going to be considered safe or not. I think it was going to be considered safe, but I think that's more a bug/limitation of the type system rather than a good thing, no? What I have now is more explicit and better, isn't it?
this is good too.
Some utilities were using the clunky opentracing.LogRecord for no reason. Release note: None
Release note: None
Before this patch, the RecordedSpan proto stored log messages in a very awkward way: each message was stored as a collection of key/values, with only one such pair present (using a well-known key). This was confusing, unnecessary, hard to work with and hard to track for figuring out what keys and values are in there (with the answer being only one key). This patch simplifies the log messages, making them be represented by a single string as nature intended. A bunch of code gets simplified in consequence. Release note: None
f900598
to
6a3af83
Compare
bors r+ |
🕐 Waiting for PR status (Github check) to be set, probably by CI. Bors will automatically try to run when all required PR statuses are set. |
GitHub status checks took too long to complete, so bors is giving up. You can adjust bors configuration to have it wait longer if you like. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @abarganier, @andreimatei, and @knz)
Build failed: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#70896 for the failure
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @abarganier, @andreimatei, and @knz)
Build failed: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
failure is #70659
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @abarganier, @andreimatei, and @knz)
Build failed (retrying...): |
Build succeeded: |
Before this patch, the RecordedSpan proto stored log messages in a very
awkward way: each message was stored as a collection of key/values, with
only one such pair present (using a well-known key). This was confusing,
unnecessary, hard to work with and hard to track for figuring out what
keys and values are in there (with the answer being only one key). This
patch simplifies the log messages, making them be represented by a
single string as nature intended. A bunch of code gets simplified in
consequence.
Release note: None