consider eliding span.sync=false because it is the typical case #1881

trentm · 2020-11-18T16:38:44Z

tl;dr: Setting span.sync to true or false on every APM span results in unhelpful UI noise in Kibana. Let's only set it when it is "interesting". For Node.js the typical case is that spans are async (most traced work span event loop async operations). That leaves "span.sync==true" as the "interesting" case.

The span.sync field can be true, false, null or not set per the api-server spec.

Here is the Kibana code that adds a SyncBadge to spans in the Waterfall view and the code for the SyncBadge element. I.e. sync: true will showing "blocking" in the ui, sync: false will show "async", sync: null (or not specified) will show nothing. E.g.:

From chat discussion (this came up when discussing #1878):

"Felix
I remember that we’ve talked about the usefulness of having all spans marked as sync=false in Node.js. Setting sync=false makes the UI add a async flag next to the span name. But as async is the default in Node.js, it would probably make more sense to not set the sync flag in the default async case but only set sync=true when there’s a blocking operation, which makes the UI show a blocking label. This is what the RUM agent is doing."

The text was updated successfully, but these errors were encountered:

trentm · 2020-11-18T16:39:52Z

TODOs to consider:

fix this issue in the node.js APM agent
doc "span.sync" in apm.git specs

smith · 2020-11-18T16:51:07Z

I think the agent should be sending the most true information, so if a span is async the agent should be sending us that information. Whether or not it should be displayed in the UI is a UI issue. We can decide in the UI whether or not to show that data. IMO.

trentm · 2020-11-18T17:02:02Z

@smith Hi! Quoting @sqren from some earlier discussion:

"As you can see from that slack thread this decision was not communicated super well originally so agents have implemented this differently. But the intention is still for everyone to align on only setting span.sync to true or false when the async mode is different from the default mode of the language.

Søren Louv-Jansen 7 hours ago
So for node.js you’d never set it to false (because that’s the default mode of javascript). Instead you’d occasionally set it to true when that is the case"

What would be a good place to capture this intent? In the APM specs? In the APM UI docs?

smith · 2020-11-18T17:05:29Z

Yeah so I'd disagree there but ok. "The default mode of the language" is ambiguous. If I'm using Twisted in Python or EventMachine in Ruby the "default mode" is async but the "default mode of the language" is sync.

trentm · 2020-11-18T17:17:54Z

"The default mode of the language" is ambiguous.

I agree with that. I'd change "default mode" to something like "what is typical". Speaking only for node.js, it is pretty easy to say "async is almost always the typical case". Less so for Python, as you point out. If every span is marked sync=true or false I don't know how the UI can reasonably determine if it notable enough to have a "blocking" or "async" badge -- which is why an intention for the APM Agent to decide whether to set span.sync=<bool> makes some sense to me.

sorenlouv · 2020-11-18T17:45:46Z

Yeah so I'd disagree there but ok. "The default mode of the language" is ambiguous. If I'm using Twisted in Python or EventMachine in Ruby the "default mode" is async but the "default mode of the language" is sync.

Isn't that a good example of why we should leave it up to the agent to specify what is "typical" and what is not? Always specifying span.sync will require the UI to determine what is typical for every framework across every language. I trust the agent developers to have better domain knowledge than we do.

astorm · 2020-11-18T20:19:21Z

Great discussion so far all -- just coming in with my two cents:

Short Version: The Node agent's current intended behavior -- which is to start a span as sync=true and to mark it with sync=false as soon as we detect that the span has ended in an asynchronous context other than the one it started in (that's a mouthful! :)) -- feels like the correct behavior. This whole conversation kicked off because of a bug in this intended behavior -- once we close that bug I think we'll be good. @felixbarny -- this is probably another area where a cross agent spec on how agents report on threads vs. single process async vs. sync code would be a useful long term thing to have.

Long Version: I'm aligned with @smith in that it's an agent's job to represent The Truth of what's happened to the best of its ability. My experience elsewhere tells me that if we start seeing agent data as flags for particular pieces of the UI/UX we're eventually going to end up in a situation where our implementation is tied too heavily to one UI/UX and not to another. It's better to have the agent represent what actually happened to the best of it's ability.

Also -- I'd also lightly push back on the idea that Node's default behavior is async. It's true that the community of programmers around Node.js encourages asynchronous programming patterns. However by default every line of code in a Node.js program executes synchronously. It's not until a programmer taker proactive action (or the library they're using-blindly takes proactive active) to schedule something asynchronously that a span becomes async.

(waves hands like a dessert hermit yes, I know, worker threads complicate this a bit -- let's save that discussion for another day -- these are not the bleeding edge features you are looking for)

Also, regarding UI noise -- seeing async or sync next to every span may be noise to experienced javascript engineers working at a tech first company and programming in javascript/typescript for 95% of the day -- but I'm not sure that's our typical/ideal user. Folks are coming to these charts when there's problems. They're often not the engineers who wrote the code -- or when they are they're often developers who need to work in multiple different languages to keep their company's systems up and running. It seems like there's a value to being explicit about whether a span is considered async or not -- if only to help orient folks working with multiple-language environments.

sorenlouv · 2020-11-18T20:30:17Z

It seems like there's a value to being explicit about whether a span is considered async or not

This is how the conversation started and the thinking was that it was valuable to display when it was "atypical" but noise to always show it. Particularly it becomes hard to spot the atypical. I still think that's the case but if we get feedback from users that they lack this information I'm good with always showing it.

sorenlouv · 2020-11-18T22:09:05Z

if we start seeing agent data as flags for particular pieces of the UI/UX we're eventually going to end up in a situation where our implementation is tied too heavily to one UI/UX and not to another. It's better to have the agent represent what actually happened to the best of it's ability.

Makes sense. I'm interested to hear the other agent devs take on this so I'd appreciate if you can bring it up with them.

sorenlouv · 2021-08-23T14:28:13Z

Reviving this old discussion. I have created elastic/kibana#109661 to solve this issue by hiding async badge from spans from the Node agent, however, the following makes me doubt if we are in agreement

I'd also lightly push back on the idea that Node's default behavior is async. It's true that the community of programmers around Node.js encourages asynchronous programming patterns. However by default every line of code in a Node.js program executes synchronously. It's not until a programmer taker proactive action (or the library they're using-blindly takes proactive active) to schedule something asynchronously that a span becomes async.

So the question is: are we happy with the current state (in which this issue and elastic/kibana#109661 can be closed immediately) or should we go ahead and implement elastic/kibana#109661 ?

trentm · 2021-08-23T18:11:36Z

[Alan]

Also -- I'd also lightly push back on the idea that Node's default behavior is async. ...

Alan, would you object to language that a "typical" Node.js traced span is async? I.e. we get away from using the term "default".

tl;dr: My votes are to (a) close this issue (i.e. have the Node.js APM agent report what it knows and not attempt to only report what it thinks is "interesting") and (b) continue with elastic/kibana#109661 (though I would say "when interesting", rather than "when relevant").

AFAICT the "sync" field isn't discussed in the APM agent specs at all. The Python async examples above show that it might be useful to at some point to have an cross-agent discussion clarifying span.sync. I don't think we need to get into it now, however. (FWIW, my naive read of the Python APM agent code is that span.sync is set False when asyncio is used, and otherwise is almost always left blank, i.e. almost never set True.)
I think the "async" badge in the UI on almost all spans in Node.js traces is visual clutter, so I think [APM] Only show span.sync badge when relevant kibana#109661 is a good idea.
Note that the "blocking" badge on all the ES spans in @sqren's screenshot in [APM] Only show span.sync badge when relevant kibana#109661 is a bug. Those ES spans are async. I hope to fix that in upcoming working.

sorenlouv · 2021-09-02T15:12:39Z

One thing to mention:

Also, regarding UI noise -- seeing async or sync next to every span may be noise to experienced javascript engineers working at a tech first company and programming in javascript/typescript for 95% of the day -- but I'm not sure that's our typical/ideal user.

The intention is to still show async/blocking badge unconditionally in the flyout (when clicking a span or transaction). So for people in doubt they can always get the information there.
However in the waterfall/timeline we have different things that compete for space and in that context the async/blocking badge is currently taking up too much space.
Another option is to hide the badge entirely on the waterfall and only show it in the flyout. That would certainly be easier but then we definitely won't be calling out "atypical" items.

trentm · 2021-09-02T17:55:04Z

My votes are to (a) close this issue ...

Closing. Alan had no objections.

trentm added the agent-nodejs Make available for APM Agents project planning. label Nov 18, 2020

trentm self-assigned this Nov 18, 2020

sorenlouv mentioned this issue Aug 23, 2021

[APM] Only show span.sync badge when relevant elastic/kibana#109661

Closed

trentm closed this as completed Sep 2, 2021

trentm mentioned this issue Jun 5, 2023

Add spec for 'span.sync' field elastic/apm#802

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

consider eliding span.sync=false because it is the typical case #1881

consider eliding span.sync=false because it is the typical case #1881

trentm commented Nov 18, 2020 •

edited by zube bot

Loading

trentm commented Nov 18, 2020

smith commented Nov 18, 2020

trentm commented Nov 18, 2020

smith commented Nov 18, 2020 •

edited

Loading

trentm commented Nov 18, 2020

sorenlouv commented Nov 18, 2020

astorm commented Nov 18, 2020 •

edited

Loading

sorenlouv commented Nov 18, 2020

sorenlouv commented Nov 18, 2020 •

edited

Loading

sorenlouv commented Aug 23, 2021 •

edited

Loading

trentm commented Aug 23, 2021 •

edited

Loading

sorenlouv commented Sep 2, 2021 •

edited

Loading

trentm commented Sep 2, 2021

consider eliding span.sync=false because it is the typical case #1881

consider eliding span.sync=false because it is the typical case #1881

Comments

trentm commented Nov 18, 2020 • edited by zube bot Loading

trentm commented Nov 18, 2020

smith commented Nov 18, 2020

trentm commented Nov 18, 2020

smith commented Nov 18, 2020 • edited Loading

trentm commented Nov 18, 2020

sorenlouv commented Nov 18, 2020

astorm commented Nov 18, 2020 • edited Loading

sorenlouv commented Nov 18, 2020

sorenlouv commented Nov 18, 2020 • edited Loading

sorenlouv commented Aug 23, 2021 • edited Loading

trentm commented Aug 23, 2021 • edited Loading

sorenlouv commented Sep 2, 2021 • edited Loading

trentm commented Sep 2, 2021

trentm commented Nov 18, 2020 •

edited by zube bot

Loading

smith commented Nov 18, 2020 •

edited

Loading

astorm commented Nov 18, 2020 •

edited

Loading

sorenlouv commented Nov 18, 2020 •

edited

Loading

sorenlouv commented Aug 23, 2021 •

edited

Loading

trentm commented Aug 23, 2021 •

edited

Loading

sorenlouv commented Sep 2, 2021 •

edited

Loading