SimpleActivityProcessor improvements. #896
Conversation
Diff under review (excerpt):

    Interlocked.Increment(ref this.currentQueueSize);
    this.activityQueue.Enqueue(activity);
I'd let @reyang comment here, but introducing a queue for the simple processor defeats the purpose it was intended for, doesn't it?
Yeah, I brought this topic to the OpenTelemetry Specification SIG meeting on 07/14/2020 as topic 8.
I've seen this challenge in .NET, C++, and other languages/scenarios that require high performance and concurrency. For example, an exporter that concurrently writes data to shared memory, ETW (Event Tracing for Windows), or LTTng.
This is something I need to work on from the spec perspective.
Open to ideas, but it feels unavoidable. The way we had it before, starting Tasks, was just queuing work for the thread pool, right?
My current thinking:
- simple processor should run concurrently without contention.
- exporter could have synchronization by default, following the spec (unless I change the spec before GA).
- we probably need a way for certain exporters to express that they are "thread free", similar to the COM STA/MTA model, so that the SDK won't try to synchronize the call (a rough sketch follows this list).
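For illustration, a minimal sketch of what such a "thread free" capability flag could look like; the interface and property names here are hypothetical, not anything that exists in the SDK:

```csharp
// Hypothetical sketch only: a way for an exporter to declare that its
// export path is safe to call concurrently, so the SDK can skip its own
// synchronization. None of these names exist in the OpenTelemetry SDK.
public interface IConcurrencyAwareExporter
{
    // True when the exporter handles its own thread safety, e.g. when
    // writing to shared memory, ETW, or LTTng.
    bool SupportsConcurrentExport { get; }
}
```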
I thought this one might be tricky. I agree with you guys, 100%, but I don't think the current version is acceptable: it is going to flood the thread pool under load. Can we move forward with this safer design and pursue clarity in the spec for GA? Keep in mind, the simple processor is currently the default.
If an exporter is fast enough, this will export spans one-by-one as they are ready, in a tight inner loop. It is only when the exporter is slow that we start feeding it chunks of data. Once our queue fills up, we start dropping data. It's a safer approach all around, IMO.
It satisfies the spec mandate that we shouldn't call export concurrently, but also more important mandates, like not starving the hosting process 🍽️
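For context, a rough sketch of the enqueue side being described, modeled on the two diff lines quoted at the top of the thread; the queue bound and the fields beyond those two lines are assumptions, not the PR's exact code:

```csharp
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading;

public partial class SimpleActivityProcessor
{
    private readonly ConcurrentQueue<Activity> activityQueue = new ConcurrentQueue<Activity>();
    private readonly AutoResetEvent exportTrigger = new AutoResetEvent(false);
    private readonly int maxQueueSize = 2048; // assumed bound, not from the PR
    private int currentQueueSize;

    public void OnEnd(Activity activity)
    {
        // Once the bounded queue is full, drop the span instead of growing without limit.
        if (Interlocked.Increment(ref this.currentQueueSize) > this.maxQueueSize)
        {
            Interlocked.Decrement(ref this.currentQueueSize);
            return; // dropped
        }

        this.activityQueue.Enqueue(activity);

        // Wake the single background thread that drains the queue.
        this.exportTrigger.Set();
    }
}
```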
I'm worried about moving forward in this direction.
The Python SDK actually uses a lock inside the exporter, which means the exporter code is called concurrently.
The C++ SDK is hitting the same concern, which is why I raised the question in the specification SIG.
Give me some time to do a quick experiment and I might come up with a solution. Currently I have lots of ideas:
- using [MethodImpl(MethodImplOptions.Synchronized)] on the export call (see the sketch after this list)
- having both sync and async export interfaces and providing a helper method to smooth it out
- pushing the spec to make a change
- moving forward on this PR, and having separate guidance on "how to write a high performance / concurrency exporter using the processor interface instead of the exporter interface" (which seems to be 😈)
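A quick sketch of option 1; the exporter class here is hypothetical, and the attribute is just the runtime's built-in way of serializing calls (roughly lock(this) around an instance method):

```csharp
using System.Collections.Generic;
using System.Diagnostics;
using System.Runtime.CompilerServices;

// Hypothetical exporter illustrating option 1: the runtime serializes all
// calls to Export, so concurrent callers block rather than overlap.
public class MySynchronizedExporter
{
    [MethodImpl(MethodImplOptions.Synchronized)]
    public void Export(IEnumerable<Activity> batch)
    {
        // Only one thread can be in here at a time; under heavy load,
        // other callers queue up on the implicit lock.
    }
}
```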
Sure, looking forward to seeing what you come up with!
I kind of dig option 4. We provide a couple of OOB solutions geared towards best-effort delivery, safety, and low impact on the hosting process, but if you want to make something ultra-performant you can cut out the exporting layer completely and do it directly in the processor.
Option 3 we should probably do regardless? At least get clarification on how it should work.
Option 1: I'm skeptical that any kind of synchronization will be successful. I'm imagining a busy process creating a lot of spans very quickly with a slow exporter.
Option 2: I thought about changing the interface when I was doing this work, but it didn't really help the situation. The way it is written today (async), the fire-and-forget will flood the pool. If it were sync, we'd block SpanProcessor.EndActivity from finishing, which holds up that thread. Neither case is really ideal.
I like this design, and maybe we can port some of the work to the batching exporter. Using the handles to stop the thread from constantly checking for work is nice.
I also agree with @cijothomas and @reyang that the simple processor is specced to not batch spans; the simple span exporter is effectively designed as a simple queue that can be exhausted fairly easily. It is strange that the default will naturally lack performance, and we should pursue amending the spec to default to a batching-style exporter.
@reyang Ping 😄
The main thing I want to accomplish here is removing the fire-and-forget task/thread pool thing. That is soooo dangerous! In a high-volume / slow-exporter situation, I'm pretty sure it will crash the process. If we want to remove the batching in the worker thread, no problem. It seems like a crime not to batch up the data when we know it's sitting there, but 🤷
@CodeBlanch this has a dependency on the refactor work, I think we should be ready to solve it on Wed.
Closing in favor of the plan on #1078.
Issue
Trying to fix a couple of issues with the simple (hey, it tries hard, give it a chance) processor:
- The spec says the exporter shouldn't be called concurrently, yet we're currently calling it concurrently, almost aggressively.
- We throw every span at the thread pool (roughly the pattern sketched below). Doing that, we'll steal a lot of threads the hosting process needs for its own work.
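For reference, roughly the fire-and-forget pattern at issue: a thread pool work item per ended span. This is illustrative, not the SDK's exact code; the exportOne delegate is a stand-in for the real exporter call:

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public class FireAndForgetProcessor
{
    private readonly Func<Activity, Task> exportOne; // stand-in for the real exporter call

    public FireAndForgetProcessor(Func<Activity, Task> exportOne) => this.exportOne = exportOne;

    public void OnEnd(Activity activity)
    {
        // Fire-and-forget: every ended span queues a new work item. With a
        // slow exporter under load these pile up and starve the thread pool.
        _ = Task.Run(() => this.exportOne(activity));
    }
}
```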
Design
A background thread sleeps until it is told there is work. Once it is signaled, it tight-loops exporting spans until there is no more work, batching what it can along the way. I'm thinking it is too expensive to export one-by-one when we know there is more data, even though the simple processor isn't technically supposed to batch.
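Roughly, a sketch of that loop, continuing the illustrative enqueue fragment from earlier in the thread (activityQueue, exportTrigger, currentQueueSize); the shutdown flag and the exportBatch delegate are stand-ins, not the PR's actual code:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

public partial class SimpleActivityProcessor
{
    private readonly Action<List<Activity>> exportBatch; // stand-in for the exporter call
    private volatile bool shutdownRequested;

    private void WorkerLoop()
    {
        var batch = new List<Activity>();
        while (!this.shutdownRequested)
        {
            // Sleep until OnEnd signals that there is work.
            this.exportTrigger.WaitOne();

            // Tight loop: with a fast exporter each pass sees a single span
            // and exports it immediately; with a slow exporter spans pile up
            // and go out as a chunk.
            while (this.activityQueue.TryDequeue(out var activity))
            {
                Interlocked.Decrement(ref this.currentQueueSize);
                batch.Add(activity);
            }

            if (batch.Count > 0)
            {
                this.exportBatch(batch);
                batch.Clear();
            }
        }
    }
}
```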
Opened as a draft because I'm still working on tests, but I wanted to get feedback.