[pkg/stanza] Make Stanza adapter more synchronous by removing channels and workers #35453

andrzej-stencel · 2024-09-27T10:26:15Z

Component(s)

pkg/stanza

Is your feature request related to a problem? Please describe.

This issue is created as a result of discussion in #31074. From #31074 (comment):

In my view, the current implementation of the Stanza adapter with multiple channels and asynchronous workers is overly complex without delivering a lot of value, instead creating the possibility of data loss.

and:

The conversion [from Stanza entries to OTLP log records] is currently pretty complex, involving multiple asynchronous workers and a bunch of channels.

Describe the solution you'd like

Remove the channels and workers by having LogEmitter convert Stanza entries to log records and call ConsumeLogs synchronously. Leave the 100-log buffering in place for this change, to minimize the impact. Compare benchmarks.
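To make the proposal concrete, here is a minimal sketch of what a channel-free emitter could look like. It is not the actual pkg/stanza code: `Entry`, `ConsumeFunc`, `SyncEmitter` and `runDemo` are hypothetical stand-ins, and the real LogEmitter would also need locking and a timed flush, both omitted here.

```go
package main

import (
	"context"
	"fmt"
)

// Entry is a hypothetical stand-in for a Stanza entry.
type Entry struct {
	Body string
}

// ConsumeFunc is a hypothetical stand-in for the next consumer's ConsumeLogs.
type ConsumeFunc func(ctx context.Context, batch []Entry) error

// SyncEmitter buffers entries and hands full batches to consume on the
// caller's goroutine. No channels or worker goroutines are involved.
// Assumption: it is used from a single goroutine (no mutex in this sketch).
type SyncEmitter struct {
	maxBatch int
	buf      []Entry
	consume  ConsumeFunc
}

func NewSyncEmitter(maxBatch int, consume ConsumeFunc) *SyncEmitter {
	return &SyncEmitter{
		maxBatch: maxBatch,
		buf:      make([]Entry, 0, maxBatch),
		consume:  consume,
	}
}

// Process appends the entry and flushes once the buffer is full.
func (e *SyncEmitter) Process(ctx context.Context, ent Entry) error {
	e.buf = append(e.buf, ent)
	if len(e.buf) < e.maxBatch {
		return nil
	}
	return e.Flush(ctx)
}

// Flush passes the buffered entries to consume and returns its error
// directly to the caller, so a failure is visible upstream.
func (e *SyncEmitter) Flush(ctx context.Context) error {
	if len(e.buf) == 0 {
		return nil
	}
	batch := e.buf
	e.buf = make([]Entry, 0, e.maxBatch)
	return e.consume(ctx, batch)
}

// runDemo emits n entries through a 100-entry buffer and records the
// size of every batch the consumer receives.
func runDemo(n int) ([]int, error) {
	var sizes []int
	emitter := NewSyncEmitter(100, func(_ context.Context, batch []Entry) error {
		sizes = append(sizes, len(batch))
		return nil
	})
	ctx := context.Background()
	for i := 0; i < n; i++ {
		if err := emitter.Process(ctx, Entry{Body: fmt.Sprintf("line %d", i)}); err != nil {
			return nil, err
		}
	}
	if err := emitter.Flush(ctx); err != nil {
		return nil, err
	}
	return sizes, nil
}

func main() {
	sizes, err := runDemo(250)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(sizes) // prints [100 100 50]
}
```

The point of the sketch is that the error from the consumer is returned to whoever called Process or Flush, instead of being handled in a separate worker.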

Describe alternatives you've considered

Leaving the 100-log buffering in place means the log-loss issue is still not completely resolved. Ideally we want to get rid of this buffering, but we need to proceed cautiously, as removing it may have a serious performance impact. We should measure this impact and explore whether Stanza receivers such as the Filelog receiver / File consumer could emit entries in batches rather than one by one. That would alleviate the performance impact that removing batching in LogEmitter may introduce.
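As a rough illustration of the "emit in batches, not one by one" idea, the sketch below reads lines and calls the downstream callback once per batch. This is not how the File consumer is implemented: `EmitBatchFunc`, `readInBatches` and `countEmitCalls` are hypothetical names used only for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// EmitBatchFunc is a hypothetical callback that receives a batch of lines.
type EmitBatchFunc func(batch []string) error

// readInBatches scans r line by line and calls emit once per batch of up
// to maxBatch lines, instead of once per line.
func readInBatches(r io.Reader, maxBatch int, emit EmitBatchFunc) error {
	scanner := bufio.NewScanner(r)
	batch := make([]string, 0, maxBatch)
	for scanner.Scan() {
		batch = append(batch, scanner.Text())
		if len(batch) == maxBatch {
			if err := emit(batch); err != nil {
				return err
			}
			batch = make([]string, 0, maxBatch)
		}
	}
	if err := scanner.Err(); err != nil {
		return err
	}
	if len(batch) > 0 {
		return emit(batch) // emit the final partial batch
	}
	return nil
}

// countEmitCalls reports how many downstream calls are needed to deliver
// numLines lines with the given batch size.
func countEmitCalls(numLines, maxBatch int) (int, int, error) {
	calls, total := 0, 0
	input := strings.Repeat("log line\n", numLines)
	err := readInBatches(strings.NewReader(input), maxBatch, func(b []string) error {
		calls++
		total += len(b)
		return nil
	})
	return calls, total, err
}

func main() {
	calls, total, err := countEmitCalls(250, 100)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d lines emitted in %d calls\n", total, calls)
}
```

With a batch size of 1 the same input needs one downstream call per line, which is the per-entry overhead the benchmarks would have to quantify.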

Additional context

See #31074

github-actions · 2024-09-27T10:26:34Z

Pinging code owners:

pkg/stanza: @djaglowski

See Adding Labels via Comments if you do not have permissions to add labels yourself.

andrzej-stencel · 2024-10-01T12:29:39Z

Removing needs triage label as this was discussed with the code owner.

bacherfl · 2024-10-07T06:19:51Z

Hi! I would like to take this on if this is not taken yet

andrzej-stencel · 2024-10-07T09:16:54Z

It's yours @bacherfl, thanks for picking it up!

bacherfl · 2024-10-08T05:10:51Z

Thanks @andrzej-stencel! One quick question to clarify: currently the LogEmitter and its logChannel are also used by two other components, the container parser and the Logs Transform processor.

Should we remove the async channels from the LogEmitter altogether and also adapt those two other components that use it, or do you think we should provide both options (async and sync) in the LogEmitter and leave those two other components unchanged for now?
Both of these other components would be fairly simple to adapt to also work synchronously though.

andrzej-stencel · 2024-10-09T08:02:19Z

Good point @bacherfl. I wasn't aware that the container parser uses LogEmitter internally. Yes, ideally the logs channel should be removed from the LogEmitter altogether and both the container parser and the Logs Transform processor should use it the same way as the Stanza adapter, having the LogEmitter emit logs (almost) synchronously.

I put "almost" in the sentence above, because we want to keep using the 100-entry buffer in the LogEmitter, which makes it not completely synchronous after removing the channels and workers.

bacherfl · 2024-10-09T09:05:25Z

Thanks for the clarification @andrzej-stencel. I will proceed with also adapting the Logs Transform processor and the container parser. I also saw that you were doing some performance evaluations in #35454, so I will use the same method to compare the changes I make and update the PR once I have the results.

bacherfl · 2024-10-10T08:54:37Z

Alright, I think the PR is ready for a first round of reviews. I have added the performance test results to the PR description. CC @andrzej-stencel

andrzej-stencel added the enhancement and needs triage labels Sep 27, 2024

github-actions bot added the pkg/stanza label Sep 27, 2024

andrzej-stencel mentioned this issue Sep 27, 2024

Filelog receiver loses logs even if persistence is set up #31074

Open

github-actions bot mentioned this issue Oct 1, 2024

Weekly Report: 2024-09-24 - 2024-10-01 #35498

Closed

andrzej-stencel removed the needs triage label Oct 1, 2024

andrzej-stencel changed the title ~~[pkg/stanza] Make Stanza adapter synchronuous~~ [pkg/stanza] Make Stanza adapter synchronous Oct 4, 2024

andrzej-stencel assigned bacherfl Oct 7, 2024

bacherfl linked a pull request Oct 8, 2024 that will close this issue

[pkg/stanza] make log emitter and entry conversion in adapter synchronous #35669

Open

andrzej-stencel changed the title ~~[pkg/stanza] Make Stanza adapter synchronous~~ [pkg/stanza] Make Stanza adapter more synchronous by removing channels and workers Oct 9, 2024
