
feat(processors.batch): Add batch processor #15869

Merged
merged 7 commits into influxdata:master from LarsStegman:feat/processor-batch
Sep 30, 2024

Conversation

LarsStegman
Contributor

@LarsStegman LarsStegman commented Sep 11, 2024

Summary

This new processor can distribute metrics across batches by adding a tag indicating what batch number it is in. This makes it possible to distribute the load of a high number of metrics across multiple instances of the same output plugin.
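A minimal configuration sketch of the idea, using the option names that come up later in this thread (`batch_tag`, `batches`); the merged plugin's README and sample.conf are the authoritative reference:

```toml
[[processors.batch]]
  ## Name of the tag the processor adds to each metric; its value is the
  ## batch index the metric was assigned to (0 .. batches-1, round-robin).
  batch_tag = "my_batch"

  ## Number of batches to distribute metrics across.
  batches = 16
```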

Checklist

  • No AI generated code was used in this PR

Related issues

resolves #15621
resolves #11707

@telegraf-tiger telegraf-tiger bot added feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin plugin/processor labels Sep 11, 2024
@LarsStegman
Contributor Author

@srebhan the implementation is a little different from what you suggested. I did not see the added benefit of also specifying the batch size, since any overflow would probably just spill into the next batch, which then also overflows. Please let me know if it should still be added.

Member

@srebhan srebhan left a comment


Thanks @LarsStegman for the contribution! Just two small comments from my side. Furthermore, should we also add a force_rebatch option that will only overwrite the batch tag if it does not already exist? I'm asking because in the current default, Telegraf will run each processor twice, once before and once after the aggregators (if any).

Review comments on plugins/processors/batch/batch.go and plugins/processors/batch/README.md (outdated, resolved)
@srebhan srebhan self-assigned this Sep 11, 2024
LarsStegman and others added 2 commits September 11, 2024 13:29
Co-authored-by: Sven Rebhan <36194019+srebhan@users.noreply.github.com>
@LarsStegman
Contributor Author

should we also add a force_rebatch option that will only overwrite the batch tag if it does not already exist? I'm asking because in the current default, Telegraf will run each processor twice, once before and once after the aggregators (if any).

Hmmm, interesting. The results after the second pass will indeed be different, because the processor will already have run once and the count will have increased. I think it is indeed better to add that feature; it will be more predictable for users.

@LarsStegman
Contributor Author

I enabled rebatching by default, because it incurs less computational load: by default the processor will not check for existing tags.
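A sketch of the behavior described above; the flag name `skip_existing` is an assumption here, since the exact spelling was still under discussion at this point:

```toml
[[processors.batch]]
  batch_tag = "my_batch"
  batches = 16

  ## Assumed flag name: when true, metrics already carrying the batch tag
  ## keep their assignment on a second processor pass. The default (false)
  ## means existing tags are not checked and are simply overwritten, which
  ## is computationally cheaper.
  # skip_existing = false
```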

Member

@srebhan srebhan left a comment


Two more comments. Regarding the flag, I'm fine either way but slightly lean toward your approach...

Review comments on plugins/processors/batch/sample.conf and plugins/processors/batch/batch_test.go (outdated, resolved)
@srebhan srebhan changed the title feat(processors.batch): create batch processor feat(processors.batch): Add batch processor Sep 11, 2024
@LarsStegman

This comment was marked as outdated.

Member

@srebhan srebhan left a comment


@LarsStegman awesome! Maybe just avoid abbreviations in config options? How about naming this just batches?

Review comment on plugins/processors/batch/sample.conf (outdated, resolved)
@LarsStegman
Contributor Author

@srebhan looks like the test runner timed out or something

Member

@srebhan srebhan left a comment


Thanks @LarsStegman!

@telegraf-tiger
Contributor

Download PR build artifacts for linux_amd64.tar.gz, darwin_arm64.tar.gz, and windows_amd64.zip.

🥳 This pull request decreases the Telegraf binary size by 8.01 % for linux amd64 (new size: 239.8 MB, nightly size: 260.6 MB)


@srebhan srebhan added the ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. label Sep 13, 2024
@srebhan srebhan assigned DStrand1 and unassigned srebhan Sep 13, 2024
@DStrand1 DStrand1 merged commit 338282b into influxdata:master Sep 30, 2024
27 checks passed
@github-actions github-actions bot added this to the v1.33.0 milestone Sep 30, 2024
@knollet
Contributor

knollet commented Oct 1, 2024

But to parallelize processing, this doesn't help create parallel instances of, say, some regex processor, right? It only does the round-robin tagging.

Also: does this help as efficiently as possible? With this, all the metrics still go down the same pipeline (and thus the same Go channel) instead of being split into multiple batch-processing pipelines. Every metric still has to be filtered by a metricpass/tagpass deciding "no, this metric isn't batch-tagged for me" on every processor defined.

I have to say: I don't see this resolving either of the linked feature requests.
It just casts a really bad duct-tape solution made from Starlark into an equally bad duct-tape solution made from Go.

There shouldn't be batching tags; instead:

  1. The slow processors should be spawnable in parallel.
  2. The distribution across these parallel processor instances should be realized by multiple Go channel consumers.

@LarsStegman LarsStegman deleted the feat/processor-batch branch October 1, 2024 11:34
@LarsStegman
Contributor Author

@knollet I agree, this processor does not fix #11707. @srebhan can that issue be reopened? I see you added it to this PR.

@knollet
Contributor

knollet commented Oct 1, 2024

I mean, if I batch into, let's say, 2 batches, I have to duplicate my complex code...

[[processors.batch]]
  batch_tag="batch"
  batches = 2
  
[[processors.regex]]
  tagpass = { "batch" = [ "0" ] }  # "0", not 0... srsly?
  ...some slow regex stuff...
  
# I have to replicate this here exactly, no mistakes allowed, batches - 1 times,
# else there's not even a resemblance of parallel processing. Hard-to-catch bugs,
# even if you generate this with a templating engine like Jinja.
[[processors.regex]]
  tagpass = { "batch" = [ "1" ] }  # "1", not 1... srsly?
  ...some slow regex stuff...

@LarsStegman
Contributor Author

LarsStegman commented Oct 1, 2024

@knollet this processor was not meant to increase the efficiency of the processor pipeline, but to increase the output capacity at the end of the pipeline. See #15621 (comment) and the discussion below it.
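To illustrate that use case, a hedged sketch of fanning metrics out across two instances of the same output plugin via the batch tag (the output plugin choice, URLs, and option values are placeholders, not part of the PR):

```toml
[[processors.batch]]
  batch_tag = "batch"
  batches = 2

## Each output instance only accepts metrics from "its" batch,
## so the write load is split across the two instances.
[[outputs.influxdb_v2]]
  urls = ["http://instance-a:8086"]
  [outputs.influxdb_v2.tagpass]
    batch = ["0"]

[[outputs.influxdb_v2]]
  urls = ["http://instance-b:8086"]
  [outputs.influxdb_v2.tagpass]
    batch = ["1"]
```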

@knollet
Contributor

knollet commented Oct 1, 2024

Yeah, ok. I don't want to trash-talk your contribution.
Still: don't you have the problem of having to duplicate whatever your batch processor feeds into? Whether that is a processor or an output plugin doesn't really matter, does it?

@srebhan
Member

srebhan commented Oct 1, 2024

@knollet I reopened #11707.

asaharn pushed a commit to asaharn/telegraf that referenced this pull request Oct 16, 2024