Using benthos as a pipeline processor in conduit #4

Closed
nickchomey opened this issue May 23, 2024 · 2 comments

Comments

@nickchomey

nickchomey commented May 23, 2024

I'm relatively new to these sorts of tools, but it seems to me that Conduit and Benthos are somewhat redundant, as they are both stream processors. As such, it seems somewhat silly to use them together; better to choose one.

The main difference between them seems to be that Conduit is much more focused on CDC from data sources/stores/bases, while Benthos is much more focused on the actual pipeline processing/transformation: it has many dozens of processors while Conduit has only a handful. Benthos' processors also allow for enrichment via SQL queries, NATS KV, etc.

It seems to me that the universal/OpenCDC approach of Conduit is far more fundamental/important, since a pipeline ultimately needs to start from some data source, and Conduit should therefore be used as the main tool. But it would be a shame to not leverage the immense processing power of Benthos.

So, what I'm thinking is that rather than use Benthos as a Conduit Source/Destination, as was attempted in this repo, why not just embed its pipeline processors into Conduit as a Standalone Processor? It could have some sort of Benthos Bloblang mechanism for choosing the desired Benthos processors.
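To make the shape of that idea concrete, here is a minimal sketch in plain Go, using only the standard library. Everything in it is hypothetical: `Record`, `Step`, `ChainProcessor`, `UppercaseField` and `DropIfMissing` are illustrative names, not the Conduit processor SDK or the Benthos API. The `Step` functions merely stand in for embedded Benthos processors, and a real implementation would delegate to Benthos instead.

```go
package main

import (
	"fmt"
	"strings"
)

// Record is a hypothetical stand-in for a Conduit record. The real OpenCDC
// record carries more information than a key and a payload.
type Record struct {
	Key     string
	Payload map[string]any
}

// Step is a hypothetical stand-in for one embedded Benthos-style processor.
// Returning ok == false means the record is filtered out.
type Step func(Record) (out Record, ok bool, err error)

// ChainProcessor plays the role of the proposed Conduit processor: it owns
// an ordered list of steps and runs every record through them.
type ChainProcessor struct {
	steps []Step
}

// NewChainProcessor builds a processor from the given steps, in order.
func NewChainProcessor(steps ...Step) *ChainProcessor {
	return &ChainProcessor{steps: steps}
}

// Process runs rec through each step in order and stops at the first
// error or the first step that drops the record.
func (p *ChainProcessor) Process(rec Record) (Record, bool, error) {
	cur := rec
	for i, step := range p.steps {
		next, ok, err := step(cur)
		if err != nil {
			return Record{}, false, fmt.Errorf("step %d: %w", i, err)
		}
		if !ok {
			return Record{}, false, nil
		}
		cur = next
	}
	return cur, true, nil
}

// copyRecord returns a copy so that steps never mutate their input.
func copyRecord(r Record) Record {
	payload := make(map[string]any, len(r.Payload))
	for k, v := range r.Payload {
		payload[k] = v
	}
	return Record{Key: r.Key, Payload: payload}
}

// UppercaseField returns a step that upper-cases a string field, if present.
func UppercaseField(field string) Step {
	return func(r Record) (Record, bool, error) {
		v, present := r.Payload[field]
		if !present {
			return r, true, nil
		}
		s, isString := v.(string)
		if !isString {
			return Record{}, false, fmt.Errorf("field %q is not a string", field)
		}
		out := copyRecord(r)
		out.Payload[field] = strings.ToUpper(s)
		return out, true, nil
	}
}

// DropIfMissing returns a step that filters out records lacking a field.
func DropIfMissing(field string) Step {
	return func(r Record) (Record, bool, error) {
		_, present := r.Payload[field]
		return r, present, nil
	}
}

func main() {
	p := NewChainProcessor(DropIfMissing("name"), UppercaseField("name"))
	out, ok, err := p.Process(Record{Key: "1", Payload: map[string]any{"name": "ada"}})
	fmt.Println(out.Payload["name"], ok, err) // prints: ADA true <nil>
}
```

The point of the sketch is only the division of labour: Conduit would keep ownership of sources, destinations and records, while the processor is a thin adapter that hands each record to a configurable chain of transformations.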

This would allow Conduit to focus on its strength of CDC, while leveraging Benthos' strength in stream processing. You could, of course, always make other custom processors in Go or JavaScript to suit needs (or probably even use existing Benthos custom processors).

It's a topic that has been brought up various times in Benthos' GitHub and Discord, and it's generally answered with the following links:

Apparently this can be used to embed Benthos into a golang app/binary
https://pkg.go.dev/github.com/benthosdev/benthos/v4/public/service#example-package-StreamBuilderConfig

One more example of that API here: redpanda-data/connect#1727 (comment)

And here's a repo that apparently has relevant examples
https://github.com/benthosdev/benthos-plugin-example

I'm new to golang, but I think I'd like to try to figure this out in the next couple weeks as a way of getting my feet wet. If I can get some guidance on it, I see no reason why it couldn't be achieved.

Thoughts?

@lovromazgon

Thanks for opening this, it's an idea worth discussing. This just isn't the correct place, since it's about adding a processor to Conduit itself - do you mind moving this into a discussion on the main Conduit repo? 🙏

Quick link: https://github.com/ConduitIO/conduit/discussions/new?category=ideas

I'll take my time to respond there, but I'll say this much - it's not trivial to do this as a standalone processor, because of the WASI limitations.

Meanwhile I'll close this issue so all further comments are redirected to the correct place.

@github-project-automation github-project-automation bot moved this from Triage to Done in Conduit Main May 23, 2024
@nickchomey

Thanks for the prompt attention! Here's the new discussion ConduitIO/conduit#1614
