Improve flexibility over index.required_pipeline #49247
Comments
Pinging @elastic/es-core-features (:Core/Features/Ingest)
One very important aspect of this added flexibility is specifically to let the user add their own pipelines around stack-provided pipelines without having to modify the existing pipeline. I've been feeling that need for a while, and one way I've been thinking about this is to provide "before hooks" or "after hooks", where users can insert their own pipelines anywhere they need. When users are forced to modify pipelines provided by products in the stack -- like Beats modules -- they're signing up to re-apply their changes every time they upgrade the product. Or worse, they won't remember, and whatever they improved will be lost when they upgrade. Approaching this in a generic fashion like before/after hooks would let users work around not only provided stack pipelines, but also around their own team structure & areas of responsibility. Consider this example:
With this in mind, I think it would be great to offer the ability to hook before/after via the API call, and via the index setting.
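To make the before/after hook idea concrete, here is a minimal sketch in Python rather than in Elasticsearch's own pipeline syntax. Everything in it is hypothetical: `run_with_hooks`, the three pipeline functions, and the field names are illustrative only and do not correspond to an existing Elasticsearch API. The point is simply that the stack-provided pipeline stays untouched while user-owned pipelines wrap it.

```python
from typing import Callable, Dict, Iterable

Doc = Dict[str, object]
Pipeline = Callable[[Doc], Doc]


def stack_pipeline(doc: Doc) -> Doc:
    # Stands in for a stack-provided pipeline (for example from a Beats module).
    # The user never edits this one.
    return {**doc, "parsed": True}


def add_team_tag(doc: Doc) -> Doc:
    # Hypothetical user-owned "before" hook.
    return {**doc, "team": "ops"}


def drop_raw_message(doc: Doc) -> Doc:
    # Hypothetical user-owned "after" hook.
    return {key: value for key, value in doc.items() if key != "message"}


def run_with_hooks(
    doc: Doc,
    main: Pipeline,
    before: Iterable[Pipeline] = (),
    after: Iterable[Pipeline] = (),
) -> Doc:
    # Run the "before" hooks, then the untouched main pipeline, then the "after" hooks.
    for pipeline in (*before, main, *after):
        doc = pipeline(doc)
    return doc


result = run_with_hooks(
    {"message": "raw line"},
    stack_pipeline,
    before=[add_team_tag],
    after=[drop_raw_message],
)
print(result)
```

Because the hooks live outside `stack_pipeline`, upgrading the product that ships it would not overwrite the user's additions.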
Modifying a stack pipeline is possible but is nasty, as you can see here (scroll to "Ingest Node Pipeline").
Describe the feature:
Following up on #46847, there are a couple of cases where we want to ensure that a specific pipeline is run on any documents that are ingested into an index. For example, you may want to set the event.ingested timestamp or ensure that the name of the API Key used is present in the document.
At the same time, we want to give users the flexibility they currently have to use a pipeline of their choosing to process the incoming data. We have index.required_pipeline, but it doesn't come with the flexibility we'd like.
@skearns64 suggested:
If "append pipeline" means that Elasticsearch will automatically run the "append pipeline" on every indexed document after the pipeline specified with the request has been run, it sounds like the "append pipeline" option would solve the use-cases I'm familiar with.
I've not heard a compelling use case for "run first", but they could exist.
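The ordering described above can be sketched as a small Python function. This is an illustration of the proposed semantics only, not Elasticsearch code: `resolve_pipelines` is a hypothetical helper, `index.append_pipeline` is the setting name proposed in this issue rather than an existing one, and the pipeline names are made up. The sketch assumes the request's pipeline takes precedence over `index.default_pipeline`, and that the append pipeline runs last either way.

```python
from typing import Dict, List, Optional


def resolve_pipelines(
    request_pipeline: Optional[str],
    index_settings: Dict[str, str],
) -> List[str]:
    """Return pipeline names in the order they would run for one index request."""
    order: List[str] = []

    # The pipeline named on the request wins; otherwise use the index default, if set.
    first = request_pipeline or index_settings.get("index.default_pipeline")
    if first:
        order.append(first)

    # Proposed behavior: the append pipeline always runs after whatever ran first.
    appended = index_settings.get("index.append_pipeline")
    if appended:
        order.append(appended)

    return order


# Hypothetical settings and pipeline names, for illustration only.
settings = {
    "index.default_pipeline": "logs-default",
    "index.append_pipeline": "set-event-ingested",
}

print(resolve_pipelines("my-pipeline", settings))
print(resolve_pipelines(None, settings))
```

Under this reading the user keeps full control over the first pipeline, and the index owner still gets a guaranteed final step such as setting event.ingested.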
Some questions that come to mind:
index.default_pipeline and index.required_pipeline
index.required_pipeline has any use case that index.append_pipeline does not solve, but that could be due to lack of context on my part
cc @ruflin @webmat @clintongormley @jasontedor @bytebilly
(first Elasticsearch issue! 🎉 )