Add remove_by_pattern ingest processor #6295

Merged 12 commits on Jan 31, 2024
133 changes: 133 additions & 0 deletions _ingest-pipelines/processors/remove_by_pattern.md
@@ -0,0 +1,133 @@
---
layout: default
title: Remove_by_pattern
parent: Ingest processors
nav_order: 225
redirect_from:
- /api-reference/ingest-apis/processors/remove_by_pattern/
---

# Remove_by_pattern processor

The `remove_by_pattern` processor removes root-level fields from a document based on the specified wildcard patterns.

## Syntax

The following is the syntax for the `remove_by_pattern` processor:

```json
{
  "remove_by_pattern": {
    "field_pattern": "field_name_prefix*"
  }
}
```
{% include copy-curl.html %}

## Configuration parameters

The following table lists the required and optional parameters for the `remove_by_pattern` processor.

| Parameter | Required/Optional | Description |
|---|---|---|
| `field_pattern` | Optional | Removes fields that match the specified pattern. Metadata fields, such as `_index`, `_version`, `_version_type`, and `_id`, are ignored even if they match the pattern. This option supports only root-level fields in the document. |
| `exclude_field_pattern` | Optional | Removes fields that do not match the specified pattern. Metadata fields, such as `_index`, `_version`, `_version_type`, and `_id`, are ignored even if they do not match the pattern. This option supports only root-level fields in the document. The `field_pattern` and `exclude_field_pattern` options are mutually exclusive. |
| `description` | Optional | A brief description of the processor. |
| `if` | Optional | A condition for running the processor. |
| `ignore_failure` | Optional | Specifies whether the processor continues execution even if it encounters errors. If set to `true`, failures are ignored. Default is `false`. |
| `on_failure` | Optional | A list of processors to run if the processor fails. |
| `tag` | Optional | An identifier tag for the processor. Useful for debugging to distinguish between processors of the same type. |
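
As a minimal sketch of the `exclude_field_pattern` option, the following processor definition would remove every root-level field whose name does not start with `foo`. The pattern, `tag`, and `description` values are illustrative assumptions:

```json
{
  "remove_by_pattern": {
    "exclude_field_pattern": "foo*",
    "tag": "keep-foo-fields",
    "description": "Removes root-level fields that do not match foo*"
  }
}
```
{% include copy-curl.html %}

Given a document containing the fields `foo1`, `foo2`, and `bar`, this configuration would be expected to remove `bar` and keep `foo1` and `foo2`. Because the two pattern options are mutually exclusive, `field_pattern` is omitted.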

## Using the processor

Follow these steps to use the processor in a pipeline.

**Step 1: Create a pipeline**

The following query creates a pipeline named `remove_fields_by_pattern` that removes fields matching the pattern `foo*` from a document:

```json
PUT /_ingest/pipeline/remove_fields_by_pattern
{
  "description": "Pipeline that removes the fields by patterns.",
  "processors": [
    {
      "remove_by_pattern": {
        "field_pattern": "foo*"
      }
    }
  ]
}
```
{% include copy-curl.html %}

**Step 2 (Optional): Test the pipeline**

It is recommended that you test your pipeline before you ingest documents.
{: .tip}

To test the pipeline, run the following query:

```json
POST _ingest/pipeline/remove_fields_by_pattern/_simulate
{
  "docs": [
    {
      "_index": "testindex1",
      "_id": "1",
      "_source": {
        "foo1": "foo1",
        "foo2": "foo2",
        "bar": "bar"
      }
    }
  ]
}
```
{% include copy-curl.html %}

**Response**

The following example response confirms that the pipeline is working as expected:

```json
{
  "docs": [
    {
      "doc": {
        "_index": "testindex1",
        "_id": "1",
        "_source": {
          "bar": "bar"
        },
        "_ingest": {
          "timestamp": "2023-08-24T18:02:13.218986756Z"
        }
      }
    }
  ]
}
```
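
To try a processor configuration without creating a pipeline first, the Simulate Pipeline API also accepts an inline pipeline definition in the request body. The following sketch reuses the sample document and uses `exclude_field_pattern` purely for illustration. Based on the parameter description, it would be expected to keep `foo1` and `foo2` and remove `bar`:

```json
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "remove_by_pattern": {
          "exclude_field_pattern": "foo*"
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "testindex1",
      "_id": "1",
      "_source": {
        "foo1": "foo1",
        "foo2": "foo2",
        "bar": "bar"
      }
    }
  ]
}
```
{% include copy-curl.html %}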

**Step 3: Ingest a document**

The following query ingests a document into an index named `testindex1`:

```json
PUT testindex1/_doc/1?pipeline=remove_fields_by_pattern
{
  "foo1": "foo1",
  "foo2": "foo2",
  "bar": "bar"
}
```
{% include copy-curl.html %}

**Step 4 (Optional): Retrieve the document**

To retrieve the document, run the following query:

```json
GET testindex1/_doc/1
```
{% include copy-curl.html %}