Skip to content

Commit

Permalink
Initial checkin for type mapping removal transformer and experimentin…
Browse files Browse the repository at this point in the history
…g w/ Jinjava.

Major things to figure out: 1) how to deal with compositing parts of jinja templates (maybe it's just snippets vended as resources and a user writes their own top-level template) and 2) how to suppress reading any part of the json body when only the headers need to be transformed to prevent parsing the json/ndjson for every request through the system (probably minor though since most setups will probably be doing full-transforms on the vast majority of data flowing).

Signed-off-by: Greg Schohn <greg.schohn@gmail.com>
  • Loading branch information
gregschohn committed Nov 14, 2024
1 parent d7884e9 commit 21fb34b
Show file tree
Hide file tree
Showing 12 changed files with 532 additions and 0 deletions.
Original file line number Diff line number Diff line change
@@ -0,0 +1,124 @@
This transformer converts routes for various requests (see below) to indices that used
[multi-type mappings](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/mapping.html) (configured from ES 5.x
and earlier clusters) to work with newer versions of Elasticsearch and OpenSearch.

## Usage Prior to Elasticsearch 6

Let's start with a sample index definition (adapted from the Elasticsearch documentation) with two type mappings and
documents for each of them.
```
PUT activity
{
"mappings": {
"user": {
"properties": {
"name": { "type": "text" },
"user_name": { "type": "keyword" },
"email": { "type": "keyword" }
}
},
"post": {
"properties": {
"content": { "type": "text" },
"user_name": { "type": "keyword" },
"post_at": { "type": "date" }
}
}
}
}
PUT activity/user/someuser
{
"name": "Some User",
"user_name": "user",
"email": "user@example.com"
}
PUT activity/post/1
{
"user_name": "user",
"tweeted_at": "2024-11-13T00:00:00Z",
"content": "change is inevitable"
}
GET activity/post/_search
{
"query": {
"match": {
"user_name": "user"
}
}
}
```

## Routing data to new indices

The structure of the documents will need to change. Some options are to use separate indices, drop some of the types
to make an index single-purpose, or to create an index that's the union of all the types' fields.

With a simple mapping directive, we can define each of these three behaviors. The following yaml shows how to map
documents into two different indices named users and posts:
```
activity:
user: new_users
post: new_posts
```

To drop one, just leave it out:
```
activity:
user: only_users
```

To merge them together, use the same value:
```
activity:
user: any_activity
post: any_activity
```

Any indices that are NOT specified won't be modified - all additions, changes, and queries on those other indices not
specified at the root level will remain untouched. To remove ALL the activity for a given index, specify and empty
index at the top level
```
activity: {}
```

## Final Results

```
PUT any_activity
{
"mappings": {
"properties": {
"type": {
"type": "keyword"
},
"name": {
"type": "text"
},
"user_name": {
"type": "keyword"
},
"email": {
"type": "keyword"
},
"content": {
"type": "text"
},
"tweeted_at": {
"type": "date"
}
}
}
}
PUT any_activity/_doc/someuser
{
"name": "Some User",
"user_name": "user",
"email": "user@example.com"
}
```
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
plugins {
id 'io.freefair.lombok'
}

dependencies {
implementation project(':transformation:transformationPlugins:jsonMessageTransformers:jsonMessageTransformerInterface')

implementation group: 'com.hubspot.jinjava', name: 'jinjava', version: "2.7.3"

testImplementation project(':TrafficCapture:trafficReplayer')
testImplementation testFixtures(project(path: ':testHelperFixtures'))
testImplementation testFixtures(project(path: ':TrafficCapture:trafficReplayer'))

testImplementation group: 'com.google.guava', name: 'guava'
testImplementation group: 'org.junit.jupiter', name:'junit-jupiter-api'
testImplementation group: 'org.junit.jupiter', name:'junit-jupiter-params'
testImplementation group: 'org.slf4j', name: 'slf4j-api'
testRuntimeOnly group:'org.junit.jupiter', name:'junit-jupiter-engine'
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
package org.opensearch.migrations.transform;

import java.io.File;
import java.io.Reader;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.hubspot.jinjava.Jinjava;
import com.hubspot.jinjava.loader.FileLocator;
import lombok.SneakyThrows;

public class JinjavaTransformer implements IJsonTransformer {

protected final static ObjectMapper objectMapper = new ObjectMapper();
protected final String templateString;
protected final Jinjava jinjava;
protected final Function<Map<String, Object>, Map<String, Object>> wrapSourceAsContextConverter;

public JinjavaTransformer(String templateString,
Function<Map<String, Object>, Map<String, Object>> wrapSourceAsContextConverter) {
this(templateString, wrapSourceAsContextConverter, null);
}

public JinjavaTransformer(String templateString,
Function<Map<String, Object>, Map<String, Object>> wrapSourceAsContextConverter,
FileLocator fileLocator)
{
this.templateString = templateString;
this.wrapSourceAsContextConverter = wrapSourceAsContextConverter;
this.jinjava = new Jinjava();
this.jinjava.setResourceLocator(fileLocator);

}

@SneakyThrows
@Override
public Map<String, Object> transformJson(Map<String, Object> incomingJson) {
String resultStr = jinjava.render(templateString, wrapSourceAsContextConverter.apply(incomingJson));
return objectMapper.readValue(resultStr, Map.class);
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
org.opensearch.migrations.transform.JinJavaTransformer
Original file line number Diff line number Diff line change
@@ -0,0 +1,85 @@
package org.opensearch.migrations.transform;

import java.util.Map;

import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.Test;
import org.testcontainers.shaded.com.fasterxml.jackson.databind.ObjectMapper;

class JinjavaTransformerTest {

private final static String template = "" +
"{# First, parse the URI to check if it matches the pattern we want to transform #}\n" +
"{% set uri_parts = request.uri.split('/') %}\n" +
"{% set is_type_request = uri_parts | length == 2 %}\n" +
"{% set is_doc_request = uri_parts | length == 3 %}\n" +
"\n" +
"{# If this is a document request, check if we need to transform it based on mapping #}\n" +
"{% if is_doc_request and uri_parts[0] in index_mappings and uri_parts[1] in index_mappings[uri_parts[0]] %}\n" +
" {# This is a document request that needs transformation #}\n" +
" {\n" +
" \"verb\": \"{{ request.verb }}\",\n" +
" \"uri\": \"{{ index_mappings[uri_parts[0]][uri_parts[1]] }}/_doc/{{ uri_parts[2] }}\",\n" +
" \"body\": {{ request.body | tojson }}\n" +
" }\n" +
"{% elif is_type_request and uri_parts[0] in index_mappings %}\n" +
" {# This is an index creation request that needs transformation #}\n" +
" {\n" +
" \"verb\": \"{{ request.verb }}\",\n" +
" \"uri\": \"{{ index_mappings[uri_parts[0]][uri_parts[1]] }}\",\n" +
" \"body\": {\n" +
" \"mappings\": {\n" +
" \"properties\": {\n" +
" \"type\": {\n" +
" \"type\": \"keyword\"\n" +
" }\n" +
" {%- for type_name, type_props in request.body.mappings.items() %}\n" +
" {%- for prop_name, prop_def in type_props.properties.items() %}\n" +
" ,\n" +
" \"{{ prop_name }}\": {{ prop_def | tojson }}\n" +
" {%- endfor %}\n" +
" {%- endfor %}\n" +
" }\n" +
" }\n" +
" }\n" +
" }\n" +
"{% else %}\n" +
" {# Pass through any requests that don't match our transformation patterns #}\n" +
" {{ request | tojson }}\n" +
"{% endif %}";

private static JinjavaTransformer indexTypeMappingRewriter;
@BeforeAll
static void initialize() {
var indexMappings = Map.of(
"indexA", Map.of(
"type1", "indexA_1",
"type2", "indexA_2"),
"indexB", Map.of(
"type1", "indexB",
"type2", "indexB"),
"indexC", Map.of(
"type2", "indexC"));
indexTypeMappingRewriter = new JinjavaTransformer(template, request ->
Map.of("index_mappings", indexMappings,
"request", request));
}

@Test
public void test() throws Exception {
var testString =
"{\n" +
" \"verb\": \"PUT\",\n" +
" \"uri\": \"indexA/type2/someuser\",\n" +
" \"body\": {\n" +
" \"name\": \"Some User\",\n" +
" \"user_name\": \"user\",\n" +
" \"email\": \"user@example.com\"\n" +
" }\n" +
"}";
var objMapper = new ObjectMapper();
var resultObj = indexTypeMappingRewriter.transformJson(objMapper.readValue(testString, Map.class));
var resultStr = objMapper.writeValueAsString(resultObj);
System.out.println("resultStr = " + resultStr);
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
package org.opensearch.migrations.transform;

import java.io.Reader;
import java.util.Map;

/**
* This is a simple interface to convert a JSON object (String, Map, or Array) into another
* JSON object. Any changes to datastructures, nesting, order, etc should be intentional.
*/
public interface ITextTransformer {
Reader transformJson(Reader incomingText);
}
Loading

0 comments on commit 21fb34b

Please sign in to comment.