Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed bug in mapping parsing; added tests #1023

Merged
merged 2 commits into from
Sep 27, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
public class TransformFunctions {
private static final ObjectMapper mapper = new ObjectMapper();
public static final String MAPPINGS_KEY_STR = "mappings";
public static final String PROPERTIES_KEY_STR = "properties";
public static final String SETTINGS_KEY_STR = "settings";
public static final String NUMBER_OF_REPLICAS_KEY_STR = "number_of_replicas";
public static final String INDEX_KEY_STR = "index";
Expand Down Expand Up @@ -75,17 +76,21 @@ public static void removeIntermediateMappingsLevels(ObjectNode root) {
* - [{"doc":{"properties":{"address":{"type":"text"}}}}]
* - [{"audit_message":{"properties":{"address":{"type":"text"}}}}]
*
* It's not impossible that the intermediate key is not present, in which case we should just extract the mappings:
* It's also possible for this list to be empty, in which case we should set the mappings to an empty node.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it appropriate to replace [] with "[{}]". Is this true for all versions of targets?

Copy link
Member Author

@chelma chelma Sep 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This results in an index creation call that looks like:

curl -X PUT "localhost:92011/index_name" -H "Content-Type: application/json" -d '
{
  "settings": {
        <the settings>
  },
  "mappings": {

  }
}'

I'm not sure if the empty mappings blob is valid for ALL possible targets, but it does appear valid for the targets we currently consider - OS 1.X & OS 2.X

*
* Finally, it may be possible that the intermediate key is not present, in which case we should just extract the mappings:
* - [{"properties":{"address":{"type":"text"}}}]
*/
if (val instanceof ArrayNode) {
ArrayNode mappingsList = (ArrayNode) val;
if (mappingsList.size() != 1) {
if (mappingsList.size() > 1) {
throw new IllegalArgumentException("Mappings list contains more than one member; this is unexpected: " + val.toString());
} else if (mappingsList.size() == 1) {
ObjectNode actualMappingsRoot = (ObjectNode) mappingsList.get(0);
root.set(MAPPINGS_KEY_STR, getMappingsFromBeneathIntermediate(actualMappingsRoot));
} else {
root.set(MAPPINGS_KEY_STR, mapper.createObjectNode());
}
ObjectNode actualMappingsRoot = (ObjectNode) mappingsList.get(0);

root.set(MAPPINGS_KEY_STR, getMappingsFromBeneathIntermediate(actualMappingsRoot));
}

/**
Expand Down Expand Up @@ -113,9 +118,9 @@ else if (val instanceof ObjectNode) {
* it regardless of what it is named.
*/
public static ObjectNode getMappingsFromBeneathIntermediate(ObjectNode mappingsRoot) {
if (mappingsRoot.has("properties")) {
if (mappingsRoot.has(PROPERTIES_KEY_STR)) {
return mappingsRoot;
} else if (!mappingsRoot.has("properties")) {
} else if (!mappingsRoot.has(PROPERTIES_KEY_STR)) {
return (ObjectNode) mappingsRoot.get(mappingsRoot.fieldNames().next()).deepCopy();
}

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
package org.opensearch.migrations.bulkload.transformers;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ArrayNode;
import com.fasterxml.jackson.databind.node.ObjectNode;
import org.junit.jupiter.api.Test;

import lombok.extern.slf4j.Slf4j;

import static org.junit.jupiter.api.Assertions.assertEquals;

@Slf4j
public class TransformFunctionsTest {

@Test
public void removeIntermediateMappingsLevels_AsExpected() throws Exception {
// Extract from {"mappings":[{"_doc":{"properties":{"address":{"type":"text"}}}}])
ObjectNode testNode1 = new ObjectMapper().createObjectNode();
ArrayNode mappingsNode1 = new ObjectMapper().createArrayNode();
ObjectNode docNode1 = new ObjectMapper().createObjectNode();
ObjectNode propertiesNode1 = new ObjectMapper().createObjectNode();
ObjectNode addressNode1 = new ObjectMapper().createObjectNode();
addressNode1.put("type", "text");
propertiesNode1.set("address", addressNode1);
docNode1.set(TransformFunctions.PROPERTIES_KEY_STR, propertiesNode1);
ObjectNode intermediateNode1 = new ObjectMapper().createObjectNode();
intermediateNode1.set("_doc", docNode1);
mappingsNode1.add(intermediateNode1);
Comment on lines +18 to +28
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you replace this w/ inline json that gets parsed?

  var json = 
      "{" +
      "  \"mappings\": [{" +
      "    \"_doc\": {" +
      ...

It's a lot easier to read and to maintain. You can copy-paste literal json from a text editor into your IDE and it should do all of the necessary escaping (IntelliJ does this, I'd hope eclipse & vscode do too).

Then you can just new ObjectMapper().readTree(json) and get the parsed JsonNode back.

testNode1.set(TransformFunctions.MAPPINGS_KEY_STR, mappingsNode1);

TransformFunctions.removeIntermediateMappingsLevels(testNode1);
assertEquals(docNode1.toString(), testNode1.get(TransformFunctions.MAPPINGS_KEY_STR).toString());

// Extract from {"mappings":[{"properties":{"address":{"type":"text"}}}])
ObjectNode testNode2 = new ObjectMapper().createObjectNode();
ArrayNode mappingsNode2 = new ObjectMapper().createArrayNode();
ObjectNode propertiesNode2 = new ObjectMapper().createObjectNode();
ObjectNode addressNode2 = new ObjectMapper().createObjectNode();
addressNode2.put("type", "text");
propertiesNode2.set("address", addressNode2);
ObjectNode intermediateNode2 = new ObjectMapper().createObjectNode();
intermediateNode2.set(TransformFunctions.PROPERTIES_KEY_STR, propertiesNode2);
mappingsNode2.add(intermediateNode2);
testNode2.set(TransformFunctions.MAPPINGS_KEY_STR, mappingsNode2);

TransformFunctions.removeIntermediateMappingsLevels(testNode2);
assertEquals(intermediateNode2.toString(), testNode2.get(TransformFunctions.MAPPINGS_KEY_STR).toString());

// Extract from {"mappings":[])
ObjectNode testNode3 = new ObjectMapper().createObjectNode();
ArrayNode mappingsNode3 = new ObjectMapper().createArrayNode();
testNode3.set(TransformFunctions.MAPPINGS_KEY_STR, mappingsNode3);

TransformFunctions.removeIntermediateMappingsLevels(testNode3);
assertEquals(new ObjectMapper().createObjectNode().toString(), testNode3.get(TransformFunctions.MAPPINGS_KEY_STR).toString());
}

@Test
public void getMappingsFromBeneathIntermediate_AsExpected() throws Exception {
// Extract from {"_doc":{"properties":{"address":{"type":"text"}}}}
ObjectNode testNode1 = new ObjectMapper().createObjectNode();
ObjectNode docNode1 = new ObjectMapper().createObjectNode();
ObjectNode propertiesNode1 = new ObjectMapper().createObjectNode();
ObjectNode addressNode1 = new ObjectMapper().createObjectNode();
addressNode1.put("type", "text");
propertiesNode1.set("address", addressNode1);
docNode1.set(TransformFunctions.PROPERTIES_KEY_STR, propertiesNode1);
testNode1.set("_doc", docNode1);

ObjectNode result1 = TransformFunctions.getMappingsFromBeneathIntermediate(testNode1);
assertEquals(docNode1.toString(), result1.toString());

// Extract from {"arbitrary_type":{"properties":{"address":{"type":"text"}}}}
ObjectNode testNode2 = new ObjectMapper().createObjectNode();
ObjectNode docNode2 = new ObjectMapper().createObjectNode();
ObjectNode propertiesNode2 = new ObjectMapper().createObjectNode();
ObjectNode addressNode2 = new ObjectMapper().createObjectNode();
addressNode2.put("type", "text");
propertiesNode2.set("address", addressNode2);
docNode2.set(TransformFunctions.PROPERTIES_KEY_STR, propertiesNode2);
testNode2.set("arbitrary_type", docNode2);

ObjectNode result2 = TransformFunctions.getMappingsFromBeneathIntermediate(testNode2);
assertEquals(docNode2.toString(), result2.toString());

// Extract from {"properties":{"address":{"type":"text"}}
ObjectNode testNode3 = new ObjectMapper().createObjectNode();
ObjectNode propertiesNode3 = new ObjectMapper().createObjectNode();
ObjectNode addressNode3 = new ObjectMapper().createObjectNode();
addressNode3.put("type", "text");
propertiesNode3.set("address", addressNode3);
testNode3.set(TransformFunctions.PROPERTIES_KEY_STR, propertiesNode3);

ObjectNode result3 = TransformFunctions.getMappingsFromBeneathIntermediate(testNode3);
assertEquals(testNode3.toString(), result3.toString());
}

}
Original file line number Diff line number Diff line change
Expand Up @@ -59,10 +59,22 @@ public void transformIndexMetadata_AsExpected() throws Exception {
IndexMetadata transformedIndexFwc = transformer.transformIndexMetadata(indexMetadataFwc);
IndexMetadataData_OS_2_11 finalIndexFwc =new IndexMetadataData_OS_2_11(transformedIndexFwc.getRawJson(), transformedIndexFwc.getId(), transformedIndexFwc.getName());

String expectedIndexBwc = "{\"version\":3,\"mapping_version\":1,\"settings_version\":1,\"aliases_version\":1,\"routing_num_shards\":1024,\"state\":\"open\",\"settings\":{\"creation_date\":\"1727285787498\",\"number_of_replicas\":1,\"number_of_shards\":\"1\",\"provided_name\":\"bwc_index_1\",\"uuid\":\"P4PDS4fFSECIprHxpKSoiQ\",\"version\":{\"created\":\"7100299\"}},\"mappings\":{\"properties\":{\"content\":{\"type\":\"text\"},\"title\":{\"type\":\"text\"}}},\"aliases\":{\"bwc_alias\":{}},\"primary_terms\":[1],\"in_sync_allocations\":{\"0\":[\"0M-gMXkNQFC02lnWJN4BVQ\"]},\"rollover_info\":{},\"system\":false}";
String expectedIndexFwc = "{\"version\":3,\"mapping_version\":1,\"settings_version\":1,\"aliases_version\":1,\"routing_num_shards\":1024,\"state\":\"open\",\"settings\":{\"creation_date\":\"1727285787822\",\"number_of_replicas\":1,\"number_of_shards\":\"1\",\"provided_name\":\"fwc_index_1\",\"uuid\":\"riC1gLlrR2eh_uAh9Zdpiw\",\"version\":{\"created\":\"7100299\"}},\"mappings\":{\"properties\":{\"content\":{\"type\":\"text\"},\"title\":{\"type\":\"text\"}}},\"aliases\":{\"fwc_alias\":{}},\"primary_terms\":[1],\"in_sync_allocations\":{\"0\":[\"8sKrVCEaSxy4yP3cx8kcYQ\"]},\"rollover_info\":{},\"system\":false}";
IndexMetadata indexMetadataNoMappingNoDocs = sourceResourceProvider.getIndexMetadata().fromRepo(snapshot.name, "no_mappings_no_docs");
IndexMetadata transformedIndexNoMappingNoDocs = transformer.transformIndexMetadata(indexMetadataNoMappingNoDocs);
IndexMetadataData_OS_2_11 finalIndexNoMappingNoDocs = new IndexMetadataData_OS_2_11(transformedIndexNoMappingNoDocs.getRawJson(), transformedIndexNoMappingNoDocs.getId(), transformedIndexNoMappingNoDocs.getName());

IndexMetadata indexMetadataEmptyMappingNoDocs = sourceResourceProvider.getIndexMetadata().fromRepo(snapshot.name, "empty_mappings_no_docs");
IndexMetadata transformedIndexEmptyMappingNoDocs = transformer.transformIndexMetadata(indexMetadataEmptyMappingNoDocs);
IndexMetadataData_OS_2_11 finalIndexEmptyMappingNoDocs = new IndexMetadataData_OS_2_11(transformedIndexEmptyMappingNoDocs.getRawJson(), transformedIndexEmptyMappingNoDocs.getId(), transformedIndexEmptyMappingNoDocs.getName());

String expectedIndexBwc = "{\"version\":3,\"mapping_version\":1,\"settings_version\":1,\"aliases_version\":1,\"routing_num_shards\":1024,\"state\":\"open\",\"settings\":{\"creation_date\":\"1727459371883\",\"number_of_replicas\":1,\"number_of_shards\":\"1\",\"provided_name\":\"bwc_index_1\",\"uuid\":\"tBmFXxGhTeiDlznQiKfNCA\",\"version\":{\"created\":\"7100299\"}},\"mappings\":{\"properties\":{\"content\":{\"type\":\"text\"},\"title\":{\"type\":\"text\"}}},\"aliases\":{\"bwc_alias\":{}},\"primary_terms\":[1],\"in_sync_allocations\":{\"0\":[\"jSYEePXYTka3EJ0vvdPGJA\"]},\"rollover_info\":{},\"system\":false}";
String expectedIndexFwc = "{\"version\":3,\"mapping_version\":1,\"settings_version\":1,\"aliases_version\":1,\"routing_num_shards\":1024,\"state\":\"open\",\"settings\":{\"creation_date\":\"1727459372123\",\"number_of_replicas\":1,\"number_of_shards\":\"1\",\"provided_name\":\"fwc_index_1\",\"uuid\":\"BI3xMNgnRe6HldftNpfxzQ\",\"version\":{\"created\":\"7100299\"}},\"mappings\":{\"properties\":{\"content\":{\"type\":\"text\"},\"title\":{\"type\":\"text\"}}},\"aliases\":{\"fwc_alias\":{}},\"primary_terms\":[1],\"in_sync_allocations\":{\"0\":[\"SvHX-4xPTZWt0sYMTiFGYw\"]},\"rollover_info\":{},\"system\":false}";
String expectedIndexNoMappingNoDocs = "{\"version\":3,\"mapping_version\":1,\"settings_version\":1,\"aliases_version\":1,\"routing_num_shards\":1024,\"state\":\"open\",\"settings\":{\"creation_date\":\"1727459372205\",\"number_of_replicas\":1,\"number_of_shards\":\"1\",\"provided_name\":\"no_mappings_no_docs\",\"uuid\":\"S0lxKX7vSLS7jzquAnNzdg\",\"version\":{\"created\":\"7100299\"}},\"mappings\":{},\"aliases\":{},\"primary_terms\":[1],\"in_sync_allocations\":{\"0\":[\"McFl7a21QZKeB7HLHo6eMQ\"]},\"rollover_info\":{},\"system\":false}";
String expectedIndexEmptyMappingNoDocs = "{\"version\":3,\"mapping_version\":1,\"settings_version\":1,\"aliases_version\":1,\"routing_num_shards\":1024,\"state\":\"open\",\"settings\":{\"creation_date\":\"1727459372264\",\"number_of_replicas\":1,\"number_of_shards\":\"1\",\"provided_name\":\"empty_mappings_no_docs\",\"uuid\":\"ndSIWVDSQpuC8xY2DieX7g\",\"version\":{\"created\":\"7100299\"}},\"mappings\":{},\"aliases\":{},\"primary_terms\":[1],\"in_sync_allocations\":{\"0\":[\"yr1xkacYT42MjPGjAzQhHg\"]},\"rollover_info\":{},\"system\":false}";

assertEquals(expectedIndexBwc, finalIndexBwc.getRawJson().toString());
assertEquals(expectedIndexFwc, finalIndexFwc.getRawJson().toString());
assertEquals(expectedIndexNoMappingNoDocs, finalIndexNoMappingNoDocs.getRawJson().toString());
assertEquals(expectedIndexEmptyMappingNoDocs, finalIndexEmptyMappingNoDocs.getRawJson().toString());
}
}
31 changes: 27 additions & 4 deletions RFS/test-resources/inventory.md
Original file line number Diff line number Diff line change
Expand Up @@ -345,7 +345,7 @@ curl -X PUT "localhost:19200/test_updates_deletes" -H "Content-Type: application
```

#### ES_7_10_BWC_Check
An Elastic 7.10 snapshot repo designed to exercise backwards compatibility across a couple different features. It contains two indices; one is created using pre-ES 7.8 templates and pre-ES 7.0 type declarations, and the other is created using forward-compatible index/component templates and no type declarations. Both indices contain a single document.
An Elastic 7.10 snapshot repo designed to exercise backwards compatibility across a couple different features. It contains two indices; one is created using pre-ES 7.8 templates and pre-ES 7.0 type declarations, and the other is created using forward-compatible index/component templates and no type declarations. Both indices contain a single document. It also contains two indices with either no mappings or an empty mapping entry and no documents; this exercises an edge case where there will be a mappings entry in the snapshot with no contents.

```
curl -X PUT "localhost:19200/_template/bwc_template?include_type_name=true" -H "Content-Type: application/json" -d '
Expand Down Expand Up @@ -426,12 +426,36 @@ curl -X PUT "localhost:19200/_index_template/fwc_template" -H "Content-Type: app

curl -X PUT "localhost:19200/fwc_index_1" -H "Content-Type: application/json"

curl -X PUT "localhost:19200/fwc_alias/_doc/bwc_doc" -H "Content-Type: application/json" -d '
curl -X PUT "localhost:19200/fwc_alias/_doc/fwc_doc" -H "Content-Type: application/json" -d '
{
"title": "This is a doc in a forward compatible index",
"content": "Life, the Universe, and Everything"
}'

curl -X PUT "localhost:19200/no_mappings_no_docs" -H "Content-Type: application/json" -d '
{
"settings": {
"index": {
"number_of_shards": 1,
"number_of_replicas": 0
}
}
}'

curl -X PUT "localhost:19200/empty_mappings_no_docs" -H "Content-Type: application/json" -d '
{
"settings": {
"index": {
"number_of_shards": 1,
"number_of_replicas": 0
}
},
"mappings": {
"properties":{
}
}
}'
Comment on lines +435 to +457
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love to see us generate these on-the-fly and then cache them like we do for preloaded data images. In fact, use the preloaded images, then just take a snapshot via a helper program. Make the output of that something that gets cached as a build artifact and then it can become a dependency to the tests. It seems like that could be 1-2 hours of work and removes the risks of many hours of work trying to figure out if there was a mismatch later on.


curl -X PUT "localhost:19200/_snapshot/test_repository" -H "Content-Type: application/json" -d '{
"type": "fs",
"settings": {
Expand All @@ -441,9 +465,8 @@ curl -X PUT "localhost:19200/_snapshot/test_repository" -H "Content-Type: applic
}'

curl -X PUT "localhost:19200/_snapshot/test_repository/rfs-snapshot" -H "Content-Type: application/json" -d '{
"indices": "bwc_index_1,fwc_index_1",
"indices": "bwc_index_1,fwc_index_1,no_mappings_no_docs,empty_mappings_no_docs",
"ignore_unavailable": true,
"include_global_state": true
}'

```
2 changes: 1 addition & 1 deletion RFS/test-resources/snapshots/ES_7_10_BWC_Check/index-0
Original file line number Diff line number Diff line change
@@ -1 +1 @@
{"snapshots":[{"name":"rfs-snapshot","uuid":"rNzQqxl5Qrm99ZzB44Yz0w","state":1,"index_metadata_lookup":{"rtO3hIcQREynKwRVke75sQ":"P4PDS4fFSECIprHxpKSoiQ-_na_-1-1-1","6f5_qLyfQu-MKqe5IEJGuQ":"riC1gLlrR2eh_uAh9Zdpiw-_na_-1-1-1"},"version":"7.10.2"}],"indices":{"bwc_index_1":{"id":"rtO3hIcQREynKwRVke75sQ","snapshots":["rNzQqxl5Qrm99ZzB44Yz0w"],"shard_generations":["OSYnb8w3SKmTqQ_p7m84JA"]},"fwc_index_1":{"id":"6f5_qLyfQu-MKqe5IEJGuQ","snapshots":["rNzQqxl5Qrm99ZzB44Yz0w"],"shard_generations":["73ZKrS4iTc6nKzY0E4kC1w"]}},"min_version":"7.9.0","index_metadata_identifiers":{"riC1gLlrR2eh_uAh9Zdpiw-_na_-1-1-1":"AzJBKpIBKBXpMe1tfd-g","P4PDS4fFSECIprHxpKSoiQ-_na_-1-1-1":"AjJBKpIBKBXpMe1tfd-g"}}
{"snapshots":[{"name":"rfs-snapshot","uuid":"KhcpVj8aRMek0oLMUSPHeg","state":1,"index_metadata_lookup":{"zUFSBaZgSdKCPAS2xEfCww":"ndSIWVDSQpuC8xY2DieX7g-_na_-1-1-1","s_FgCb4gREmddVPRW-EtWw":"BI3xMNgnRe6HldftNpfxzQ-_na_-1-1-1","a5ONDmS2RmmN1GEZMRCJRg":"S0lxKX7vSLS7jzquAnNzdg-_na_-1-1-1","0edrmuSPR1CIr2B6BZbMJA":"tBmFXxGhTeiDlznQiKfNCA-_na_-1-1-1"},"version":"7.10.2"}],"indices":{"empty_mappings_no_docs":{"id":"zUFSBaZgSdKCPAS2xEfCww","snapshots":["KhcpVj8aRMek0oLMUSPHeg"],"shard_generations":["C7dK_wyqSUuDyKkefOXUUQ"]},"bwc_index_1":{"id":"0edrmuSPR1CIr2B6BZbMJA","snapshots":["KhcpVj8aRMek0oLMUSPHeg"],"shard_generations":["tGRrEPUKTuWK_eyviXSHvg"]},"fwc_index_1":{"id":"s_FgCb4gREmddVPRW-EtWw","snapshots":["KhcpVj8aRMek0oLMUSPHeg"],"shard_generations":["yaxW_gAlTNqirWlIIoil8A"]},"no_mappings_no_docs":{"id":"a5ONDmS2RmmN1GEZMRCJRg","snapshots":["KhcpVj8aRMek0oLMUSPHeg"],"shard_generations":["cbs2bGwwSy66JeQOO9vNbQ"]}},"min_version":"7.9.0","index_metadata_identifiers":{"ndSIWVDSQpuC8xY2DieX7g-_na_-1-1-1":"3liaNJIBxXZFvc_xLY-F","S0lxKX7vSLS7jzquAnNzdg-_na_-1-1-1":"31iaNJIBxXZFvc_xLY-F","BI3xMNgnRe6HldftNpfxzQ-_na_-1-1-1":"3ViaNJIBxXZFvc_xLY-F","tBmFXxGhTeiDlznQiKfNCA-_na_-1-1-1":"4FiaNJIBxXZFvc_xLY-F"}}
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.