Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed bug in mapping parsing; added tests #1023

Merged
merged 2 commits into from
Sep 27, 2024

Conversation

chelma
Copy link
Member

@chelma chelma commented Sep 27, 2024

Description

  • Fixed a bug reported by a Cx; their mapping list in the snapshot was empty for a specific index, which broke our parsing code. The only way this can occur (AFAIK) is if they have an index with no mappings and no documents added to it. As soon as you add a document Elasticsearch adds a mapping to the index.

Issues Resolved

  • N/A

Testing

  • Added unit tests, including a new snapshot to exercise the code path
  • Manually ran the MM tool against a local dockerized Target Cluster using that new test snapshot:
Command line arguments: migrate --snapshot-name rfs-snapshot --file-system-repo-path /Users/chelma/workspace/migration-assistant/opensearch-migrations/RFS/test-resources/snapshots/ES_7_10_BWC_Check --target-host http://localhost:29200 --source-version ES_7_10 --min-replicas 1 --component-template-allowlist fwc_settings,fwc_mappings --index-template-allowlist fwc_template

Starting Metadata Migration
Clusters:
   Source:
      Snapshot: ELASTICSEARCH 7.10.0 FileSystemRepo(repoRootDir=/Users/chelma/workspace/migration-assistant/opensearch-migrations/RFS/test-resources/snapshots/ES_7_10_BWC_Check)

   Target:
      Remote Cluster: OPENSEARCH 2.11.1 ConnectionContext(uri=http://localhost:29200, protocol=HTTP, insecure=false, compressionSupported=false)


Migrated Items:
   Index Templates:
      fwc_template

   Component Templates:
      fwc_mappings, fwc_settings

   Indexes:
      bwc_index_1, empty_mappings_no_docs, fwc_index_1, no_mappings_no_docs

   Aliases:
      bwc_alias, fwc_alias


Results:
   0 issue(s) detected

Check List

  • New functionality includes testing
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Chris Helma <chelma+github@amazon.com>
Copy link

codecov bot commented Sep 27, 2024

Codecov Report

Attention: Patch coverage is 75.00000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 80.19%. Comparing base (03dfed0) to head (97ca601).
Report is 4 commits behind head on main.

Files with missing lines Patch % Lines
...ions/bulkload/transformers/TransformFunctions.java 75.00% 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1023      +/-   ##
============================================
- Coverage     80.20%   80.19%   -0.02%     
  Complexity     2722     2722              
============================================
  Files           365      365              
  Lines         13603    13606       +3     
  Branches        941      942       +1     
============================================
+ Hits          10910    10911       +1     
- Misses         2118     2119       +1     
- Partials        575      576       +1     
Flag Coverage Δ
gradle-test 78.20% <75.00%> (-0.02%) ⬇️
python-test 89.91% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: Chris Helma <chelma+github@amazon.com>
@AndreKurait AndreKurait merged commit d0d78b0 into opensearch-project:main Sep 27, 2024
13 of 14 checks passed
@@ -75,17 +76,21 @@ public static void removeIntermediateMappingsLevels(ObjectNode root) {
* - [{"doc":{"properties":{"address":{"type":"text"}}}}]
* - [{"audit_message":{"properties":{"address":{"type":"text"}}}}]
*
* It's not impossible that the intermediate key is not present, in which case we should just extract the mappings:
* It's also possible for this list to be empty, in which case we should set the mappings to an empty node.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it appropriate to replace [] with "[{}]". Is this true for all versions of targets?

Copy link
Member Author

@chelma chelma Sep 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This results in an index creation call that looks like:

curl -X PUT "localhost:92011/index_name" -H "Content-Type: application/json" -d '
{
  "settings": {
        <the settings>
  },
  "mappings": {

  }
}'

I'm not sure if the empty mappings blob is valid for ALL possible targets, but it does appear valid for the targets we currently consider - OS 1.X & OS 2.X

Comment on lines +18 to +28
ObjectNode testNode1 = new ObjectMapper().createObjectNode();
ArrayNode mappingsNode1 = new ObjectMapper().createArrayNode();
ObjectNode docNode1 = new ObjectMapper().createObjectNode();
ObjectNode propertiesNode1 = new ObjectMapper().createObjectNode();
ObjectNode addressNode1 = new ObjectMapper().createObjectNode();
addressNode1.put("type", "text");
propertiesNode1.set("address", addressNode1);
docNode1.set(TransformFunctions.PROPERTIES_KEY_STR, propertiesNode1);
ObjectNode intermediateNode1 = new ObjectMapper().createObjectNode();
intermediateNode1.set("_doc", docNode1);
mappingsNode1.add(intermediateNode1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you replace this w/ inline json that gets parsed?

  var json = 
      "{" +
      "  \"mappings\": [{" +
      "    \"_doc\": {" +
      ...

It's a lot easier to read and to maintain. You can copy-paste literal json from a text editor into your IDE and it should do all of the necessary escaping (IntelliJ does this, I'd hope eclipse & vscode do too).

Then you can just new ObjectMapper().readTree(json) and get the parsed JsonNode back.

Comment on lines +435 to +457
curl -X PUT "localhost:19200/no_mappings_no_docs" -H "Content-Type: application/json" -d '
{
"settings": {
"index": {
"number_of_shards": 1,
"number_of_replicas": 0
}
}
}'

curl -X PUT "localhost:19200/empty_mappings_no_docs" -H "Content-Type: application/json" -d '
{
"settings": {
"index": {
"number_of_shards": 1,
"number_of_replicas": 0
}
},
"mappings": {
"properties":{
}
}
}'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love to see us generate these on-the-fly and then cache them like we do for preloaded data images. In fact, use the preloaded images, then just take a snapshot via a helper program. Make the output of that something that gets cached as a build artifact and then it can become a dependency to the tests. It seems like that could be 1-2 hours of work and removes the risks of many hours of work trying to figure out if there was a mismatch later on.

@chelma chelma deleted the empty_mappings branch October 23, 2024 17:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants