Composable templates merge dynamic templates in reverse order #76702

Open
jacksmith15 opened this issue Aug 19, 2021 · 6 comments
Labels
>bug :Data Management/Indices APIs APIs to create and manage indices and templates Team:Data Management Meta label for data/management team

Comments

jacksmith15 commented Aug 19, 2021

Elasticsearch version 7.13.4 (testing using official docker image)

Plugins installed: []

JVM version (java -version): 16+36

OS version (uname -a if on a Unix-like system): Linux d1da479e9601 5.8.0-63-generic #71-Ubuntu SMP Tue Jul 13 15:59:12 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:

According to the documentation:

When multiple component templates are specified in the composed_of field for an index template, they are merged in the order specified, meaning that later component templates override earlier component templates. Any mappings, settings, or aliases from the parent index template are merged in next. Finally, any configuration on the index request itself is merged.

The following is also stated in the API documentation:

the last component template specified has the highest precedence

However, this is not true for dynamic templates: dynamic templates from higher-precedence component templates are appended to the end of the list, which is actually the lowest-precedence position, as described in the dynamic templates documentation:

Templates are processed in order — the first matching template wins.

This seems to occur because a generic dictionary merge function is used to resolve templates, rather than one that is aware of dynamic template semantics.
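
The effect can be illustrated with a minimal Python sketch. This is not the Elasticsearch implementation: `generic_merge` and `first_matching_template` are hypothetical names, and `fnmatchcase` only approximates the `match` wildcard semantics. It assumes a generic merge that concatenates lists, combined with first-match-wins resolution.

```python
from fnmatch import fnmatchcase


def generic_merge(base, override):
    """Recursively merge two dicts; lists in `override` are appended to lists in `base`."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = generic_merge(merged[key], value)
        elif isinstance(value, list) and isinstance(merged.get(key), list):
            # A generic merge has no notion of list-order precedence: it just appends.
            merged[key] = merged[key] + value
        else:
            merged[key] = value
    return merged


def first_matching_template(dynamic_templates, field_name):
    """Return the name of the first dynamic template whose `match` glob fits the field."""
    for entry in dynamic_templates:
        for name, spec in entry.items():
            if fnmatchcase(field_name, spec["match"]):
                return name
    return None


# Lower-precedence mappings (for example, from the index template).
base = {"dynamic_templates": [
    {"keyword": {"match": "prefix-*", "mapping": {"type": "keyword"}}},
]}
# Higher-precedence mappings (for example, from the index request).
override = {"dynamic_templates": [
    {"my_field": {"match": "prefix-longer-*",
                  "mapping": {"type": "keyword", "normalizer": "lowercase_normalizer"}}},
]}

merged = generic_merge(base, override)
winner = first_matching_template(merged["dynamic_templates"], "prefix-longer-abc")
print(winner)  # keyword
```

Because the higher-precedence entry lands at the end of the list, the broader `prefix-*` pattern matches first and the more specific template is never applied.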

Steps to reproduce:

For simplicity, I will show a single template with index mapping overrides, but the same is true for the relationship between multiple component templates, or a component template and an index template (happy to provide additional examples if required):

# Create a base template which maps keyword type from property name prefix
curl -XPUT "http://elasticsearch:9200/_index_template/base_template" -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["scope-*"],
  "template": {
    "mappings": {
      "dynamic_templates": [
        {
          "keyword": {
            "match": "prefix-*",
            "mapping": {
              "type": "keyword"
            }
          }
        }
      ]
    }
  }
}'

# Create a concrete index which adds a more specific dynamic template
curl -XPUT "http://elasticsearch:9200/scope-index-name" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index": {
      "analysis": {
        "normalizer": {
          "lowercase_normalizer": {
            "type": "custom",
            "char_filter": [],
            "filter": ["lowercase"]
          }
        }
      }
    }
  },
  "mappings": {
    "dynamic_templates": [
      {
        "my_field": {
          "match": "prefix-longer-*",
          "mapping": {
            "type": "keyword",
            "normalizer": "lowercase_normalizer"
          }
        }
      }
    ]
  }
}'

# Index a document with a field matching both dynamic templates
curl -XPUT "http://elasticsearch:9200/scope-index-name/_doc/123" -H 'Content-Type: application/json' -d'
{
  "prefix-longer-abc": "FooBar"
}'

# Get the field mapping generated by the document
curl -XGET "http://elasticsearch:9200/scope-index-name/_mapping/field/prefix-longer-abc"

The result is that the field uses the mapping from the lower precedence template:

{
  "scope-index-name" : {
    "mappings" : {
      "prefix-longer-abc" : {
        "full_name" : "prefix-longer-abc",
        "mapping" : {
          "prefix-longer-abc" : {
            "type" : "keyword"
          }
        }
      }
    }
  }
}

Examination of the index mappings shows that this is because the dynamic templates have been merged with the higher-precedence template last:

curl -XGET "http://elasticsearch:9200/scope-index-name/_mapping/"
{
  "scope-index-name" : {
    "mappings" : {
      "dynamic_templates" : [
        {
          "keyword" : {
            "match" : "prefix-*",
            "mapping" : {
              "type" : "keyword"
            }
          }
        },
        {
          "my_field" : {
            "match" : "prefix-longer-*",
            "mapping" : {
              "normalizer" : "lowercase_normalizer",
              "type" : "keyword"
            }
          }
        }
      ]
    }
  }
}

For context, the legacy template API produces the opposite result:

Legacy template example
# Create a base template which maps keyword type from property name prefix
curl -XPUT "http://elasticsearch:9200/_template/base_template" -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["scope-*"],
  "mappings": {
    "dynamic_templates": [
      {
        "keyword": {
          "match": "prefix-*",
          "mapping": {
            "type": "keyword"
          }
        }
      }
    ]
  }
}'

# Create a concrete index which adds a more specific dynamic template
curl -XPUT "http://elasticsearch:9200/scope-index-name" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index": {
      "analysis": {
        "normalizer": {
          "lowercase_normalizer": {
            "type": "custom",
            "char_filter": [],
            "filter": ["lowercase"]
          }
        }
      }
    }
  },
  "mappings": {
    "dynamic_templates": [
      {
        "my_field": {
          "match": "prefix-longer-*",
          "mapping": {
            "type": "keyword",
            "normalizer": "lowercase_normalizer"
          }
        }
      }
    ]
  }
}'

# Index a document with a field matching both dynamic templates
curl -XPUT "http://elasticsearch:9200/scope-index-name/_doc/123" -H 'Content-Type: application/json' -d'
{
  "prefix-longer-abc": "FooBar"
}'

# Get the field mapping generated by the document
curl -XGET "http://elasticsearch:9200/scope-index-name/_mapping/field/prefix-longer-abc"

This results in the field using the mapping from the higher precedence template:

{
  "scope-index-name" : {
    "mappings" : {
      "prefix-longer-abc" : {
        "full_name" : "prefix-longer-abc",
        "mapping" : {
          "prefix-longer-abc" : {
            "type" : "keyword",
            "normalizer" : "lowercase_normalizer"
          }
        }
      }
    }
  }
}

Additional notes

  • The merge logic was implemented in this PR; the merge itself is carried out by this function, which is called here.
@jacksmith15 jacksmith15 added >bug needs:triage Requires assignment of a team area label labels Aug 19, 2021
@pgomulka pgomulka added the :Search Foundations/Mapping Index mappings, including merging and defining field types label Aug 19, 2021
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Aug 19, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@pgomulka pgomulka removed the needs:triage Requires assignment of a team area label label Aug 19, 2021
@jtibshirani jtibshirani added the :Data Management/Indices APIs APIs to create and manage indices and templates label Aug 19, 2021
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Aug 19, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (Team:Core/Features)

@jtibshirani
Contributor

jtibshirani commented Sep 16, 2021

First, summarizing the problem since it's a bit tricky: dynamic templates are defined in a list, and when deciding which one to apply for a field, we always select the first matching dynamic template. When merging composable template mappings to produce the final mapping, we append the higher-order template's dynamic templates at the end of the list. This means that its dynamic templates actually have lower precedence than earlier templates, which goes against the idea that mappings in higher-order templates should always take precedence. With legacy templates, the merge strategy is less sophisticated, but it happens to add dynamic templates for the higher-order template at the start of the list.

To me this seems like a bug in composable template merging -- if higher-order mappings are supposed to take higher precedence, then their dynamic templates should be given higher precedence too. The fix would be to add the dynamic templates at the start of the list (instead of the end). Unfortunately this fix breaks existing behavior, so we'd need to think through how to introduce it. Maybe it'd be fine to just change the behavior in 8.0, avoiding any changes in 7.x. As part of the fix we could double-check the behavior for any other mapping components that are arrays.
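
As a rough illustration of the proposed ordering (a sketch only, not the actual Elasticsearch change), the snippet below prepends the higher-order template's dynamic templates and resolves a field with first-match-wins. The names `merge_dynamic_templates` and `resolve` are hypothetical, and `fnmatchcase` is an approximation of the `match` wildcard behavior.

```python
from fnmatch import fnmatchcase


def merge_dynamic_templates(lower, higher):
    """Place higher-precedence templates first so first-match-wins favours them."""
    return list(higher) + list(lower)


def resolve(dynamic_templates, field_name):
    """Return the name of the first dynamic template whose `match` glob fits the field."""
    for entry in dynamic_templates:
        for name, spec in entry.items():
            if fnmatchcase(field_name, spec["match"]):
                return name
    return None


lower = [{"keyword": {"match": "prefix-*"}}]            # earlier / lower-order template
higher = [{"my_field": {"match": "prefix-longer-*"}}]   # later / higher-order template

prepended = merge_dynamic_templates(lower, higher)
print(resolve(prepended, "prefix-longer-abc"))  # my_field
print(resolve(prepended, "prefix-other"))       # keyword
```

With this ordering the more specific higher-order pattern wins where it matches, while fields it does not match still fall through to the lower-order template.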

@jacksmith15
Author

Perhaps it would be good to update the docs as a minimum? At the moment they don't reflect the behaviour (and it certainly cost me some development time to work that out!).

@jtibshirani
Contributor

@jacksmith15 this is a good point, but I'm struggling to see where it's appropriate to document this... it would be as if we were documenting buggy behavior. In any case, I'll try to make some progress on this issue.

@javanna javanna removed :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team labels Jun 13, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)
