Add support for transformation of elasticsearch data document mappings #409

Tellan · 2020-04-07T00:37:16Z

Problem:

A problem that's been on our backlog for a long time is sorting results from elasticsearch. Keyword fields are the only fields that can be sorted on natively, and the standard analysis of the fields makes the distinction between "jacob" and "Jacob". Normal string sorting will sort capital letters first, then lowercase. Which means "jacob" will end up at the bottom of the list, miles away from "Jacob".

In order to fix this, the mapping on the index needs to be updated so that the field you sort on is normalized without case. With a keyword normalizer the value of the field remains as it was inputted, but when doing aggregations and sorts, the field uses its normalized version. This is a post that explains how to set up a normalizer to treat keyword values as lowercase:

https://discuss.elastic.co/t/case-insensitive-sort-doesnt-work/143192/7

Solution(s):

I considered a few ways to add this mapping into the moqui elastic facade, from most complicated, to most broad:

Either create entities or use a json string field to store extra mapping data for fields. There would also need to be a place to hold the "settings" field of the elasticsearch index creation
Code the normalizer into ElasticFacade and add a field to DataDocumentField to set the normalizer for that field.
Add a field to DataDocument for a manual mapping service that could transform the mapping before ElasticFacade creates the index.

I decided that to make this most re-usable and quick to implement, the third option would handle any possible case. If we decided to model field mappings later, we could also add that on top.

The service takes the normally generated mapping for the index and returns it with the same name and an optional settings map. The settings map gets added to the createIndex request. Unfortunately, in order to update analyzers and normalizers in the index settings, the index needs to be closed. We could code that, but it would case a little interruption in the index.

Let me know if you have any questions, suggestions, or want to see an example of a use case for this.

Through a new manualMappingServiceName field on DataDocument. This allows datadocuments be be freely configured with more advanced elasticsearch mappings.

jonesde · 2020-04-07T21:27:17Z

These changes look fine and can be merged as-is. The one addition I would do to it (feel free to or I can do it after the merge) is to add a service interface to clarify the in and out parameters supported (looks like mapping in, mapping and settings out, all type Map), just like for the manualDataServiceName as mentioned in the field description.

Before merging the one thing that is needed is for you to 'sign' the AUTHORS, both sections, with your name and github username. That is usually easiest to do in your fork with a new commit and then update this pull request.

…viceName

Tellan · 2020-04-07T21:49:07Z

Added the interface and signed AUTHORS. Called it org.moqui.EntityServies.transform#DocumentMapping but you're welcome to rename it.

jonesde · 2020-04-07T21:56:58Z

Changes merged.

On a side note sooner or later I want to explore better solutions for sorting results from ES and have considered something like Solution #2 where the framework adds both text and keyword type fields. I know this is buried in a random place, but there are some notes on it in the comments here:

https://github.com/moqui/SimpleScreens/blob/master/template/product/ProductTransitions.xml#L81

Add support for transformation of elasticsearch data document mappings

754dcc7

Through a new manualMappingServiceName field on DataDocument. This allows datadocuments be be freely configured with more advanced elasticsearch mappings.

Jacob Barnes added 2 commits April 7, 2020 14:46

Add interface describing parameters for DataDocument.manualMappingSer…

e09fcfb

…viceName

Sign AUTHORS

83232cf

jonesde merged commit a46a9cd into moqui:master Apr 7, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for transformation of elasticsearch data document mappings #409

Add support for transformation of elasticsearch data document mappings #409

Tellan commented Apr 7, 2020

jonesde commented Apr 7, 2020

Tellan commented Apr 7, 2020

jonesde commented Apr 7, 2020

Add support for transformation of elasticsearch data document mappings #409

Add support for transformation of elasticsearch data document mappings #409

Conversation

Tellan commented Apr 7, 2020

jonesde commented Apr 7, 2020

Tellan commented Apr 7, 2020

jonesde commented Apr 7, 2020