Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for transformation of elasticsearch data document mappings #409

Merged
merged 3 commits into from
Apr 7, 2020
Merged

Add support for transformation of elasticsearch data document mappings #409

merged 3 commits into from
Apr 7, 2020

Conversation

Tellan
Copy link

@Tellan Tellan commented Apr 7, 2020

Problem:

A problem that's been on our backlog for a long time is sorting results from elasticsearch. Keyword fields are the only fields that can be sorted on natively, and the standard analysis of the fields makes the distinction between "jacob" and "Jacob". Normal string sorting will sort capital letters first, then lowercase. Which means "jacob" will end up at the bottom of the list, miles away from "Jacob".

In order to fix this, the mapping on the index needs to be updated so that the field you sort on is normalized without case. With a keyword normalizer the value of the field remains as it was inputted, but when doing aggregations and sorts, the field uses its normalized version. This is a post that explains how to set up a normalizer to treat keyword values as lowercase:

https://discuss.elastic.co/t/case-insensitive-sort-doesnt-work/143192/7

Solution(s):

I considered a few ways to add this mapping into the moqui elastic facade, from most complicated, to most broad:

  1. Either create entities or use a json string field to store extra mapping data for fields. There would also need to be a place to hold the "settings" field of the elasticsearch index creation
  2. Code the normalizer into ElasticFacade and add a field to DataDocumentField to set the normalizer for that field.
  3. Add a field to DataDocument for a manual mapping service that could transform the mapping before ElasticFacade creates the index.

I decided that to make this most re-usable and quick to implement, the third option would handle any possible case. If we decided to model field mappings later, we could also add that on top.

The service takes the normally generated mapping for the index and returns it with the same name and an optional settings map. The settings map gets added to the createIndex request. Unfortunately, in order to update analyzers and normalizers in the index settings, the index needs to be closed. We could code that, but it would case a little interruption in the index.

Let me know if you have any questions, suggestions, or want to see an example of a use case for this.

Through a new manualMappingServiceName field on DataDocument. This
allows datadocuments be be freely configured with more advanced
elasticsearch mappings.
@jonesde
Copy link
Member

jonesde commented Apr 7, 2020

These changes look fine and can be merged as-is. The one addition I would do to it (feel free to or I can do it after the merge) is to add a service interface to clarify the in and out parameters supported (looks like mapping in, mapping and settings out, all type Map), just like for the manualDataServiceName as mentioned in the field description.

Before merging the one thing that is needed is for you to 'sign' the AUTHORS, both sections, with your name and github username. That is usually easiest to do in your fork with a new commit and then update this pull request.

@Tellan
Copy link
Author

Tellan commented Apr 7, 2020

Added the interface and signed AUTHORS. Called it org.moqui.EntityServies.transform#DocumentMapping but you're welcome to rename it.

@jonesde jonesde merged commit a46a9cd into moqui:master Apr 7, 2020
@jonesde
Copy link
Member

jonesde commented Apr 7, 2020

Changes merged.

On a side note sooner or later I want to explore better solutions for sorting results from ES and have considered something like Solution #2 where the framework adds both text and keyword type fields. I know this is buried in a random place, but there are some notes on it in the comments here:

https://github.com/moqui/SimpleScreens/blob/master/template/product/ProductTransitions.xml#L81

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants