Add support for value aliases/data deduplication #39758

Closed
vbohata opened this issue Mar 6, 2019 · 6 comments
Labels
>feature :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch

Comments

@vbohata

vbohata commented Mar 6, 2019

We often need to rename fields while keeping the old field names around for some time. This will get much worse as ECS becomes more widely used.
Sometimes this can be solved with aliases, and sometimes it cannot: for example with "dynamic data", or when it has to be done in an already existing index without reindexing or creating a new one. In these situations we copy the fields in Logstash, which results in bigger indices with duplicated data.

Elasticsearch could provide a feature similar to hard links in a filesystem: allow links to other fields to be specified for each individual document. Indexing would then look like field1:"somevalue", field2:linkto:field1. Or it could be fully automatic: if there are 4 fields with the same data type and exactly the same value, store the value just once. This feature would help solve many of the problems we are facing.

@polyfractal
Contributor

I think you can accomplish this with the new(ish) alias data type: https://www.elastic.co/guide/en/elasticsearch/reference/master/alias.html

These are essentially field aliases that can be added to an existing mapping, pointing at an existing field. Is that what you're thinking?

Support was added in 6.4 (#32172)
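
For reference, a minimal sketch of what such an alias mapping could look like. It is written as Python that only builds the request body; nothing is sent to a cluster. The field names (`src_ip`, `source_ip`) and the helper `alias_mapping` are made up for illustration. The `alias` type and its `path` parameter are the ones described in the docs linked above.

```python
import json


def alias_mapping(alias_name: str, target_field: str) -> dict:
    """Build a mapping body that declares `alias_name` as an alias of `target_field`."""
    return {
        "properties": {
            alias_name: {"type": "alias", "path": target_field}
        }
    }


# Hypothetical names: `src_ip` is the existing field, `source_ip` is the new name.
body = alias_mapping("source_ip", "src_ip")

# This dict would be the body of a mapping update on the index;
# the exact request URL depends on the Elasticsearch version.
print(json.dumps(body, indent=2))
```

Note that the alias is declared once in the mapping, so it applies to every document in the index.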

@polyfractal polyfractal added feedback_needed :Search Foundations/Mapping Index mappings, including merging and defining field types labels Mar 6, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-search

@vbohata
Author

vbohata commented Mar 6, 2019

The alias data type works at the template/index level, not at the document level.

@mayya-sharipova
Contributor

mayya-sharipova commented Mar 7, 2019

@vbohata Lucene (the underlying storage engine) organizes data by fields. So there is a disk block corresponding to one field for all documents, and another disk block corresponding to another field for all documents. We can't tell Lucene to read the data for the same field from one block or another depending on the document number.
So, if I understood you correctly, what you are asking can't be done.
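
To make the per-field layout a bit more concrete, here is a toy model in Python. It is only an illustration of the idea described above, not Lucene's actual data structures or API; the class `ToyFieldStore` and its methods are hypothetical.

```python
from collections import defaultdict


class ToyFieldStore:
    """Toy per-field store: one column of values per field name."""

    def __init__(self):
        # field name -> {doc id -> value}
        self.columns = defaultdict(dict)

    def index(self, doc_id, doc):
        # Each field's value lands in that field's own column.
        for field, value in doc.items():
            self.columns[field][doc_id] = value

    def read(self, field, doc_id):
        # A read looks only at the column of the requested field.
        return self.columns[field].get(doc_id)


store = ToyFieldStore()
store.index(0, {"field1": "somevalue", "field2": "somevalue"})  # value stored twice
store.index(1, {"field1": "other"})                              # no field2 here

print(store.read("field2", 0))  # somevalue
print(store.read("field2", 1))  # None
```

In this model a per-document link from field2 to field1 would mean that `read("field2", doc_id)` has to consult the field1 column for some documents and the field2 column for others, which is exactly the per-document switch described above. An index-level alias, by contrast, redirects the whole field name for all documents.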

@vbohata
Author

vbohata commented Apr 7, 2019

OK. So for this feature to be implemented, there would have to be either support in Lucene or some kind of intermediate layer in Elasticsearch (somefield in ES -> somefield_1, somefield_2, somefield_dedup in Lucene). That way it would be configured globally per index but applied independently to each document.

@jtibshirani
Contributor

jtibshirani commented Apr 10, 2019

As @mayya-sharipova pointed out, with the way data is organized this feature would be difficult to fit in the current architecture. The recommended approach in your set-up is either to duplicate the data (as you are doing with Logstash), or to perform a reindex and use field aliases for backwards compatibility.
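
A rough sketch of the second option (reindex, then alias the old name), again as Python that only builds request bodies and does not talk to a cluster. The index names, field names and both helper functions are hypothetical; the body shape follows the reindex API with a Painless script that renames a field, and the alias mapping uses the `alias` field type mentioned earlier in this thread.

```python
import json


def rename_reindex_body(source_index, dest_index, old_field, new_field):
    """Body for a reindex request that renames `old_field` to `new_field`."""
    script = f"ctx._source['{new_field}'] = ctx._source.remove('{old_field}')"
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "script": {"lang": "painless", "source": script},
    }


def backcompat_alias_mapping(old_field, new_field):
    """Mapping for the new index: the old name becomes an alias of the new field."""
    return {"properties": {old_field: {"type": "alias", "path": new_field}}}


# Hypothetical names, for illustration only.
reindex_body = rename_reindex_body("logs-old", "logs-new", "src_ip", "source_ip")
alias_body = backcompat_alias_mapping("src_ip", "source_ip")

print(json.dumps(reindex_body, indent=2))
print(json.dumps(alias_body, indent=2))
```

With this approach each value is stored once under the new name, and queries that still use the old name go through the alias.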

Thanks @vbohata for your suggestion -- if there is more interest in this request or something changes in our thinking, we can re-open the issue.

@javanna javanna added the Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch label Jul 16, 2024