Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Field aliases implementation tracking #31372

Closed
9 tasks done
jtibshirani opened this issue Jun 15, 2018 · 12 comments
Closed
9 tasks done

Field aliases implementation tracking #31372

jtibshirani opened this issue Jun 15, 2018 · 12 comments
Assignees
Labels
Meta :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch

Comments

@jtibshirani
Copy link
Contributor

jtibshirani commented Jun 15, 2018

Main issue: #23714
Feature branch: field-aliases
Documentation: https://github.com/elastic/elasticsearch/blob/master/docs/reference/mapping/types/alias.asciidoc

Summary of technical approach:

  • Introduce a new field mapping, with type: alias. This mapping has largely the same structure and handling as a standard field mapping. In the code it corresponds to a new top-level Mapper type called FieldAliasMapper, that does not inherit from FieldMapper.
  • Resolve field aliases used in search and field capabilities APIs to their concrete fields before executing the requests. In general, this resolution occurs as the request is being translated to lucene. At a code level, we make sure to keep track of aliases in MapperService so that when field aliases are supplied to MapperService#fullName, the correct concrete MappedFieldType is returned. This concrete field type is then used in creating queries, fetching doc values, etc. This PR attempts to remove other ways of accessing a field's MappedFieldType that wouldn't handle aliases, as in DocumentFieldMappers#getMapper.

TODO:

@jtibshirani jtibshirani self-assigned this Jun 15, 2018
@jtibshirani jtibshirani added the :Search Foundations/Mapping Index mappings, including merging and defining field types label Jun 15, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search-aggs

@jtibshirani
Copy link
Contributor Author

jtibshirani commented Jun 21, 2018

I put down some initial thoughts around field alias support in other contexts besides search requests + field capabilities. @jpountz I'm curious as to your opinions, and also whether I'm missing any important APIs.

  • Retrieving field mappings (index/_mapping/_doc/field/...): supported. In particular, the mappings for the field alias would be retrieved (type: alias, path: ...).
  • Using aliases when filtering the fetched _source: not supported. This doesn't make much sense conceptually, as field aliases will never be referenced in a document's source, and would also be challenging to implement in a performant way.
  • Clearing indices caches (index/_cache/clear?fields=...): not sure. This could also be reasonable to support, although it's not clear how useful it would be.
  • Fetching term vectors (_termvectors): not sure. My sense is that it could be reasonable to support, but I also don't have a great understanding of the common use cases.

For clearing caches and term vectors, my current thought is that in this initial version we should focus on search requests (and field capabilities), and can wait for feedback as to whether it's valuable to add support for aliases in other APIs. In general, we won't commit to supporting aliases in all APIs where a field name can be supplied, but explicitly focus on search requests (and closely related APIs like field capabilities) where they are critical for certain functionality.

@colings86
Copy link
Contributor

Retrieving field mappings (index/_mapping/_doc/field/...): supported. In particular, the mappings for the field alias would be retrieved (type: alias, path: ...).

I don't think we should do this one as I think people using this API should be shown the actual mapping of the field (i.e. the fact its an alias field and what it points to) rather than resolving the alias. This fits with the standard mapping API which also does not resolve the alias for alias fields.

Using aliases when filtering the fetched _source: not supported. This doesn't make much sense conceptually, as field aliases will never be referenced in a document's source, and would also be challenging to implement in a performant way.

I agree

Clearing indices caches (index/_cache/clear?fields=...): not sure. This could also be reasonable to support, although it's not clear how useful it would be.

IMO we should error on alias fields here and require the user to clear the cache explicitly on the concrete field. This API is really for admins to use and they should know the concrete field since clearing the cache is usually something that would be done in response to an issue being debugged

Fetching term vectors (_termvectors): not sure. My sense is that it could be reasonable to support, but I also don't have a great understanding of the common use cases.

We may have requests for alias support here later down the line but I agree that we don't need to implement this for the first version

@jpountz
Copy link
Contributor

jpountz commented Jun 22, 2018

@jtibshirani Agreed with your initial thoughts, I added some comments below.

Retrieving field mappings (index/_mapping/_doc/field/...): supported. In particular, the mappings for the field alias would be retrieved (type: alias, path: ...).

I read this as "the get field mappings API will work and return the mapping of the alias" rather than resolving the alias, but @colings86 's comment suggests he is reading it the other way around. I think it should not try to resolve the alias indeed. My reasoning is that mappings not only have search-time implications but also index-time implications. If a field is shown as mapped as long, then it would suggest that it is fine to feed it with long values, which is not correct if the field is an alias.

Field aliases can be used in the capabilities API.

Field capabilities are only about read-only capabilities of fields (on the contrary to mappings) so it makes sense to resolve aliases. This is probably a requirement for Kibana to work transparently with aliases as well.

Clearing indices caches

Agreed with @colings86: this is an admin operation, let's not resolve aliases.

@jtibshirani
Copy link
Contributor Author

jtibshirani commented Jun 22, 2018

Thank you @colings86 and @jpountz for the thoughts! I've updated the issue description.

To clarify my comment around retrieving field mappings: I indeed meant to say 'the get field mappings API will work and return the mapping of the alias'. I agree that it does not make sense to resolve the alias here, but I think users should be able to see the alias field mapping.

@jpountz
Copy link
Contributor

jpountz commented Jun 25, 2018

I agree that it does not make sense to resolve the alias here, but I think users should be able to see the alias field mapping.

👍

@jaymode
Copy link
Member

jaymode commented Jun 29, 2018

Verify that field aliases interact correctly with field-level security.

We discussed this during this weeks security team meeting. If a user has access to the underlying field then the field alias should also be seen. Given that field level security works at the lucene level, field level security execution will not care about the field aliases or perform any filtering based on a alias.

Where things could get messy is how field aliases play with the field filter used in the mapper registry. It would be odd for field level security to be in use and a user can see a field and its alias when getting the document, but they would not be able to see the alias in the mappings api and the other apis that this filter applies to.

@jtibshirani
Copy link
Contributor Author

jtibshirani commented Jun 29, 2018

Thanks @jaymode for the update. To summarize my understanding from your comment/ watching the meeting recording:

  • A user will only be able to set security on a concrete field, not an alias. This will be sufficient for enforcing security during search requests, since aliases are always translated to concrete fields before calling into lucene, and field level security is performed at the lucene level.
  • When executing a get mappings request, we currently filter out fields that the user can't access. As it currently stands, we wouldn't filter out field aliases, even if the user couldn't see the underlying field. This could result in inconsistency, as for example the _field_caps endpoint would return capabilities when provided with an alias, but not a concrete field.

I'll give some thought as to how to address the last point. It'd be useful to know for context: is the existence of a field considered sensitive information? If not, is its mapping/ schema information sensitive, or is the field mappings filter mostly for better user experience?

@jaymode
Copy link
Member

jaymode commented Jul 3, 2018

@jtibshirani You're understanding is correct.

It'd be useful to know for context: is the existence of a field considered sensitive information? If not, is its mapping/ schema information sensitive, or is the field mappings filter mostly for better user experience?

The best thing would be to not allow the knowledge of the existence of a field to exist and if that isn't possible not knowing the mapping would be better than nothing. Ultimately we want our FLS implementation to prevent the user from knowing that a given field exists in a document. The field mappings filter is a relatively new thing and prior to that users with access to certain APIs could still find out about fields that they did not have access to, so it is not the end of the world if this is not a problem that can be solved.

@jtibshirani
Copy link
Contributor Author

👍 thanks!

@jtibshirani
Copy link
Contributor Author

An update on field-level security: I uploaded a PR for consideration (#31807).

@jtibshirani
Copy link
Contributor Author

This feature branch was merged into master in #32172.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Meta :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch
Projects
None yet
Development

No branches or pull requests

6 participants