Make it easier to deprecate analysis components #42349

Closed
romseygeek opened this issue May 22, 2019 · 1 comment · Fixed by #50908
Labels: >refactoring, :Search Relevance/Analysis, Team:Search Relevance

Comments

@romseygeek
Contributor

We have a number of token filters that we should deprecate in favour of newer functionality. For example, keyword_repeat should be replaced by multiplexer, and shingle and edgengram by the index_phrases and index_prefixes options. Marking these as deprecated is currently made difficult by the way that preconfigured components are built.

Ideally, we should issue deprecation warnings when component factories are created. However, because all preconfigured factories are constructed up-front by the AnalysisRegistry, a deprecation warning on, e.g. keyword_repeat will be emitted for every new index mapping, whether or not that mapping refers to keyword_repeat.

We should change AnalysisRegistry to only build component factories when they are explicitly specified in mappings; as well as making for better deprecations, this should also allow us to save some memory by reducing the number of unused factories built per-index.
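To illustrate the idea, here is a minimal sketch of a registry that builds factories only on first lookup, so a deprecation warning fires only for components that are actually requested. All names here (`FilterFactory`, `LazyFactoryRegistry`, `register`, `get`) are hypothetical stand-ins for illustration and are not the real AnalysisRegistry API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a token filter factory; not the Elasticsearch type.
interface FilterFactory {
    String name();
}

// Hypothetical registry: stores suppliers up front, builds factories lazily.
final class LazyFactoryRegistry {
    private final Map<String, Supplier<FilterFactory>> suppliers = new HashMap<>();
    private final Map<String, FilterFactory> built = new HashMap<>();
    private final List<String> warnings = new ArrayList<>();

    // Registers a component. A non-null replacement marks it as deprecated.
    void register(String name, String replacement) {
        suppliers.put(name, () -> {
            // The warning is recorded when the factory is built, not when it is registered.
            if (replacement != null) {
                warnings.add("[" + name + "] is deprecated, use [" + replacement + "] instead");
            }
            return () -> name;
        });
    }

    // Builds the factory on first request and caches it for later lookups.
    FilterFactory get(String name) {
        Supplier<FilterFactory> supplier = suppliers.get(name);
        if (supplier == null) {
            throw new IllegalArgumentException("unknown filter [" + name + "]");
        }
        return built.computeIfAbsent(name, n -> supplier.get());
    }

    int builtCount() {
        return built.size();
    }

    List<String> warnings() {
        return new ArrayList<>(warnings);
    }
}

final class LazyRegistryDemo {
    public static void main(String[] args) {
        LazyFactoryRegistry registry = new LazyFactoryRegistry();
        registry.register("lowercase", null);
        registry.register("keyword_repeat", "multiplexer");

        // A mapping that only refers to lowercase builds one factory and sees no warning.
        registry.get("lowercase");
        System.out.println(registry.builtCount() + " built, warnings: " + registry.warnings());

        // Only an explicit reference to keyword_repeat triggers its warning.
        registry.get("keyword_repeat");
        System.out.println(registry.builtCount() + " built, warnings: " + registry.warnings());
    }
}
```

In this sketch, registering a component costs one map entry holding a supplier. The factory object and its warning only exist once a mapping asks for the component by name, and repeated lookups reuse the cached factory without warning again.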

romseygeek added the :Search Relevance/Analysis and >refactoring labels May 22, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-search

@romseygeek romseygeek self-assigned this May 22, 2019
romseygeek added a commit that referenced this issue Jan 14, 2020
Generally speaking, deprecated analysis components in Elasticsearch will issue deprecation
warnings when they are first used. However, this means that no warnings are emitted when
indexes are created with deprecated components, and users have to actually index a document
to see warnings. This makes it much harder to see these warnings and act on them at
appropriate times.

This is worse in the case where components throw exceptions on upgrade. In this case, users
will not be aware of a problem until a document is indexed, instead of at index creation time.

This commit adds a new check that pushes an empty string through all user-defined analyzers
and normalizers when an IndexAnalyzers object is built for each index; deprecation warnings
and exceptions are now emitted when indexes are created or opened.

Fixes #42349
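The check described in the commit message can be sketched roughly as follows: every analyzer is run once over an empty string while the per-index set is being built, so that anything a component does on first use (warn or throw) happens at index creation instead of at first document. `TextAnalyzer` and `AnalyzerChecks` are hypothetical names chosen for this illustration; the actual change works on the analyzers and normalizers held by IndexAnalyzers, and its code is not reproduced here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an analyzer: turns text into a list of tokens.
interface TextAnalyzer {
    List<String> analyze(String text);
}

final class AnalyzerChecks {
    private AnalyzerChecks() {}

    // Pushes an empty string through each analyzer so that first-use
    // warnings and exceptions surface while the index is being set up.
    static void checkAll(Map<String, TextAnalyzer> analyzers) {
        for (Map.Entry<String, TextAnalyzer> entry : analyzers.entrySet()) {
            try {
                entry.getValue().analyze("");
            } catch (RuntimeException ex) {
                // Name the failing analyzer so the error is actionable at creation time.
                throw new IllegalArgumentException(
                    "analyzer [" + entry.getKey() + "] failed validation: " + ex.getMessage(), ex);
            }
        }
    }
}

final class AnalyzerCheckDemo {
    public static void main(String[] args) {
        List<String> warnings = new ArrayList<>();
        Map<String, TextAnalyzer> analyzers = new LinkedHashMap<>();

        // An analyzer with nothing to report.
        analyzers.put("plain", text -> Collections.emptyList());

        // An analyzer whose deprecated component warns when it is first used.
        analyzers.put("legacy", text -> {
            warnings.add("analyzer [legacy] uses a deprecated component");
            return Collections.emptyList();
        });

        // No document has been indexed, yet the warning is already recorded.
        AnalyzerChecks.checkAll(analyzers);
        System.out.println(warnings);

        // A component that throws on use now fails the check up front.
        analyzers.put("removed", text -> {
            throw new IllegalStateException("component no longer supported");
        });
        try {
            AnalyzerChecks.checkAll(analyzers);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

An empty input is enough here because the goal is only to force each component to be instantiated and invoked once. No tokens need to come out for the warning or exception to be triggered.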
SivagurunathanV pushed a commit to SivagurunathanV/elasticsearch that referenced this issue Jan 23, 2020
javanna added the Team:Search Relevance label Jul 16, 2024