-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NullPointerException in ElisionFilter on _analyze #43002
Labels
>bug
:Search Relevance/Analysis
How text is split into tokens
Team:Search Relevance
Meta label for the Search Relevance team in Elasticsearch
Comments
Pinging @elastic/es-search |
Elision filter needs an 'articles' setting to work properly, which is missing from the |
romseygeek
added a commit
that referenced
this issue
Jun 27, 2019
We should throw an exception at construction time if a list of articles is not provided, otherwise we can get random NPEs during indexing. Relates to #43002
romseygeek
added a commit
that referenced
this issue
Jun 27, 2019
When a named token filter or char filter is passed as part of an Analyze API request with no index, we currently try and build the relevant filter using no index settings. However, this can miss cases where there is a pre-configured filter defined in the analysis registry. One example here is the elision filter, which has a pre-configured version built with the french elision set; when used as part of normal analysis, this preconfigured set is used, but when used as part of the Analyze API we end up with NPEs because it tries to instantiate the filter with no index settings. This commit changes the Analyze API to check for pre-configured filters in the case that the request has no index defined, and is using a name rather than a custom definition for a filter. It also changes the pre-configured `word_delimiter_graph` filter and `edge_ngram` tokenizer to make their settings consistent with the defaults used when creating them with no settings Closes #43002 Closes #43621 Closes #43582
romseygeek
added a commit
that referenced
this issue
Jun 27, 2019
We should throw an exception at construction time if a list of articles is not provided, otherwise we can get random NPEs during indexing. Relates to #43002
romseygeek
added a commit
that referenced
this issue
Jun 27, 2019
When a named token filter or char filter is passed as part of an Analyze API request with no index, we currently try and build the relevant filter using no index settings. However, this can miss cases where there is a pre-configured filter defined in the analysis registry. One example here is the elision filter, which has a pre-configured version built with the french elision set; when used as part of normal analysis, this preconfigured set is used, but when used as part of the Analyze API we end up with NPEs because it tries to instantiate the filter with no index settings. This commit changes the Analyze API to check for pre-configured filters in the case that the request has no index defined, and is using a name rather than a custom definition for a filter. It also changes the pre-configured `word_delimiter_graph` filter and `edge_ngram` tokenizer to make their settings consistent with the defaults used when creating them with no settings Closes #43002 Closes #43621 Closes #43582
javanna
added
the
Team:Search Relevance
Meta label for the Search Relevance team in Elasticsearch
label
Jul 16, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
>bug
:Search Relevance/Analysis
How text is split into tokens
Team:Search Relevance
Meta label for the Search Relevance team in Elasticsearch
Elasticsearch version (
bin/elasticsearch --version
):Version: 7.1.1, Build: default/docker/7a013de/2019-05-23T14:04:00.380842Z, JVM: 12.0.1
(but happens in older versions too)
Steps to reproduce:
output:
Provide logs (if relevant):
Few words of context:
Elision filter seems to work fine to me in regular analyzer (created at the index creation time), so it's probably something related to the way filters are used by analyze API.
The text was updated successfully, but these errors were encountered: