NullPointerException in ElisionFilter on _analyze #43002

telendt · 2019-06-07T19:17:40Z

Elasticsearch version (bin/elasticsearch --version):
Version: 7.1.1, Build: default/docker/7a013de/2019-05-23T14:04:00.380842Z, JVM: 12.0.1
(but happens in older versions too)

Steps to reproduce:

curl -sH 'Content-Type: application/json' 'localhost:9200/_analyze' -d '
{
  "text": "l’avion",
  "tokenizer": "standard",
  "filter": ["elision"]
}' | jq .

output:

{
  "error": {
    "root_cause": [
      {
        "type": "remote_transport_exception",
        "reason": "[12241dcbb809][172.20.0.2:9300][indices:admin/analyze[s]]"
      }
    ],
    "type": "null_pointer_exception",
    "reason": null
  },
  "status": 500
}

Provide logs (if relevant):

{
   "type":"server",
   "timestamp":"2019-06-07T19:15:38,854+0000",
   "level":"WARN",
   "component":"r.suppressed",
   "cluster.name":"docker-search-cluster",
   "node.name":"12241dcbb809",
   "cluster.uuid":"R-Zf6up6TXiv9aUhvgK03A",
   "node.id":"Fc9JysDfQm6Rp9FqaCu_Cg",
   "message":"path: /_analyze, params: {}",
   "stacktrace":[
      "org.elasticsearch.transport.RemoteTransportException: [12241dcbb809][172.20.0.2:9300][indices:admin/analyze[s]]",
      "Caused by: java.lang.NullPointerException",
      "at org.apache.lucene.analysis.util.ElisionFilter.incrementToken(ElisionFilter.java:66) ~[lucene-analyzers-common-8.0.0.jar:8.0.0 2ae4746365c1ee72a0047ced7610b2096e438979 - jimczi - 2019-03-08 11:59:47]",
      "at org.elasticsearch.action.admin.indices.analyze.TransportAnalyzeAction.simpleAnalyze(TransportAnalyzeAction.java:276) ~[elasticsearch-7.1.1.jar:7.1.1]",
      "at org.elasticsearch.action.admin.indices.analyze.TransportAnalyzeAction.analyze(TransportAnalyzeAction.java:251) ~[elasticsearch-7.1.1.jar:7.1.1]",
      "at org.elasticsearch.action.admin.indices.analyze.TransportAnalyzeAction.shardOperation(TransportAnalyzeAction.java:170) ~[elasticsearch-7.1.1.jar:7.1.1]",
      "at org.elasticsearch.action.admin.indices.analyze.TransportAnalyzeAction.shardOperation(TransportAnalyzeAction.java:81) ~[elasticsearch-7.1.1.jar:7.1.1]",
      "at org.elasticsearch.action.support.single.shard.TransportSingleShardAction$1.doRun(TransportSingleShardAction.java:117) [elasticsearch-7.1.1.jar:7.1.1]",
      "at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) [elasticsearch-7.1.1.jar:7.1.1]",
      "at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.1.1.jar:7.1.1]",
      "at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]",
      "at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]",
      "at java.lang.Thread.run(Thread.java:835) [?:?]"
   ]
}

A few words of context:
The elision filter seems to work fine for me in a regular analyzer (created at index creation time), so the problem is probably related to the way the Analyze API uses filters.

elasticmachine · 2019-06-07T19:49:12Z

Pinging @elastic/es-search

romseygeek · 2019-06-10T11:00:59Z

The elision filter needs an 'articles' setting to work properly, which is missing from the analyze HTTP call you're making. It definitely shouldn't be producing an NPE, though; it should give a more informative error at construction time.
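
As a possible workaround until a fix lands, the filter can be defined inline in the Analyze API request with an explicit `articles` list instead of being referred to by name. The following is a minimal sketch, not a tested fix: the `ES_URL` variable, the `analyze_body` variable, and the `analyze_with_articles` function are hypothetical names, and the three articles are an illustrative subset rather than the full French set.

```shell
# Sketch of a workaround: define the elision filter inline with an explicit
# "articles" list instead of passing the bare name "elision".
# Assumption: same endpoint as in the original report; override via ES_URL.
ES_URL="${ES_URL:-localhost:9200}"

# Request body. The article list is illustrative, not the complete French set.
analyze_body='{
  "text": "l’avion",
  "tokenizer": "standard",
  "filter": [
    { "type": "elision", "articles": ["l", "d", "qu"] }
  ]
}'

# Sends the request. Nothing is contacted until this function is called.
analyze_with_articles() {
  curl -sH 'Content-Type: application/json' "$ES_URL/_analyze" -d "$analyze_body"
}
```

Calling `analyze_with_articles | jq .` sends the request and pretty-prints the response, mirroring the reproduction steps in the report.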

We should throw an exception at construction time if a list of articles is not provided, otherwise we can get random NPEs during indexing. Relates to #43002

When a named token filter or char filter is passed as part of an Analyze API request with no index, we currently try to build the relevant filter using no index settings. However, this can miss cases where there is a pre-configured filter defined in the analysis registry. One example is the elision filter, which has a pre-configured version built with the French elision set; when used as part of normal analysis, this pre-configured set is used, but when used as part of the Analyze API we end up with NPEs because it tries to instantiate the filter with no index settings.

This commit changes the Analyze API to check for pre-configured filters when the request has no index defined and uses a name rather than a custom definition for a filter. It also changes the pre-configured `word_delimiter_graph` filter and `edge_ngram` tokenizer to make their settings consistent with the defaults used when creating them with no settings.

Closes #43002
Closes #43621
Closes #43582


jaymode added the :Search Relevance/Analysis label Jun 7, 2019

jimczi added the >bug label Jun 10, 2019

romseygeek self-assigned this Jun 10, 2019

romseygeek mentioned this issue Jun 11, 2019

Require [articles] setting in elision filter #43083

Merged

romseygeek mentioned this issue Jun 25, 2019

Use preconfigured filters correctly in Analyze API #43568

Merged

romseygeek closed this as completed in #43083 Jun 27, 2019

romseygeek added a commit that referenced this issue Jun 27, 2019

Require [articles] setting in elision filter (#43083)

d2c696d

We should throw an exception at construction time if a list of articles is not provided, otherwise we can get random NPEs during indexing. Relates to #43002

romseygeek added a commit that referenced this issue Jun 27, 2019

Require [articles] setting in elision filter (#43083)

05a7333

We should throw an exception at construction time if a list of articles is not provided, otherwise we can get random NPEs during indexing. Relates to #43002

javanna added the Team:Search Relevance label Jul 16, 2024
