Adding a null pointer check to fix index_prefix query #2879

Merged 2 commits into opensearch-project:main on Apr 14, 2022


VachaShah
Collaborator

Signed-off-by: Vacha Shah <vachshah@amazon.com>

Description

When the length of the query prefix equals minChars - 1, the code follows the path in TextFieldMapper and throws an NPE. With a null check added before the rewrite method is set, the query now returns a result without an NPE:

curl -XPUT localhost:9200/test --data '{                                                                             
"mappings": {
    "properties": {
      "t": {
        "type": "text",
        "index_prefixes": { "min_chars": "3" }
      }
    }
  }
}' -H "Content-Type:Application/json"
{"acknowledged":true,"shards_acknowledged":true,"index":"test"}
curl 'localhost:9200/test/_search?pretty' --data '{"query":{"prefix":{"t": "st"}}}' -H "Content-Type:Application/json"
{
  "took" : 9,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

Tested with various scenarios mentioned in the issue. Adding the details in the comments of this PR.
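
As a rough illustration of the null check described above, the sketch below uses hypothetical stand-in types (`RewriteMethod`, `SketchQuery`), not the Lucene classes. It assumes, only to reproduce the symptom, that the setter fails on a null argument; the guarded variant skips the call and leaves the query's default in place.

```java
import java.util.Objects;

final class NullGuardSketch {
    /** Hypothetical stand-in for a rewrite strategy; not a Lucene type. */
    interface RewriteMethod {}

    /** Hypothetical stand-in for a query whose setter rejects null. */
    static final class SketchQuery {
        private RewriteMethod rewriteMethod; // null means "keep the default"

        void setRewriteMethod(RewriteMethod method) {
            // Assumption for illustration only: the setter fails fast on null.
            this.rewriteMethod = Objects.requireNonNull(method, "rewrite method");
        }

        boolean usesDefaultRewrite() {
            return rewriteMethod == null;
        }
    }

    /** Mirrors the unguarded call: throws NullPointerException when method is null. */
    static SketchQuery buildUnguarded(RewriteMethod method) {
        SketchQuery query = new SketchQuery();
        query.setRewriteMethod(method);
        return query;
    }

    /** Mirrors the guarded call: only sets the rewrite method when one was supplied. */
    static SketchQuery buildGuarded(RewriteMethod method) {
        SketchQuery query = new SketchQuery();
        if (method != null) {
            query.setRewriteMethod(method);
        }
        return query;
    }

    public static void main(String[] args) {
        System.out.println("guarded keeps default: " + buildGuarded(null).usesDefaultRewrite());
        try {
            buildUnguarded(null);
        } catch (NullPointerException e) {
            System.out.println("unguarded threw: " + e.getMessage());
        }
    }
}
```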

Issues Resolved

#2826

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@VachaShah VachaShah requested a review from a team as a code owner April 12, 2022 21:01
@VachaShah
Collaborator Author

For index test with minChars=3 for index_prefixes:

curl -XPUT localhost:9200/test --data '{                                                                             
"mappings": {
    "properties": {
      "t": {
        "type": "text",
        "index_prefixes": { "min_chars": "3" }
      }
    }
  }
}' -H "Content-Type:Application/json"
{"acknowledged":true,"shards_acknowledged":true,"index":"test"}
ubuntu@ip-172-31-51-78:~$ curl 'localhost:9200/test/_search?pretty' --data '{"query":{"prefix":{"t": ""}}}' -H "Content-Type:Application/json"                                                             
{
  "took" : 38,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
ubuntu@ip-172-31-51-78:~$ curl 'localhost:9200/test/_search?pretty' --data '{"query":{"prefix":{"t": "s"}}}' -H "Content-Type:Application/json"
{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
ubuntu@ip-172-31-51-78:~$ curl 'localhost:9200/test/_search?pretty' --data '{"query":{"prefix":{"t": "st"}}}' -H "Content-Type:Application/json"
{
  "took" : 9,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
ubuntu@ip-172-31-51-78:~$ curl 'localhost:9200/test/_search?pretty' --data '{"query":{"prefix":{"t": "str"}}}' -H "Content-Type:Application/json"
{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

For index test-2 with minChars=2 for index_prefixes:

curl -XPUT localhost:9200/test-2 --data '{                                                                             
"mappings": {
    "properties": {
      "t": {
        "type": "text",
        "index_prefixes": { "min_chars": "2" }
      }
    }
  }
}' -H "Content-Type:Application/json"
{"acknowledged":true,"shards_acknowledged":true,"index":"test-2"}
ubuntu@ip-172-31-51-78:~$ curl 'localhost:9200/test-2/_search?pretty' --data '{"query":{"prefix":{"t": ""}}}' -H "Content-Type:Application/json"
{
  "took" : 36,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
ubuntu@ip-172-31-51-78:~$ curl 'localhost:9200/test-2/_search?pretty' --data '{"query":{"prefix":{"t": "s"}}}' -H "Content-Type:Application/json"
{
  "took" : 10,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
ubuntu@ip-172-31-51-78:~$ curl 'localhost:9200/test-2/_search?pretty' --data '{"query":{"prefix":{"t": "st"}}}' -H "Content-Type:Application/json"
{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
ubuntu@ip-172-31-51-78:~$ curl 'localhost:9200/test-2/_search?pretty' --data '{"query":{"prefix":{"t": "str"}}}' -H "Content-Type:Application/json"
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

For index test-3 with minChars=5 for index_prefixes:

curl -XPUT localhost:9200/test-3 --data '{                                                                             
"mappings": {
    "properties": {
      "t": {
        "type": "text",
        "index_prefixes": { "min_chars": "5" }
      }
    }
  }
}' -H "Content-Type:Application/json"
{"acknowledged":true,"shards_acknowledged":true,"index":"test-3"}
ubuntu@ip-172-31-51-78:~$ curl 'localhost:9200/test-3/_search?pretty' --data '{"query":{"prefix":{"t": ""}}}' -H "Content-Type:Application/json"
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
ubuntu@ip-172-31-51-78:~$ curl 'localhost:9200/test-3/_search?pretty' --data '{"query":{"prefix":{"t": "s"}}}' -H "Content-Type:Application/json"
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
ubuntu@ip-172-31-51-78:~$ curl 'localhost:9200/test-3/_search?pretty' --data '{"query":{"prefix":{"t": "st"}}}' -H "Content-Type:Application/json"
{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
ubuntu@ip-172-31-51-78:~$ curl 'localhost:9200/test-3/_search?pretty' --data '{"query":{"prefix":{"t": "str"}}}' -H "Content-Type:Application/json"
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
ubuntu@ip-172-31-51-78:~$ curl 'localhost:9200/test-3/_search?pretty' --data '{"query":{"prefix":{"t": "strt"}}}' -H "Content-Type:Application/json"
{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
ubuntu@ip-172-31-51-78:~$ curl 'localhost:9200/test-3/_search?pretty' --data '{"query":{"prefix":{"t": "strtt"}}}' -H "Content-Type:Application/json"
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
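
The request bodies in the transcripts above follow one pattern: one prefix query per leading substring of a sample string. A minimal sketch that generates those bodies is shown below; the class and method names are hypothetical, and the escaping is deliberately minimal (enough for the plain ASCII prefixes used here).

```java
import java.util.ArrayList;
import java.util.List;

final class PrefixQueryBodies {
    /** Minimal JSON string escaping; covers backslash and double quote only. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Builds a search body with a single prefix query on the given field. */
    static String body(String field, String prefix) {
        return "{\"query\":{\"prefix\":{\"" + escape(field) + "\":\"" + escape(prefix) + "\"}}}";
    }

    /** One body per leading substring of sample, from the empty string up to the full sample. */
    static List<String> bodiesFor(String field, String sample) {
        List<String> bodies = new ArrayList<>();
        for (int len = 0; len <= sample.length(); len++) {
            bodies.add(body(field, sample.substring(0, len)));
        }
        return bodies;
    }

    public static void main(String[] args) {
        // Prints four bodies, for the prefixes "", "s", "st" and "str".
        bodiesFor("t", "str").forEach(System.out::println);
    }
}
```

Each printed line can be passed to curl with --data, as in the commands above.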

Member

@saratvemulapalli saratvemulapalli left a comment

You are faster than me :)
Thanks for taking care of this.

I think the right fix is to take care of the acceptance check.
Also, could you add tests to make sure we don't break it in the future?

@@ -583,7 +583,9 @@ public Query prefixQuery(String value, MultiTermQuery.RewriteMethod method, bool
             }
             Automaton automaton = Operations.concatenate(automata);
             AutomatonQuery query = new AutomatonQuery(new Term(name(), value + "*"), automaton);
-            query.setRewriteMethod(method);
+            if (method != null) {
Member

@saratvemulapalli saratvemulapalli Apr 12, 2022

This approach solves the problem, but the real root cause is the check for min and max chars.

The right fix here is to not accept() the query if the length is less than min_chars.
This makes sure the accept fails at Line-762, so there is no need to call Line-765.

Suggested change
-            if (method != null) {
+        boolean accept(int length) {
+            return length >= minChars && length <= maxChars;
+        }

Collaborator Author

Oh, I didn't know we should not accept values shorter than minChars. I will make that change and add tests as well.

Member

Also take a look at: dd540ef.
There might be other cases which might fail.

Collaborator Author

It looks like support for length minChars - 1 was added deliberately. Since rejecting that length would change the behavior, would the fix in this PR then be the right solution?
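
To make the two acceptance checks in this thread concrete, here is a small sketch that contrasts them. The strict form is the one from the review suggestion; the lenient form (`minChars - 1`) is an assumption about how the existing check is written, inferred from the comment above. The value max_chars = 5 is just an example.

```java
final class AcceptWindowSketch {
    /** Lenient check: also accepts length minChars - 1 (assumed form of the existing check). */
    static boolean acceptLenient(int length, int minChars, int maxChars) {
        return length >= minChars - 1 && length <= maxChars;
    }

    /** Strict check, as written in the review suggestion. */
    static boolean acceptStrict(int length, int minChars, int maxChars) {
        return length >= minChars && length <= maxChars;
    }

    public static void main(String[] args) {
        int minChars = 3;
        int maxChars = 5; // example value
        for (int length = 0; length <= 6; length++) {
            System.out.println("length=" + length
                + " lenient=" + acceptLenient(length, minChars, maxChars)
                + " strict=" + acceptStrict(length, minChars, maxChars));
        }
    }
}
```

With min_chars = 3, the two checks differ only at length 2, which is exactly the minChars - 1 case that triggers the NPE.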

Member

I think the fix that the original commit put in led to this problem. My hunch says this should be fixed.
Could you read through the commit and understand what other problems we might see?

Collaborator Author

From the commit, the index_prefix query for length = minChars - 1 is remapped from a* to a? to make it less expensive. Also, looking at the current code base, I see that setRewriteMethod is wrapped in a null check in other places, for example https://github.com/opensearch-project/OpenSearch/blame/3c5d997a765e24ffa32d35219fd5026cfb143a9d/server/src/main/java/org/opensearch/index/mapper/StringFieldType.java#L116.
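
The a? versus a* remapping can be pictured with a small stand-alone sketch. It uses java.util.regex as a stand-in for the Lucene automaton and an in-memory set as a stand-in for the terms of a prefix sub-field, so every name here is hypothetical. The idea it illustrates: a value one character short of min_chars is padded with "any character" and matched against the already indexed prefixes, instead of expanding an open-ended a*.

```java
import java.util.List;
import java.util.TreeSet;
import java.util.regex.Pattern;

final class ShortPrefixSketch {
    /** Edge prefixes of length minChars..maxChars, as a prefix sub-field might index them. */
    static TreeSet<String> prefixTerms(List<String> words, int minChars, int maxChars) {
        TreeSet<String> terms = new TreeSet<>();
        for (String word : words) {
            for (int len = minChars; len <= Math.min(maxChars, word.length()); len++) {
                terms.add(word.substring(0, len));
            }
        }
        return terms;
    }

    /** Pads a short value with "any character" up to minChars: "a?" rather than "a*". */
    static Pattern paddedPattern(String value, int minChars) {
        int pad = Math.max(0, minChars - value.length());
        return Pattern.compile(Pattern.quote(value) + ".".repeat(pad));
    }

    /** Returns the indexed prefix terms that the padded pattern matches in full. */
    static TreeSet<String> matchingTerms(TreeSet<String> terms, Pattern pattern) {
        TreeSet<String> matches = new TreeSet<>();
        for (String term : terms) {
            if (pattern.matcher(term).matches()) {
                matches.add(term);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        TreeSet<String> terms = prefixTerms(List.of("start", "stop", "sand"), 3, 5);
        // "st" has length minChars - 1, so it is padded with one wildcard character.
        System.out.println(matchingTerms(terms, paddedPattern("st", 3))); // prints [sta, sto]
    }
}
```

Only terms of exactly min_chars characters are visited, which is why this path is cheaper than a general prefix expansion.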

Member

Thanks Vacha for this.
I take back my suggestion, as it would probably not support the performance optimization.
I agree with you that checking whether the method is null solves the problem while staying efficient.

Could you add some tests? Then we should be good to go.

Member

Also, is query.setRewriteMethod(null) valid by itself? Can the rewrite method be null? If not, throw in an assert or something similar so that other errors are not hidden.

Collaborator Author

null looks to be valid. Also, @nknize opened issue #2896 to remove these usages next, since the method is deprecated.

@opensearch-ci-bot
Collaborator

✅   Gradle Check success b6b9d3561c9869ff26326526b217d1fdab4e3a00
Log 4418

Reports 4418

Signed-off-by: Vacha Shah <vachshah@amazon.com>
Signed-off-by: Vacha Shah <vachshah@amazon.com>
@opensearch-ci-bot
Collaborator

✅   Gradle Check success 1c5494b
Log 4452

Reports 4452

@saratvemulapalli
Member

Looks good to me. Thanks @VachaShah !

@saratvemulapalli added the backport 2.x, v3.0.0, backport 2.0, bug, and Indexing & Search labels on Apr 14, 2022
Collaborator

@nknize nknize left a comment

I'm fine approving and merging this PR to fix the immediate bug but we should open a follow-up PR for #2896. Let us know if you'd like to take care of that as well @VachaShah

@@ -583,7 +583,9 @@ public Query prefixQuery(String value, MultiTermQuery.RewriteMethod method, bool
             }
             Automaton automaton = Operations.concatenate(automata);
             AutomatonQuery query = new AutomatonQuery(new Term(name(), value + "*"), automaton);
-            query.setRewriteMethod(method);
+            if (method != null) {
+                query.setRewriteMethod(method);
Collaborator

query.setRewriteMethod is removed in Lucene 9.2. I've opened an issue to remove all references to the method and use CONSTANT_SCORE_REWRITE in places where null is passed in as the method parameter.
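
One way to picture the follow-up described here is to resolve a possibly null argument to an explicit default once, at the boundary, so that no later code has to handle null. The sketch below uses a hypothetical enum as a stand-in for the rewrite strategies; it does not use Lucene's constants, and how #2896 was actually implemented is not shown in this thread.

```java
final class DefaultRewriteSketch {
    /** Hypothetical stand-in for rewrite strategies; these are not Lucene's constants. */
    enum Rewrite { CONSTANT_SCORE_REWRITE, SCORING_BOOLEAN_REWRITE }

    /** Resolves a possibly null request to an explicit strategy. */
    static Rewrite resolve(Rewrite requested) {
        return requested != null ? requested : Rewrite.CONSTANT_SCORE_REWRITE;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null));                            // prints CONSTANT_SCORE_REWRITE
        System.out.println(resolve(Rewrite.SCORING_BOOLEAN_REWRITE)); // prints SCORING_BOOLEAN_REWRITE
    }
}
```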

@VachaShah
Collaborator Author

> I'm fine approving and merging this PR to fix the immediate bug but we should open a follow-up PR for #2896. Let us know if you'd like to take care of that as well @VachaShah

Thanks @nknize! Sure I can take care of that.

@VachaShah VachaShah merged commit 452e368 into opensearch-project:main Apr 14, 2022
opensearch-trigger-bot bot pushed a commit that referenced this pull request Apr 14, 2022
* Adding a null pointer check to fix index_prefix query

Signed-off-by: Vacha Shah <vachshah@amazon.com>

* Adding test

Signed-off-by: Vacha Shah <vachshah@amazon.com>
(cherry picked from commit 452e368)
saratvemulapalli pushed a commit that referenced this pull request Apr 14, 2022
* Adding a null pointer check to fix index_prefix query

Signed-off-by: Vacha Shah <vachshah@amazon.com>

* Adding test

Signed-off-by: Vacha Shah <vachshah@amazon.com>
(cherry picked from commit 452e368)

Co-authored-by: Vacha Shah <vachshah@amazon.com>