
search_type=scan & small result set gives SearchContextMissingException #5345

Closed
jillesvangurp opened this issue Mar 5, 2014 · 3 comments


@jillesvangurp (Contributor)

May be a duplicate of, or related to, #5170.

Using Elasticsearch 1.0.0.

I have a query where I would like to use search_type=scan to scroll through all contacts owned by a user. It works fine when the user has enough contacts, but for one user who has only two contacts it fails.

So I do a GET on http://localhost:9200/users/contact/_search?search_type=scan&scroll=60m&size=100 with this body:

{
  "query": {
    "term": {
      "userId": {
        "value": "1o"
      }
    }
  }
}

I get back the following response:

{
  "_scroll_id": "c2Nhbjs1OzIwMTpxS1hwWW80MFJvV0hwbjdBcm5JRkF3OzIwNDpxS1hwWW80MFJvV0hwbjdBcm5JRkF3OzIwMzpxS1hwWW80MFJvV0hwbjdBcm5JRkF3OzIwMjpxS1hwWW80MFJvV0hwbjdBcm5JRkF3OzIwNTpxS1hwWW80MFJvV0hwbjdBcm5JRkF3OzE7dG90YWxfaGl0czoyOw==",
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.0,
    "hits": []
  }
}

Then a GET on http://localhost:9200/_search/scroll?scroll=60m with the following scroll id as the request body fails with a SearchContextMissingException:

c2Nhbjs1OzIwMTpxS1hwWW80MFJvV0hwbjdBcm5JRkF3OzIwNDpxS1hwWW80MFJvV0hwbjdBcm5JRkF3OzIwMzpxS1hwWW80MFJvV0hwbjdBcm5JRkF3OzIwMjpxS1hwWW80MFJvV0hwbjdBcm5JRkF3OzIwNTpxS1hwWW80MFJvV0hwbjdBcm5JRkF3OzE7dG90YWxfaGl0czoyOw==

The same query for a user with 30,000 contacts works fine and pages through the results as I would expect. The above query also returns the two results normally if I query without search_type=scan.

So it only fails when the result set is smaller than the page size.
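
For reference, here is a minimal sketch of the two requests above in Python, assuming the `requests` library and the local node, index, and query from this report (an illustration only, not the code actually in use):

```python
import json
import requests

BASE = "http://localhost:9200"

# Step 1: open the scan-type search. The first response carries no hits,
# only a _scroll_id to hand to the scroll endpoint.
query = {"query": {"term": {"userId": {"value": "1o"}}}}
scan = requests.get(
    BASE + "/users/contact/_search",
    params={"search_type": "scan", "scroll": "60m", "size": 100},
    data=json.dumps(query),
).json()
scroll_id = scan["_scroll_id"]

# Step 2: ask the scroll endpoint for the first real page of hits,
# passing the scroll id as the raw request body.
page = requests.get(
    BASE + "/_search/scroll",
    params={"scroll": "60m"},
    data=scroll_id,
).json()
print(page["hits"]["hits"])
```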

@jillesvangurp (Contributor, Author)

This bug report may be invalid, but while doing some more analysis I discovered an underlying problem that may be causing what I'm seeing:

I have a bit of code using search_type=scan that worked fine with 0.90 but broke in several ways when I tried it with 1.0.1 today (upgraded this afternoon):

Problem #1: the exit condition changed.

I used to parse the scrollId from the response and stop fetching new results when it was no longer included. Now I get the following response for the final page of results:

{
  "_scroll_id": "c2NhbjswOzE7dG90YWxfaGl0czo3NDk2Ow==",
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 4,
    "failed": 1,
    "failures": [
      {
        "status": 500,
        "reason": "SearchContextMissingException[No search context found for id [199]]"
      }
    ]
  },
  "hits": {
    "total": 7496,
    "max_score": 0.0,
    "hits": [...]
  }
}

So it is actually reporting an error for the next page of results from one of the shards while still including the final results. This looks weird to me. The only way I have of deducing that this is the final page is to look at the failures object or to keep track of the number of hits I've processed. That can't be right.

Problem #2: the size parameter seems to work in a weird way.

I'm not actually sure whether this is a change or whether it was always this way. In any case, size seems to apply per shard: if I specify size=100, I actually get back 500 results per page, which is the number of shards times the size.

So, getting back to my original bug: I probably am getting the results, but my code fails when trying to fetch another page of results because of the broken exit condition.

I would expect either the old behavior, where the last page fetched no longer includes the scroll id, or an improved API that explicitly includes a next URL and omits it on the last page. That is indeed how I interpreted and used the scroll id under the old behavior. In any case, asking for the last page should not return errors from any shard.
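
For what it's worth, here is a sketch of a scroll loop that works with the 1.x behaviour, assuming the `requests` library and a node on port 9200: keep passing the `_scroll_id` returned by the most recent response and stop when a page comes back with an empty hits list, rather than waiting for the scroll id to disappear:

```python
import requests

BASE = "http://localhost:9200"

def scroll_all(scroll_id, scroll="60m"):
    """Yield every hit of a scan-type search, page by page."""
    while True:
        page = requests.get(
            BASE + "/_search/scroll",
            params={"scroll": scroll},
            data=scroll_id,
        ).json()
        hits = page["hits"]["hits"]
        if not hits:
            # Exit condition: the final scroll request returns no hits.
            break
        for hit in hits:
            yield hit
        # Use the scroll id returned by *this* response, not the one
        # from the original scan request.
        scroll_id = page["_scroll_id"]
```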

@jillesvangurp (Contributor, Author)

Actually, looking closer, it turns out to be pilot error after all: I was passing in the same scroll id every time instead of using the one returned in each response.

The minor page-size issue above may still be valid, but it's explainable and probably not a big deal. So, closing this one.

@clintongormley (Contributor)

@jillesvangurp to explain: with the scan search type there is no reduce phase, so size applies per shard rather than per request. Each shard returns a maximum of size results.
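
For example, with the 5 shards shown in the responses above and size=100, a single scroll page can contain up to 5 × 100 = 500 hits, which matches the behaviour described earlier in this thread.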
