
FeedIterator unexpectedly pre-fetches next page #990

Closed
justinbhopper opened this issue Nov 13, 2019 · 22 comments

@justinbhopper

Describe the bug
The FeedIterator<T> fetches a page of results before it is requested. A single call to ReadNextAsync() causes two POSTs to the Cosmos API, rather than one.

To Reproduce
Here is a sample application that illustrates the problem. Note ReadNextAsync() is only called once, but the console logs two API calls.
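A minimal sketch of the repro, assuming a "FeedTest" database with a "Test" container (names taken from the console trace below; the connection string and item shape are placeholders, not the linked sample):

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

class Program
{
    static async Task Main()
    {
        using var client = new CosmosClient("<connection-string>");
        Container container = client.GetDatabase("FeedTest").GetContainer("Test");

        // Ask for pages of 5 items.
        FeedIterator<dynamic> iterator = container.GetItemQueryIterator<dynamic>(
            "SELECT * FROM c",
            requestOptions: new QueryRequestOptions { MaxItemCount = 5 });

        // A single ReadNextAsync() call should issue one query POST,
        // but the console trace shows two.
        FeedResponse<dynamic> page = await iterator.ReadNextAsync();
        Console.WriteLine($"Received {page.Count} items");
    }
}
```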

Expected behavior
FeedIterator should only fetch pages upon request.

Actual behavior
FeedIterator seems to be fetching a second page of results on its own. If this is by design, then I think the design is flawed: it assumes callers want more than the number of items they requested.

Environment summary
SDK Version: 3.4.1
OS Version: Windows 10

Additional context
Here is a sample of the console's output. Note that two POSTs occur, despite only having called ReadNextAsync() once. Also note that the contents of the second POST show the next set of results (items 5-9).

POST dbs/FeedTest/colls/Test
{"_rid":"dIVbALcqJUk=","Documents":[
  {"r":{"id":"3cc75122-20d1-4d14-ba41-c1f328d2e9f3","Index":0,"Partition":"test","_rid":"dIVbALcqJUkBAAAAAAAAAA==","_self":"dbs\/dIVbAA==\/colls\/dIVbALcqJUk=\/docs\/dIVbALcqJUkBAAAAAAAAAA==\/","_etag":"\"00000000-0000-0000-99da-8ff2d1b101d5\"","_attachments":"attachments\/","_ts":1573619207}},
  {"r":{"id":"015aa151-203a-4b85-9c7b-bd99ade327e6","Index":1,"Partition":"test","_rid":"dIVbALcqJUkCAAAAAAAAAA==","_self":"dbs\/dIVbAA==\/colls\/dIVbALcqJUk=\/docs\/dIVbALcqJUkCAAAAAAAAAA==\/","_etag":"\"00000000-0000-0000-99da-8ff4ecf101d5\"","_attachments":"attachments\/","_ts":1573619207}},
  {"r":{"id":"b74259ec-ce80-4881-931c-39355e95e23a","Index":2,"Partition":"test","_rid":"dIVbALcqJUkDAAAAAAAAAA==","_self":"dbs\/dIVbAA==\/colls\/dIVbALcqJUk=\/docs\/dIVbALcqJUkDAAAAAAAAAA==\/","_etag":"\"00000000-0000-0000-99da-8ff6345a01d5\"","_attachments":"attachments\/","_ts":1573619207}},
  {"r":{"id":"b2d53634-923c-4a54-a9f2-09ff3a4e1c97","Index":3,"Partition":"test","_rid":"dIVbALcqJUkEAAAAAAAAAA==","_self":"dbs\/dIVbAA==\/colls\/dIVbALcqJUk=\/docs\/dIVbALcqJUkEAAAAAAAAAA==\/","_etag":"\"00000000-0000-0000-99da-8ff699df01d5\"","_attachments":"attachments\/","_ts":1573619207}},
  {"r":{"id":"9555e4d7-e75f-47e2-83d3-84c1b85757bf","Index":4,"Partition":"test","_rid":"dIVbALcqJUkFAAAAAAAAAA==","_self":"dbs\/dIVbAA==\/colls\/dIVbALcqJUk=\/docs\/dIVbALcqJUkFAAAAAAAAAA==\/","_etag":"\"00000000-0000-0000-99da-8ff6dd4901d5\"","_attachments":"attachments\/","_ts":1573619207}}
],"_count":5}

POST dbs/FeedTest/colls/Test
{"_rid":"dIVbALcqJUk=","Documents":[
  {"r":{"id":"1b51b134-ef3c-4f61-b42b-486391bc2e7e","Index":5,"Partition":"test","_rid":"dIVbALcqJUkGAAAAAAAAAA==","_self":"dbs\/dIVbAA==\/colls\/dIVbALcqJUk=\/docs\/dIVbALcqJUkGAAAAAAAAAA==\/","_etag":"\"00000000-0000-0000-99da-8ff7320701d5\"","_attachments":"attachments\/","_ts":1573619207}},
  {"r":{"id":"a4d7a76d-683d-46ec-94c1-c820a3572fdf","Index":6,"Partition":"test","_rid":"dIVbALcqJUkHAAAAAAAAAA==","_self":"dbs\/dIVbAA==\/colls\/dIVbALcqJUk=\/docs\/dIVbALcqJUkHAAAAAAAAAA==\/","_etag":"\"00000000-0000-0000-99da-8ff77f7201d5\"","_attachments":"attachments\/","_ts":1573619207}},
  {"r":{"id":"189306b0-aae2-4b4f-9ff8-e0b566943fe4","Index":7,"Partition":"test","_rid":"dIVbALcqJUkIAAAAAAAAAA==","_self":"dbs\/dIVbAA==\/colls\/dIVbALcqJUk=\/docs\/dIVbALcqJUkIAAAAAAAAAA==\/","_etag":"\"00000000-0000-0000-99da-8ff7cf4e01d5\"","_attachments":"attachments\/","_ts":1573619207}},
  {"r":{"id":"348cf465-de43-420e-9c30-11116dc4a755","Index":8,"Partition":"test","_rid":"dIVbALcqJUkJAAAAAAAAAA==","_self":"dbs\/dIVbAA==\/colls\/dIVbALcqJUk=\/docs\/dIVbALcqJUkJAAAAAAAAAA==\/","_etag":"\"00000000-0000-0000-99da-8ff81b2101d5\"","_attachments":"attachments\/","_ts":1573619207}},
  {"r":{"id":"297d734e-ba44-409b-8c0c-95c44aa13983","Index":9,"Partition":"test","_rid":"dIVbALcqJUkKAAAAAAAAAA==","_self":"dbs\/dIVbAA==\/colls\/dIVbALcqJUk=\/docs\/dIVbALcqJUkKAAAAAAAAAA==\/","_etag":"\"00000000-0000-0000-99da-8ff8684101d5\"","_attachments":"attachments\/","_ts":1573619207}}
],"_count":5}
@j82w
Contributor

j82w commented Nov 13, 2019

This is an optimization in the query code. The query code will buffer pages in parallel in the background to reduce the latency to the caller. You can set the MaxBufferedItemCount to 0 and the MaxConcurrency to 0 in the QueryRequestOptions to prevent this.
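A sketch of that workaround (the container variable and item type are illustrative; setup is omitted):

```csharp
// Disable background page buffering and parallel partition reads.
var options = new QueryRequestOptions
{
    MaxItemCount = 5,         // page size per ReadNextAsync()
    MaxBufferedItemCount = 0, // do not buffer extra pages in the background
    MaxConcurrency = 0        // do not query partitions in parallel
};

FeedIterator<MyItem> iterator = container.GetItemQueryIterator<MyItem>(
    "SELECT * FROM c",
    requestOptions: options);
```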

@j82w j82w closed this as completed Nov 13, 2019
@bartelink
Contributor

bartelink commented Nov 13, 2019

I am observing a not dissimilar change in behavior in the 2.2 SDK in recent times...

I have 4 items and ask for a max of 2 items, but get a response with 4 items (versus expecting to only get them when I ask for the next page of results).

Is there a recent change in server behavior, or (as I assume is the case) is this entirely a client SDK thing?

UPDATE: Confirming that with 2.0 and 2.2 I am seeing a failing test that had not failed before in the test suite in https://github.com/jet/equinox
Assertion condition failing:

 [ResponseBackward; ResponseBackward; QueryBackward; Append] = [ResponseBackward; QueryBackward; Append]

^ this shows I get one response instead of two

I cannot 100% confirm that this (presumably server-side) change started affecting V2 clients at the same time, but I think I sanity-checked it at the time.

Bottom line is that I don't have a way, with either the v2 or v3 SDK, to get the behavior I'm seeking (I have not analyzed the request flow, but it seems pretty clear to me that it's not likely to be that sort of problem in my case, whereas for the OP it evidently is).

@justinbhopper
Author

@j82w I have set MaxBufferedItemCount to 0 and MaxConcurrency to 0, but the output remains the same. Please see my updated sample application to confirm I did this correctly.

Can we please reopen this issue?

@bartelink
Contributor

bartelink commented Nov 13, 2019

Note I have also been observing this (slightly different) behavior on the V3 SDK for some time (and, as noted above, am also seeing evidence of the same behavior change against the V2 SDK). (NB I am not saying the conditions are the same, but I am definitely interested in resolving that issue too; who knows, it might clarify matters with respect to this one as well.)

@j82w j82w reopened this Nov 13, 2019
@j82w
Contributor

j82w commented Nov 13, 2019

@bchong95 is this by design?

@j82w
Contributor

j82w commented Nov 13, 2019

This seems to be a bug in the way query is buffering the pages. Assigned to @bchong95 for the fix.

@j82w
Contributor

j82w commented Nov 13, 2019

The bug is in CosmosParallelItemQueryExecutionContext. The last iteration of the for loop causes it to move next, which loads another page. There are some other higher-priority bugs being worked on currently, so there is no ETA on when this will get fixed.

@justinbhopper
Author

Not to question your team's priorities, but I want to note that this is causing all of our queries to double in RU cost, because every query we make requests twice as many documents as asked for.

This may very well explain many of the reports in other issues here where SDK v3 has worse performance than v2.

@j82w
Contributor

j82w commented Nov 13, 2019

This bug is a priority to get fixed. There is currently another bug where query is completely broken and is blocking a customer, so it is taking priority. This bug, where an extra page is loaded, is bad, but it doesn't impact users who drain the entire query. This should get prioritized next. If you want the fix faster, please feel free to submit a PR.

@justinbhopper
Author

I apologize for my wording - I was trying to add impact information without suggesting that this bug's priority be increased, and the wording I chose didn't achieve that. Of course performance issues would not be prioritized above functionality issues.

@j82w
Contributor

j82w commented Nov 13, 2019

No need to apologize. I should have been clearer that we have a functional bug that is taking priority, and that this will get prioritized next. I'm just trying to be as transparent as possible about the reasoning behind the prioritization of fixes. We really appreciate the detailed bug report with the sample. We'll try to get the issue fixed as soon as we can.

@bchong95
Contributor

Resolved with #999

@j82w j82w closed this as completed Nov 20, 2019
@justinbhopper
Author

@j82w Can we please reopen this ticket? #999 fixes the issue unless the query contains an ORDER BY.

If the query contains an ORDER BY, the prefetching occurs even if MaxBufferedItemCount and MaxConcurrency are set to 0.

Updated my example to illustrate this:
https://github.com/justinbhopper/cosmos-feed-issue/blob/master/Program.cs#L44

@j82w
Contributor

j82w commented Dec 9, 2019

This is by design and required for ORDER BY queries to give correct results. ORDER BY queries do the following in the SDK: the SDK has to go to every partition to get an ordered list, then do a k-way merge on the results, then take the top N from that merged list and return it as the result. Each partition only knows about content in its own partition, so there is no way to know which partition has the Nth item in the container for the ORDER BY.
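The k-way merge described above can be illustrated with a small sketch (this is not SDK code, just the idea): each partition yields its own sorted stream, and the client repeatedly takes the global minimum until it has N items.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class KWayMergeSketch
{
    static IEnumerable<int> TopN(IEnumerable<IEnumerable<int>> sortedPartitions, int n)
    {
        // Seed an ordered set with the head of each partition's stream.
        // Tuples are (value, partition, index), so entries are unique.
        var queue = new SortedSet<(int value, int partition, int index)>();
        var streams = sortedPartitions.Select(p => p.ToArray()).ToArray();
        for (int p = 0; p < streams.Length; p++)
            if (streams[p].Length > 0)
                queue.Add((streams[p][0], p, 0));

        // Repeatedly emit the global minimum and advance that partition,
        // which is why every partition must be consulted.
        while (n-- > 0 && queue.Count > 0)
        {
            var (value, p, i) = queue.Min;
            queue.Remove(queue.Min);
            yield return value;
            if (i + 1 < streams[p].Length)
                queue.Add((streams[p][i + 1], p, i + 1));
        }
    }

    static void Main()
    {
        var partitions = new[] { new[] { 1, 4, 7 }, new[] { 2, 5, 8 }, new[] { 3, 6, 9 } };
        Console.WriteLine(string.Join(", ", TopN(partitions, 4))); // 1, 2, 3, 4
    }
}
```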

@stephenwilson11

stephenwilson11 commented Feb 26, 2020

@j82w, sorry, I do not follow your reasoning for why this is required.

What is the second request resolving? From the above, and from inspecting the responses and continuation token returned from the SDK, it appears to be done to resolve the 'rid' and ORDER BY values of the next item for use in the continuation token?

If so, could this not be done more efficiently by the SDK/Cosmos requesting 'MaxItemCount' + 1 items, so the required info can be provided in a single request? Nothing else from the second request appears to be used.

Sorry if I simply do not get the issue here; I'm pretty new to using Cosmos, but it does appear to be pretty wasteful and costly as is.

@vit-svoboda

Why does the v3 SDK insist on treating all queries as cross-partition even though the partition key is provided? From what I understand, cross-partition queries are to be avoided for performance reasons.

@j82w
Contributor

j82w commented Mar 20, 2020

@bchong95 can you explain?

@j82w j82w reopened this Mar 20, 2020
@bartelink
Contributor

bartelink commented Mar 20, 2020

(For the record, I'm also sitting on a fully ported branch which is doing pretty innocuous stuff, but for unexplained reasons we're seeing ~10% additional RU cost and a lack of the anticipated perf benefits - i.e. what I previously outlined on this thread appears to be panning out as indicative of over-reading, beyond just what one might expect solely from a low MaxItems count not being honored. I'll supply a more detailed analysis in due course.)

eta: 3.7.0-preview2 does not fix the issue, and neither does the custom build in the other issue

@bchong95
Contributor

For ORDER BY queries you can now set a partition key to avoid this behavior:

#1319
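A sketch of that mitigation: scoping an ORDER BY query to a single partition key value, so no cross-partition merge is needed. The container variable, item type, and partition key value are illustrative.

```csharp
// Scope the query to one logical partition; the SDK can then page
// normally instead of merging results from every partition.
var options = new QueryRequestOptions
{
    PartitionKey = new PartitionKey("test"),
    MaxItemCount = 5
};

FeedIterator<MyItem> iterator = container.GetItemQueryIterator<MyItem>(
    "SELECT * FROM c ORDER BY c.Index",
    requestOptions: options);
```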

@bartelink
Contributor

> eta: 3.7.0-preview2 does not fix the issue, and neither does the custom build in the other issue

🎉 3.9.0-preview fixes my problem - reducing RU consumption back to the levels of V2 (I'm using preview-3 now, which is also healthy).

Any rough ETA on when we'll see a 4.x preview with this fix included?

@j82w
Contributor

j82w commented May 18, 2020

@ealsur is there any estimate on when this fix will get ported to v4?

@ealsur
Member

ealsur commented May 18, 2020

@j82w No, the V4 focus at this point is the public surface for the review board.
