Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Iterating BigQuery query results produces incorrect number of rows #506

Closed
polleyg opened this issue Feb 8, 2017 · 2 comments
Closed

Iterating BigQuery query results produces incorrect number of rows #506

polleyg opened this issue Feb 8, 2017 · 2 comments

Comments

@polleyg
Copy link

polleyg commented Feb 8, 2017

This is in relation to post on Stack Overflow post here.

As per the sample code here, a user should be be able to paginate the results of a BigQuery query using the following code structure:

while (result != null) {
      Iterator<List<FieldValue>> iter = result.iterateAll();
      while (iter.hasNext()) {
        List<FieldValue> row = iter.next();
        //do something with row/data
        System.out.println(row);
      }
      result = result.getNextPage();
    }
  }

However, when the result set is large (in my tests >31,000 rows), more rows are returned/iterated even though calling result.getTotalRows() returns the correct/expected number of rows.

For example, I have a table with 85,250 rows (9.45 MB). When I query it via the Java API, and use the code from the example above, it actually iterates 160,296 times.

Even if I limit the result set in the query using limit 5000, and set setPageSize(1000L), then it iterates 15,000 times e.g:

QueryRequest queryRequest = QueryRequest
                .newBuilder("SELECT * FROM [<my-project-id>:<dataset>.<table_with_85250_rows>] limit 5000")
                .setUseLegacySql(true)
                .setPageSize(1000L)
                .build();
        QueryResponse response = bigQuery.query(queryRequest);
        QueryResult result = response.getResult();
        System.out.println("Total rows: " + result.getTotalRows());
        Integer rowNumber = 1;
        while(result != null){
            Iterator<List<FieldValue>> iter = result.iterateAll();
            while(iter.hasNext()){
                List<FieldValue> row = iter.next();
                System.out.println("Row: " + rowNumber + ", with number of columns: " + row.size());
                rowNumber++;
            }
            result = result.getNextPage();
        }

Output:

Total rows: 5000
Row: 1, with number of columns: 11
Row: 2, with number of columns: 11
Row: 3, with number of columns: 11
Row: 4, with number of columns: 11
Row: 5, with number of columns: 11
Row: 6, with number of columns: 11
Row: 7, with number of columns: 11
Row: 8, with number of columns: 11
Row: 9, with number of columns: 11
Row: 10, with number of columns: 11
Row: 11, with number of columns: 11
[.....]
Row: 14997, with number of columns: 11
Row: 14998, with number of columns: 11
Row: 14999, with number of columns: 11
Row: 15000, with number of columns: 11
@kamakay
Copy link

kamakay commented Feb 8, 2017

Hi, I ve found the issue when querying BQ using "nested SQL queries" and legacy SQL. WEb interface UI states the amount N of rows retrieved that equals the number returned from
result.getTotalRows()

@polleyg
Copy link
Author

polleyg commented Feb 10, 2017

Moved this issues to the actual client lib issues - googleapis/google-cloud-java#1612

@polleyg polleyg closed this as completed Feb 10, 2017
ivanmkc pushed a commit that referenced this issue Nov 3, 2022
averikitsch pushed a commit that referenced this issue Nov 9, 2022
minherz pushed a commit that referenced this issue Nov 9, 2022
🤖 I have created a release \*beep\* \*boop\*
---
### [0.120.24](https://www.github.com/googleapis/java-errorreporting/compare/v0.120.23...v0.120.24) (2021-04-26)


### Bug Fixes

* release scripts from issuing overlapping phases ([#506](https://www.github.com/googleapis/java-errorreporting/issues/506)) ([1cdb1b2](https://www.github.com/googleapis/java-errorreporting/commit/1cdb1b2acb0badcf7968b6ed8022d606c784284d))
* typo ([#502](https://www.github.com/googleapis/java-errorreporting/issues/502)) ([f500e1b](https://www.github.com/googleapis/java-errorreporting/commit/f500e1b5fd733cf3348c0970bc31fc06e930852e))


### Dependencies

* update dependency com.google.cloud:google-cloud-core to v1.94.8 ([#513](https://www.github.com/googleapis/java-errorreporting/issues/513)) ([380fbf6](https://www.github.com/googleapis/java-errorreporting/commit/380fbf6c114b78996e083632bc2d92ed0a0ec4e5))
* update dependency com.google.cloud:google-cloud-shared-dependencies to v0.21.1 ([#509](https://www.github.com/googleapis/java-errorreporting/issues/509)) ([16ba2bc](https://www.github.com/googleapis/java-errorreporting/commit/16ba2bca1832afcf8c258dd5f78f7320cdc7d163))
* update dependency com.google.cloud:google-cloud-shared-dependencies to v1 ([#514](https://www.github.com/googleapis/java-errorreporting/issues/514)) ([3b5d01d](https://www.github.com/googleapis/java-errorreporting/commit/3b5d01d09285f6545ae451403d426473fcd97dd7))
---


This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
minherz pushed a commit that referenced this issue Nov 10, 2022
🤖 I have created a release \*beep\* \*boop\*
---
### [0.120.24](https://www.github.com/googleapis/java-errorreporting/compare/v0.120.23...v0.120.24) (2021-04-26)


### Bug Fixes

* release scripts from issuing overlapping phases ([#506](https://www.github.com/googleapis/java-errorreporting/issues/506)) ([1cdb1b2](https://www.github.com/googleapis/java-errorreporting/commit/1cdb1b2acb0badcf7968b6ed8022d606c784284d))
* typo ([#502](https://www.github.com/googleapis/java-errorreporting/issues/502)) ([f500e1b](https://www.github.com/googleapis/java-errorreporting/commit/f500e1b5fd733cf3348c0970bc31fc06e930852e))


### Dependencies

* update dependency com.google.cloud:google-cloud-core to v1.94.8 ([#513](https://www.github.com/googleapis/java-errorreporting/issues/513)) ([380fbf6](https://www.github.com/googleapis/java-errorreporting/commit/380fbf6c114b78996e083632bc2d92ed0a0ec4e5))
* update dependency com.google.cloud:google-cloud-shared-dependencies to v0.21.1 ([#509](https://www.github.com/googleapis/java-errorreporting/issues/509)) ([16ba2bc](https://www.github.com/googleapis/java-errorreporting/commit/16ba2bca1832afcf8c258dd5f78f7320cdc7d163))
* update dependency com.google.cloud:google-cloud-shared-dependencies to v1 ([#514](https://www.github.com/googleapis/java-errorreporting/issues/514)) ([3b5d01d](https://www.github.com/googleapis/java-errorreporting/commit/3b5d01d09285f6545ae451403d426473fcd97dd7))
---


This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
minherz pushed a commit that referenced this issue Nov 10, 2022
🤖 I have created a release \*beep\* \*boop\*
---
### [0.120.24](https://www.github.com/googleapis/java-errorreporting/compare/v0.120.23...v0.120.24) (2021-04-26)


### Bug Fixes

* release scripts from issuing overlapping phases ([#506](https://www.github.com/googleapis/java-errorreporting/issues/506)) ([1cdb1b2](https://www.github.com/googleapis/java-errorreporting/commit/1cdb1b2acb0badcf7968b6ed8022d606c784284d))
* typo ([#502](https://www.github.com/googleapis/java-errorreporting/issues/502)) ([f500e1b](https://www.github.com/googleapis/java-errorreporting/commit/f500e1b5fd733cf3348c0970bc31fc06e930852e))


### Dependencies

* update dependency com.google.cloud:google-cloud-core to v1.94.8 ([#513](https://www.github.com/googleapis/java-errorreporting/issues/513)) ([380fbf6](https://www.github.com/googleapis/java-errorreporting/commit/380fbf6c114b78996e083632bc2d92ed0a0ec4e5))
* update dependency com.google.cloud:google-cloud-shared-dependencies to v0.21.1 ([#509](https://www.github.com/googleapis/java-errorreporting/issues/509)) ([16ba2bc](https://www.github.com/googleapis/java-errorreporting/commit/16ba2bca1832afcf8c258dd5f78f7320cdc7d163))
* update dependency com.google.cloud:google-cloud-shared-dependencies to v1 ([#514](https://www.github.com/googleapis/java-errorreporting/issues/514)) ([3b5d01d](https://www.github.com/googleapis/java-errorreporting/commit/3b5d01d09285f6545ae451403d426473fcd97dd7))
---


This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Shabirmean pushed a commit that referenced this issue Nov 10, 2022
…duct from gcs. (#506)

* Refactoring: added invalid catalog name.

* PR fix: refactoring

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

* PR fix: refactoring.

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
bourgeoisor pushed a commit that referenced this issue Nov 11, 2022
…506)

[![WhiteSource Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [com.google.cloud:libraries-bom](https://github.com/GoogleCloudPlatform/cloud-opensource-java) | `19.2.1` -> `20.0.0` | [![age](https://badges.renovateapi.com/packages/maven/com.google.cloud:libraries-bom/20.0.0/age-slim)](https://docs.renovatebot.com/merge-confidence/) | [![adoption](https://badges.renovateapi.com/packages/maven/com.google.cloud:libraries-bom/20.0.0/adoption-slim)](https://docs.renovatebot.com/merge-confidence/) | [![passing](https://badges.renovateapi.com/packages/maven/com.google.cloud:libraries-bom/20.0.0/compatibility-slim/19.2.1)](https://docs.renovatebot.com/merge-confidence/) | [![confidence](https://badges.renovateapi.com/packages/maven/com.google.cloud:libraries-bom/20.0.0/confidence-slim/19.2.1)](https://docs.renovatebot.com/merge-confidence/) |

---

### Configuration

:date: **Schedule**: At any time (no schedule defined).

:vertical_traffic_light: **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

:recycle: **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

:no_bell: **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box.

---

This PR has been generated by [WhiteSource Renovate](https://renovate.whitesourcesoftware.com). View repository job log [here](https://app.renovatebot.com/dashboard#github/googleapis/java-dataproc).
Shabirmean pushed a commit that referenced this issue Nov 15, 2022
…duct from gcs. (#506)

* Refactoring: added invalid catalog name.

* PR fix: refactoring

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

* PR fix: refactoring.

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
Shabirmean pushed a commit that referenced this issue Nov 15, 2022
Shabirmean pushed a commit that referenced this issue Nov 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants