-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bigquery: Add page_size
parameter to QueryJob.result
.
#8206
Conversation
We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google. ℹ️ Googlers: Go here for more info. |
I signed it! |
CLAs look good, thanks! ℹ️ Googlers: Go here for more info. |
Coverage failure
is unrelated to this change and was fixed in #8179 |
bigquery/tests/unit/test_job.py
Outdated
|
||
result = job.result(page_size=3) | ||
|
||
self.assertEqual(result.total_rows, 4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's assert that tabledata.list is called with the right page size (maxResults
) as a parameter.
See:
def test_list_rows(self): |
Specifically, I'd like to see something like:
tabledata_path = "/projects/%s/datasets/%s/tables/%s/data" % (
self.PROJECT,
self.DS_ID,
self.TABLE_ID,
)
conn.api_request.assert_called_once_with(
method="GET", path=tabledata_path, query_params={"maxResults": 3}
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello Tim,
Thanks for the feedback but I think I need a hand.
Do you mean that the maxResults should be directly linked to the page_size ? The api documentation seems to say it link but the code don't
google-cloud-python/bigquery/google/cloud/bigquery/client.py
Lines 1949 to 1963 in 0975724
row_iterator = RowIterator( | |
client=self, | |
api_request=functools.partial(self._call_api, retry), | |
path="%s/data" % (table.path,), | |
schema=schema, | |
page_token=page_token, | |
max_results=max_results, | |
page_size=page_size, | |
extra_params=params, | |
table=table, | |
# Pass in selected_fields separately from schema so that full | |
# tables can be fetched without a column filter. | |
selected_fields=selected_fields, | |
) | |
return row_iterator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confusingly, page_size
in Python is used to set maxResults
in the API request, since that's what BigQuery calls its page size parameter.
I've updated your PR to test for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I see...
Many thanks for the help on the unit test !
Thanks for the contribution @AzemaBaptiste. This is a great addition. I like the system test. I'd just like to also verify in a unit test that the parameter is actually sent. |
page_size
parameter for result
page_size
parameter to QueryJob.result
.
Great job, thanks 🎉 |
Makes the user control page size for a bigquery job.