Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: Use pandas custom data types for BigQuery DATE and TIME columns, remove date_as_object argument #972

Merged
merged 87 commits into from
Nov 10, 2021

Conversation

jimfulton
Copy link
Contributor

@jimfulton jimfulton commented Sep 15, 2021

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixed #861
Fixes #862 🦕

BREAKING CHANGE: remove date_as_object argument from to_dataframe. The dbdate dtype is used by default with an automatic fallback to object when dates are not within the range of a nanosecond-precision pandas timestamp

So we can force use of the default/backup NDArrayBackedExtensionArray
for pandas earlier than 1.3.

Also simplified version-checking code.
@tswast tswast added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Nov 9, 2021
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Nov 9, 2021
@tswast tswast added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Nov 9, 2021
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Nov 9, 2021
Copy link
Contributor

@leahecole leahecole left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

partway through looking but did want to get this one small change in fairly early

tests/unit/job/test_query_pandas.py Outdated Show resolved Hide resolved
tests/unit/job/test_query_pandas.py Outdated Show resolved Hide resolved
tswast and others added 2 commits November 9, 2021 16:40
Co-authored-by: Leah E. Cole <6719667+leahecole@users.noreply.github.com>
Co-authored-by: Leah E. Cole <6719667+leahecole@users.noreply.github.com>
@tswast tswast added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Nov 9, 2021
@leahecole leahecole dismissed their stale review November 9, 2021 22:41

changes addressed

@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Nov 9, 2021
Copy link
Contributor

@leahecole leahecole left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm but I don't know if I'm the most qualified reviewer for this - I didn't see anything that looked glaringly like it would break and the small changes I requested were addressed so I'm happy 🤷🏻‍♀️

@tswast
Copy link
Contributor

tswast commented Nov 9, 2021

Thanks @leahecole ! I'll let Kokoro run and check back on this PR with fresh eyes tomorrow before merging.

@tswast tswast added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Nov 10, 2021
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Nov 10, 2021
@tswast tswast added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Nov 10, 2021
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Nov 10, 2021
@tswast tswast merged commit 3d1af95 into googleapis:v3 Nov 10, 2021
tswast added a commit that referenced this pull request Mar 29, 2022
deps!: BigQuery Storage and pyarrow are required dependencies (#776)

fix!: use nullable `Int64` and `boolean` dtypes in `to_dataframe` (#786) 

feat!: destination tables are no-longer removed by `create_job` (#891)

feat!: In `to_dataframe`, use `dbdate` and `dbtime` dtypes from db-dtypes package for BigQuery DATE and TIME columns (#972)

fix!: automatically convert out-of-bounds dates in `to_dataframe`, remove `date_as_object` argument (#972)

feat!: mark the package as type-checked (#1058)

feat!: default to DATETIME type when loading timezone-naive datetimes from Pandas (#1061)

feat: add `api_method` parameter to `Client.query` to select `INSERT` or `QUERY` API (#967)

fix: improve type annotations for mypy validation (#1081)

feat: use `StandardSqlField` class for `Model.feature_columns` and `Model.label_columns` (#1117)

docs: Add migration guide from version 2.x to 3.x (#1027)

Release-As: 3.0.0
waltaskew pushed a commit to waltaskew/python-bigquery that referenced this pull request Jul 20, 2022
deps!: BigQuery Storage and pyarrow are required dependencies (googleapis#776)

fix!: use nullable `Int64` and `boolean` dtypes in `to_dataframe` (googleapis#786) 

feat!: destination tables are no-longer removed by `create_job` (googleapis#891)

feat!: In `to_dataframe`, use `dbdate` and `dbtime` dtypes from db-dtypes package for BigQuery DATE and TIME columns (googleapis#972)

fix!: automatically convert out-of-bounds dates in `to_dataframe`, remove `date_as_object` argument (googleapis#972)

feat!: mark the package as type-checked (googleapis#1058)

feat!: default to DATETIME type when loading timezone-naive datetimes from Pandas (googleapis#1061)

feat: add `api_method` parameter to `Client.query` to select `INSERT` or `QUERY` API (googleapis#967)

fix: improve type annotations for mypy validation (googleapis#1081)

feat: use `StandardSqlField` class for `Model.feature_columns` and `Model.label_columns` (googleapis#1117)

docs: Add migration guide from version 2.x to 3.x (googleapis#1027)

Release-As: 3.0.0
abdelmegahedgoogle pushed a commit to abdelmegahedgoogle/python-bigquery that referenced this pull request Apr 17, 2023
deps!: BigQuery Storage and pyarrow are required dependencies (googleapis#776)

fix!: use nullable `Int64` and `boolean` dtypes in `to_dataframe` (googleapis#786) 

feat!: destination tables are no-longer removed by `create_job` (googleapis#891)

feat!: In `to_dataframe`, use `dbdate` and `dbtime` dtypes from db-dtypes package for BigQuery DATE and TIME columns (googleapis#972)

fix!: automatically convert out-of-bounds dates in `to_dataframe`, remove `date_as_object` argument (googleapis#972)

feat!: mark the package as type-checked (googleapis#1058)

feat!: default to DATETIME type when loading timezone-naive datetimes from Pandas (googleapis#1061)

feat: add `api_method` parameter to `Client.query` to select `INSERT` or `QUERY` API (googleapis#967)

fix: improve type annotations for mypy validation (googleapis#1081)

feat: use `StandardSqlField` class for `Model.feature_columns` and `Model.label_columns` (googleapis#1117)

docs: Add migration guide from version 2.x to 3.x (googleapis#1027)

Release-As: 3.0.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants