BigQuery Storage: Add support for arrow format in BQ Read API #8644

Merged: 10 commits, Jul 11, 2019

Contributor

@tswast tswast commented Jul 11, 2019

  • Makes _StreamParser abstract and splits it into two implementations: one for Arrow and one for Avro. The implementation is selected based on the schema set in the ReadSession.
  • Adds to_arrow to reader classes.
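
For readers skimming the description, the two bullets above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `FakeReadSession`, its `schema_type` field, the class names, and the plain dict returned by `to_arrow` are all hypothetical stand-ins so the example runs without fastavro or pyarrow.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class FakeReadSession:
    """Hypothetical stand-in for ReadSession (not the real protobuf type)."""

    schema_type: str  # assumed labels: "avro_schema" or "arrow_schema"


class StreamParser(ABC):
    """Abstract parser; each subclass handles one wire format."""

    @abstractmethod
    def to_rows(self, message):
        """Return the rows contained in one response message."""

    def to_arrow(self, message):
        # Only the Arrow parser overrides this in the sketch.
        raise NotImplementedError("to_arrow is not supported for this format")

    @staticmethod
    def from_read_session(read_session):
        """Pick the implementation that matches the session's schema."""
        if read_session.schema_type == "avro_schema":
            return AvroParser(read_session)
        if read_session.schema_type == "arrow_schema":
            return ArrowParser(read_session)
        raise TypeError(
            "Unsupported schema type: {}".format(read_session.schema_type)
        )


class AvroParser(StreamParser):
    def __init__(self, read_session):
        self._read_session = read_session

    def to_rows(self, message):
        # Real code would decode Avro bytes; here a message is a list of dicts.
        return list(message)


class ArrowParser(StreamParser):
    def __init__(self, read_session):
        self._read_session = read_session

    def to_rows(self, message):
        return list(message)

    def to_arrow(self, message):
        # Real code would build a pyarrow object; a dict of columns stands in.
        columns = {}
        for row in message:
            for key, value in row.items():
                columns.setdefault(key, []).append(value)
        return columns
```

A caller would obtain a parser with `StreamParser.from_read_session(session)` and never name the concrete class, which is what lets the reader classes stay format-agnostic.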

Split out from #8551 so that the changes to google-cloud-bigquery-storage can be submitted and released separately from the changes to google-cloud-bigquery.

@tswast tswast requested a review from a team July 11, 2019 13:46
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Jul 11, 2019
@tswast tswast added the api: bigquerystorage Issues related to the BigQuery Storage API. label Jul 11, 2019

_AVRO_BYTES_OPERATION = "parse ReadRowResponse messages with Avro bytes"
_ARROW_BYTES_OPERATION = "parse ReadRowResponse messages with Arrow bytes"
_FASTAVRO_REQUIRED = "fastavro is required to {operation}."
Contributor

I assume these parameterized errors are for when you do things like to_arrow with avro bytes?

Contributor Author

@tswast tswast Jul 11, 2019

Yeah, this is vestigial from when I was planning to implement to_arrow for Avro streams. Removed for now.


        Args:
            read_session (google.cloud.bigquery_storage_v1beta1.types.ReadSession):
                A read session. This is required because it contains the schema
                used in the stream messages.
        """
        if fastavro is None:
            raise ImportError(_FASTAVRO_REQUIRED)
Contributor

Does this need to be parameterized as well?

Contributor Author

Removed the {operation} from error message since I didn't actually need it.

@tswast tswast merged commit c5a7cd2 into googleapis:master Jul 11, 2019
@tswast tswast deleted the pr8551-only-bqstorage branch July 11, 2019 20:04
4 participants