fix: add missing handler for deserializing json value #1587
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
@tswast bump, please review
Thanks!

One concern I have is the other direction. If someone supplies a parsed JSON object to an API like `insert_rows`, do we correctly serialize that?
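For context, a minimal sketch of that direction, assuming a hypothetical table `mydataset.table` with an `INT64` column `id` and a `JSON` column `payload` (the names and the explicit `json.dumps` call are illustrative, not something prescribed by this PR):

```python
# Hypothetical sketch: supplying an already-parsed JSON value (a dict)
# on insert rather than a pre-serialized string. Table and column
# names are made up for illustration.
import json

from google.cloud import bigquery

client = bigquery.Client()
parsed = {"first_name": "Jane", "last_name": "Doe"}  # parsed JSON object

# Serializing explicitly before the insert sidesteps the question of
# whether the client library serializes parsed objects itself.
rows_to_insert = [{"id": 1, "payload": json.dumps(parsed)}]
errors = client.insert_rows_json("mydataset.table", rows_to_insert)
assert not errors
```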
Looks like everything is working well except for the type annotations: …
This change is causing the following error for … Is there a recommended fix for this?
Thanks @MartijnvanElferen for the report. I've filed #1756 to investigate further.
As a workaround …
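The suggested workaround itself is truncated above; one plausible stopgap, assuming the breakage is that JSON cells now come back as parsed Python objects rather than `str`, would be to normalize the cell before any code that still expects a string:

```python
# Hypothetical workaround sketch (the original suggestion above is
# truncated): re-serialize the cell so downstream code that expects
# the pre-3.15.0 str behavior keeps working on either library version.
import json


def ensure_json_str(cell):
    # Newer versions of google-cloud-bigquery may return the JSON cell
    # already parsed (dict/list); older versions returned the raw string.
    return cell if isinstance(cell, str) else json.dumps(cell)
```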
@MartijnvanElferen Fix pending: #1757. Note: this PR will change the behavior of JSON data in …
Thanks, @tswast. I'll refactor a couple of things to avoid the type conversion in the meantime. Looking forward to the fix!
@tswast This change should probably be listed as a potentially breaking change in the release notes, since before this change reading the entire column would return a value of type `str`. For example:

```sql
CREATE TABLE mydataset.table(
  id INT64,
  proto_as_json JSON
);
```

```python
from google.cloud import bigquery
from google.protobuf import json_format

from com.example.my_proto_pb2 import MyProto

client = bigquery.Client()

proto = MyProto(first_name="Jane", last_name="Doe")
proto_as_json = json_format.MessageToJson(
    proto, use_integers_for_enums=True, indent=None)

rows_to_insert = [
    {"id": 1, "proto_as_json": f'JSON """{proto_as_json}"""'},
]
errors = client.insert_rows("mydataset.table", rows_to_insert)
assert not errors

row = client.query("SELECT proto_as_json FROM mydataset.table").result().__next__()
json_str = row["proto_as_json"]

# Prior to this change, json_str was of type str, so we could do this:
read_proto = MyProto()
json_format.Parse(json_str, read_proto)
assert proto == read_proto
```

Luckily, in our project we created a common helper method for the …
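The helper mentioned above isn't shown in the thread; a minimal sketch of what such a helper might look like, assuming its job is to hand `json_format.Parse` a string on either library version (the name `parse_json_cell` is made up):

```python
# Hypothetical helper: parse a BigQuery JSON cell into a protobuf
# message whether the client returned it as a raw JSON string
# (python-bigquery < 3.15.0) or as an already-parsed Python object
# (>= 3.15.0). Not the thread author's actual implementation.
import json

from google.protobuf import json_format


def parse_json_cell(cell, message):
    if not isinstance(cell, str):
        cell = json.dumps(cell)  # re-serialize a parsed dict/list
    json_format.Parse(cell, message)
    return message
```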
Yeah, it's a gray area with this, since we never said we supported the JSON type until this change. We still don't support JSON in some code paths, like `to_dataframe()`.
@tswast We saw the breakage when we tried to update a number of Python libraries to newer versions. The release notes were very unhelpful in determining which library and version could have caused the problem (because the PR and bug listed in the release notes suggested that it only affected selecting subfields). It wasn't until we found a workaround that we were able to track the issue to v3.15.0 and this change. Even if there were no prior promises of support for the JSON type, a short blurb in the release notes would be very much appreciated.
Fixes #1500