Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deserializing JSON subfields within structs fails #1500

Closed
jonathan-ostrander opened this issue Feb 15, 2023 · 12 comments · Fixed by #1587
Closed

Deserializing JSON subfields within structs fails #1500

jonathan-ostrander opened this issue Feb 15, 2023 · 12 comments · Fixed by #1587
Assignees
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@jonathan-ostrander
Copy link
Contributor

Environment details

  • OS type and version: MacOS Ventura 13.1
  • Python version: 3.9.5
  • pip version: 22.3.1
  • google-cloud-bigquery version: 3.5.0

Steps to reproduce

  1. Create table with record field containing a JSON subfield
  2. Query record field and try to read row from query result

Code example

from google.cloud import bigquery
bigquery.Client().query("SELECT record_with_json_subfield.* FROM <table>").result().__next__()

Stack trace

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
  File "lib/python3.9/site-packages/google/api_core/page_iterator.py", line 209, in _items_iter
    for item in page:
  File "lib/python3.9/site-packages/google/api_core/page_iterator.py", line 131, in __next__
    result = self._item_to_value(self._parent, item)
  File "lib/python3.9/site-packages/google/cloud/bigquery/table.py", line 2710, in _item_to_row
    _helpers._row_tuple_from_json(resource, iterator.schema),
  File "lib/python3.9/site-packages/google/cloud/bigquery/_helpers.py", line 435, in _row_tuple_from_json
    row_data.append(_field_from_json(cell["v"], field))
  File "lib/python3.9/site-packages/google/cloud/bigquery/_helpers.py", line 411, in _field_from_json
    return converter(resource, field)
  File "lib/python3.9/site-packages/google/cloud/bigquery/_helpers.py", line 368, in _record_from_json
    converter = _CELLDATA_FROM_JSON[subfield.field_type]
KeyError: 'JSON'
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label Feb 15, 2023
@jekozyra
Copy link

I just encountered the same issue on 2.34.4.

@chalmerlowe
Copy link
Collaborator

I ran this query using version 3.10.0 with no issues.

@1tux
Copy link

1tux commented May 28, 2023

I believe it happens with nested jsons. I encountered it on version 3.10.0 as well.

@1tux
Copy link

1tux commented May 28, 2023

Here is a hotfix that was working for me:

from google.cloud import bigquery

import json

def _json_from_json(value, _):
    """NOOP string -> string coercion"""
    return json.loads(value)


bigquery._helpers._CELLDATA_FROM_JSON['JSON'] = _json_from_json

@chalmerlowe chalmerlowe reopened this May 30, 2023
@SergeyShk
Copy link

Could you apply this hotfix to the stable version? It seems like to work

@sirkozuch
Copy link

sirkozuch commented Jun 8, 2023

The issue is happening also on earlier versions of the SDK. Tested on 2.20.0 and 2.34.1.

@masahitojp
Copy link

I have also tried with 3.11.3. What I have tried is the following configuration fields

        - name: structarr
          type: STRUCT
          mode: repeated
          fields:
            - name: key
              type: STRING
            - name: value
              type: JSON

I tried running this hotfix. I got the data without any error!
#1500 (comment)

@meredithslota meredithslota added type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. priority: p2 Moderately-important priority. Fix may not be included in next release. labels Aug 15, 2023
tswast added a commit to tomwojcik/python-bigquery that referenced this issue Oct 3, 2023
Linchin added a commit to tomwojcik/python-bigquery that referenced this issue Nov 1, 2023
tswast added a commit to tomwojcik/python-bigquery that referenced this issue Dec 13, 2023
@tswast tswast self-assigned this Dec 13, 2023
@vvkh
Copy link

vvkh commented Dec 14, 2023

@tswast still receiving the error:

$ pip list | grep bigquery
google-cloud-bigquery       3.14.1
  File "/home/datafold/.local/lib/python3.11/site-packages/google/cloud/bigquery/table.py", line 2962, in _item_to_row
    _helpers._row_tuple_from_json(resource, iterator.schema),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/datafold/.local/lib/python3.11/site-packages/google/cloud/bigquery/_helpers.py", line 300, in _row_tuple_from_json
    row_data.append(_field_from_json(cell["v"], field))
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/datafold/.local/lib/python3.11/site-packages/google/cloud/bigquery/_helpers.py", line 276, in _field_from_json
    return converter(resource, field)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/datafold/.local/lib/python3.11/site-packages/google/cloud/bigquery/_helpers.py", line 233, in _record_from_json
    converter = _CELLDATA_FROM_JSON[subfield.field_type]
                ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
KeyError: 'JSON'

Looks like JSON is still missing in _CELLDATA_FROM_JSON

I think #1742 should be merged as well.

@tswast tswast reopened this Dec 14, 2023
@tswast
Copy link
Contributor

tswast commented Dec 14, 2023

#1742 has now been merged.

@tswast tswast closed this as completed Dec 14, 2023
@tswast
Copy link
Contributor

tswast commented Dec 14, 2023

Note: Unfortunately, we're now in a holiday season release freeze, but you can test the change out with pip install --upgrade "git+https://github.com/googleapis/python-bigquery.git#egg=google-cloud-bigquery"

@colin-rogers-dbt
Copy link

@tswast when is the freeze over / is there a schedule for the next release?

@tswast
Copy link
Contributor

tswast commented Jan 9, 2024

@colin-rogers-dbt Thanks for the reminder. I don't see any blockers on #1752 so I'll cut the release now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.
Projects
None yet
Development

Successfully merging a pull request may close this issue.