BigQuery: TypeError: from_arrays() takes at least 2 positional arguments (1 given) #49

Closed
georghildebrand opened this issue Feb 27, 2020 · 6 comments · Fixed by #51
Labels: api: bigquery · type: question


@georghildebrand

Hi all, I tried the BigQuery client in Python with the default example. Since I moved from 1.23.1 to 1.24.0 last week, I get the following error.

It's related to pyarrow, but I did not upgrade pyarrow (it worked with this version before).

Environment details

  • Python 3.7.6

  • bigquery.version '1.24.0'

  • pyarrow.version '0.11.1'

  • Linux jupyter-generic 4.15.0-1057-aws #59-Ubuntu SMP Wed Dec 4 10:02:00 UTC 2019 x86_64

  • x86_64 x86_64 GNU/Linux

  • Name: google-cloud-bigquery

  • Version: 1.24.0

  • Summary: Google BigQuery API client library

  • Location: /opt/conda/lib/python3.7/site-packages

  • Requires: google-cloud-core, google-auth, six, google-resumable-media, protobuf, google-api-core

  • Required-by: pandas-gbq

Steps to reproduce

Just running a default example from the web: https://cloud.google.com/bigquery/docs/bigquery-storage-python-pandas

import google.auth
from google.cloud import bigquery
client = bigquery.Client.from_service_account_json('cred.json')

# Download query results.
query_string = """
SELECT
CONCAT(
    'https://stackoverflow.com/questions/',
    CAST(id as STRING)) as url,
view_count
FROM `bigquery-public-data.stackoverflow.posts_questions`
WHERE tags like '%google-bigquery%'
ORDER BY view_count DESC
"""

dataframe = (
    client.query(query_string)
    .result()
    .to_dataframe()
)
print(dataframe.head())

Stack trace

--------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-61d06599dbdd> in <module>
     12 
     13 dataframe = (
---> 14     client.query(query_string)
     15     .result()
     16     .to_dataframe()

/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/table.py in to_dataframe(self, bqstorage_client, dtypes, progress_bar_type, create_bqstorage_client)
   1727                 progress_bar_type=progress_bar_type,
   1728                 bqstorage_client=bqstorage_client,
-> 1729                 create_bqstorage_client=create_bqstorage_client,
   1730             )
   1731             df = record_batch.to_pandas()

/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/table.py in to_arrow(self, progress_bar_type, bqstorage_client, create_bqstorage_client)
   1541             record_batches = []
   1542             for record_batch in self._to_arrow_iterable(
-> 1543                 bqstorage_client=bqstorage_client
   1544             ):
   1545                 record_batches.append(record_batch)

/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/table.py in _to_page_iterable(self, bqstorage_download, tabledata_list_download, bqstorage_client)
   1433             )
   1434         )
-> 1435         for item in tabledata_list_download():
   1436             yield item
   1437 

/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/_pandas_helpers.py in download_arrow_tabledata_list(pages, bq_schema)
    523 
    524     for page in pages:
--> 525         yield _tabledata_list_page_to_arrow(page, column_names, arrow_types)
    526 
    527 

/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/_pandas_helpers.py in _tabledata_list_page_to_arrow(page, column_names, arrow_types)
    499 
    500     if isinstance(column_names, pyarrow.Schema):
--> 501         return pyarrow.RecordBatch.from_arrays(arrays, schema=column_names)
    502     return pyarrow.RecordBatch.from_arrays(arrays, names=column_names)
    503 

/opt/conda/lib/python3.7/site-packages/pyarrow/table.pxi in pyarrow.lib.RecordBatch.from_arrays()

TypeError: from_arrays() takes at least 2 positional arguments (1 given)

busunkim96 transferred this issue from googleapis/google-cloud-python Feb 27, 2020
product-auto-label bot added the "api: bigquery" label Feb 27, 2020
yoshi-automation added the "triage me" label Feb 28, 2020
@HemangChothani
Contributor

@georghildebrand Yes, you are right, this is caused by the older version of pyarrow. You need to install a newer version of pyarrow; the minimum required version for bigquery 1.24.0 is 0.15.0.
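
For anyone hitting the same traceback, here is a minimal sketch of checking the installed pyarrow against the 0.15.0 minimum mentioned above. The helper names (`parse_version`, `pyarrow_is_new_enough`, `check_installed_pyarrow`) are made up for illustration and are not part of google-cloud-bigquery or pyarrow. The comparison only looks at the first three numeric components, so pre-release suffixes are ignored.

```python
import importlib

# Minimum pyarrow version quoted in this thread for bigquery 1.24.0.
MIN_PYARROW = (0, 15, 0)


def parse_version(text):
    """Turn a string such as '0.11.1' into a tuple such as (0, 11, 1).

    Only the leading digits of the first three dot-separated pieces are
    used, so '2.0.0.dev1' becomes (2, 0, 0).
    """
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def pyarrow_is_new_enough(installed, minimum=MIN_PYARROW):
    """Return True if the installed version string meets the minimum."""
    return parse_version(installed) >= minimum


def check_installed_pyarrow():
    """Return True/False for the installed pyarrow, or None if it is missing."""
    try:
        pyarrow = importlib.import_module("pyarrow")
    except ImportError:
        return None
    return pyarrow_is_new_enough(pyarrow.__version__)


if __name__ == "__main__":
    print("pyarrow new enough:", check_installed_pyarrow())
```

With the version from the report, `pyarrow_is_new_enough("0.11.1")` evaluates to `False`, which matches the failure described in this issue.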

@HemangChothani
Contributor

@plamut @tswast Can we specify a minimum required version of pyarrow somewhere (in the setup.py file) for the updated version of bigquery?

HemangChothani added the "type: question" label and removed the "triage me" label Feb 28, 2020
HemangChothani self-assigned this Feb 28, 2020
@plamut
Contributor

plamut commented Feb 28, 2020

@HemangChothani That would make sense. Is pyarrow 0.15.0 the first version that works fine? Does everything work in 0.16.0, too?

Asking because 0.16.0 is the first version compatible with Python 3.8, and there is already a PR that will declare full Python 3.8 support. If you can, please check whether your findings are consistent with mine and whether the pyarrow minimum should actually be bumped to 0.16.0, thanks!

@georghildebrand
Author

georghildebrand commented Feb 28, 2020

@HemangChothani, thank you. I confirm that upgrading pyarrow solves the issue!

+1 for updating the dependency in the setup.

@HemangChothani
Contributor

@plamut Hmm, 0.15.0 isn't compatible with Python 3.8, so I will bump the minimum required version to 0.16.0.
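
As a rough illustration of what declaring such a minimum could look like, here is a sketch of an optional-dependency table of the kind passed to setuptools via `extras_require`. It is not the actual change made for this issue. The `build_extras` helper and the `all` aggregate are hypothetical, and treating pyarrow as an optional extra is an assumption based on it not appearing under "Requires" in the report above.

```python
# Hypothetical sketch: declare a lower bound for an optional dependency.
# MIN_PYARROW mirrors the 0.16.0 figure discussed in this thread.
MIN_PYARROW = "0.16.0"


def build_extras(min_pyarrow=MIN_PYARROW):
    """Build a dict suitable for setuptools' extras_require argument."""
    extras = {
        # pip refuses to satisfy this extra with anything older than the bound.
        "pyarrow": ["pyarrow>={}".format(min_pyarrow)],
    }
    # Illustrative aggregate extra collecting every optional requirement.
    extras["all"] = sorted({req for reqs in extras.values() for req in reqs})
    return extras


if __name__ == "__main__":
    # In a real setup.py this would be passed as
    # setuptools.setup(..., extras_require=build_extras()).
    print(build_extras())
```

With a bound like this in place, an environment that still has pyarrow 0.11.1 would be upgraded when the extra is installed, instead of failing later at call time inside `to_dataframe()`.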

@stiebels

stiebels commented Nov 24, 2020

Had the same issue with:
python==3.7.0
pyarrow==2.0.0
google-cloud-bigquery==2.4.0

Downgrading pyarrow to 1.0.1 solved it for me.
