dtype_backend overrides categories #2700
Comments
The wr.athena.read_sql_query API has a If you wish to override these defaults, to remove
data = wr.athena.read_sql_query(
sql="SELECT id, options FROM my_table",
database="my-database",
categories=["options"],
pyarrow_additional_kwargs={'types_mapper': None},
)
@jaidisido While that is a nice suggestion, it also does not work because of how aws-sdk-pandas/awswrangler/athena/_read.py Lines 150 to 153 in 4816e5e
For it to work correctly, you need to pass categories as additional kwargs as well:
data = wr.athena.read_sql_query(
sql="SELECT id, options FROM my_table",
database="my-database",
pyarrow_additional_kwargs={'types_mapper': None, 'categories': ['options']},
)
I'm not sure if the behaviour in
I can't think of a reason why it's set up that way, so I believe it's just badly indented. #2701 should fix that.
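To make the "badly indented" point concrete, here is a minimal pure-Python sketch of how such a nesting mistake would cause categories to be dropped whenever pyarrow_additional_kwargs is supplied. Both function names and the exact shape of the logic are assumptions for illustration; this is not the actual code in awswrangler/athena/_read.py.

```python
from typing import Any, Dict, List, Optional


def kwargs_nested(
    categories: Optional[List[str]],
    pyarrow_additional_kwargs: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Hypothetical 'badly indented' variant (assumption, not library code)."""
    if not pyarrow_additional_kwargs:
        pyarrow_additional_kwargs = {}
        # Nested under the branch above, so this is skipped whenever the
        # caller passes any extra kwargs of their own.
        if categories:
            pyarrow_additional_kwargs["categories"] = categories
    return pyarrow_additional_kwargs


def kwargs_dedented(
    categories: Optional[List[str]],
    pyarrow_additional_kwargs: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Hypothetical fix: the categories check runs in every case."""
    kwargs = dict(pyarrow_additional_kwargs or {})
    if categories:
        # Keep an explicit caller-supplied "categories" entry if present.
        kwargs.setdefault("categories", categories)
    return kwargs


extra = {"types_mapper": None}
nested = kwargs_nested(["options"], extra)
dedented = kwargs_dedented(["options"], extra)

print(nested)    # categories is missing
print(dedented)  # categories is forwarded alongside types_mapper
```

Under this assumption, passing categories inside pyarrow_additional_kwargs works as a workaround because it bypasses the skipped branch entirely.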
Describe the bug
I noticed that when using wr.athena.read_sql_query, the categories parameter was not having any effect on the returned pandas dataframe.
I did some investigation and realised that in the pa.Table.to_pandas method, the categorical conversion happens before the types_mapper is taken into account, so in effect, the categorical columns are always being converted back to string.
Removing the types_mapper kwarg, the categorical types are processed correctly.
How to Reproduce
The categories will be a string column.
Expected behavior
The columns specified in categories should be pd.Categorical types.
Your project
No response
Screenshots
No response
OS
MacOS 14.3.1
Python version
3.9.18
AWS SDK for pandas version
3.6.0
Additional context
PyArrow is version 15.0.0