CSV Read Fails on dtypes.int32 After Recent int_ Removal #5601

Closed
stanbrub opened this issue Jun 11, 2024 · 3 comments · Fixed by #5602
Labels
bug Something isn't working triage

Comments

@stanbrub
Contributor

The recent removal of dtypes.int_ caused an error in a benchmark. When switching to dtypes.int32 instead, the benchmark still fails but with a different error. I've attached the stack trace.

It looks like the CSV read does not support all the scalar dtypes. I'm looking for a true 32-bit int, not a 64-bit one, for my test (though I'm not sure that's doable from the Python side; maybe np.int32?). The exception is raised from /opt/deephaven/venv/lib/python3.10/site-packages/deephaven/csv.py.

csv-read-datatype-exception.txt
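
To make the 32-bit versus 64-bit distinction concrete, here is a minimal sketch that does not depend on Deephaven and uses only Python's standard struct module. The helper name fits_in is hypothetical; it only illustrates the value ranges involved and says nothing about how csv.py handles dtypes.

```python
import struct

INT32 = "<i"  # 4-byte signed integer, little-endian, standard size
INT64 = "<q"  # 8-byte signed integer, little-endian, standard size


def fits_in(fmt: str, value: int) -> bool:
    """Return True if value can be packed with the given struct format."""
    try:
        struct.pack(fmt, value)
        return True
    except struct.error:
        # struct raises when the value is outside the format's range
        return False


int32_size = struct.calcsize(INT32)
int64_size = struct.calcsize(INT64)

big = 2**31  # one past the largest signed 32-bit value

print(int32_size, int64_size)
print(fits_in(INT32, big), fits_in(INT64, big))
```

A value such as 2**31 is the kind of input that separates the two widths, which is why a test that needs a true 32-bit column cannot simply substitute a 64-bit one.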

@stanbrub stanbrub added bug Something isn't working triage labels Jun 11, 2024
@devinrsmith
Member

dtypes.int_ was dtypes.int64, not dtypes.int32; try dtypes.int64 to see if it works.

@stanbrub
Contributor Author

stanbrub commented Jun 11, 2024

Here's an example of using int32 in the console that fails.

from deephaven import dtypes as dht, empty_table
from deephaven.csv import read, write

source = empty_table(10).update(['col1=i'])
write(source, '/data/dtypes.bug.csv')
result = read('/data/dtypes.bug.csv',{'int32':dht.int32})

If I read with int64, the result column is a 64-bit int. With int32, the read fails with an exception. With no types specified, the result column is a 32-bit int.
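
One way to picture the untyped case: a CSV reader that infers types may pick the narrowest integer type that holds every value in a column. The sketch below is a toy model of that idea in plain Python. It is an assumption for illustration, not deephaven-csv's actual algorithm, and infer_int_type and the type names it returns are hypothetical.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def infer_int_type(cells):
    """Return 'int32', 'int64', or None for a column of CSV text cells."""
    try:
        values = [int(cell) for cell in cells]
    except ValueError:
        return None  # at least one cell is not an integer
    if all(INT32_MIN <= v <= INT32_MAX for v in values):
        return "int32"
    if all(INT64_MIN <= v <= INT64_MAX for v in values):
        return "int64"
    return None  # too large for either width


# Ten small values, similar in spirit to col1=i on a 10-row table
small = infer_int_type([str(i) for i in range(10)])
# A single value past the 32-bit range forces the wider type
large = infer_int_type(["0", str(2**31)])

print(small, large)
```

Under this toy model, a column of small row indices would come back as a 32-bit int when no types are given, consistent with the behavior described above.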

@devinrsmith
Member

I noticed that there is a mis-mapping to parsers in csv.py, but it seems like there is a more fundamental issue (even after fixing the mappings):

from deephaven import read_csv, dtypes

actual = read_csv('primitive_types.csv', {
    'bool_': dtypes.bool_,
    'byte': dtypes.byte,
    'char': dtypes.char,
    'short': dtypes.short,
    'int32': dtypes.int32,
    'long': dtypes.long,
    'float32': dtypes.float32,
    'double': dtypes.double,
})

primitive_types.csv:

bool_,byte,char,short,int32,long,float32,double
true,42,a,42,42,42,42.42,42.42

The specs look correct:

image

The results are wrong:

image

We may need to file an issue in https://github.com/deephaven/deephaven-csv
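
For comparison, this is roughly what one would expect each column of the sample row to hold if every requested type were honored. It is a plain-Python sketch using the standard csv module, not Deephaven code. The CONVERTERS table and the to_int, to_bool, and to_char helpers are hypothetical stand-ins for a dtype-to-parser mapping, and the bit widths assume byte, short, int32, and long are 8, 16, 32, and 64 bits.

```python
import csv
import io

SAMPLE = """bool_,byte,char,short,int32,long,float32,double
true,42,a,42,42,42,42.42,42.42
"""


def to_int(bits):
    """Build a parser for a signed integer of the given bit width."""
    lo, hi = -2**(bits - 1), 2**(bits - 1) - 1

    def parse(text):
        value = int(text)
        if not lo <= value <= hi:
            raise ValueError(f"{value} does not fit in {bits} bits")
        return value

    return parse


def to_bool(text):
    if text.lower() not in ("true", "false"):
        raise ValueError(f"not a boolean: {text!r}")
    return text.lower() == "true"


def to_char(text):
    if len(text) != 1:
        raise ValueError(f"not a single character: {text!r}")
    return text


# Hypothetical column-name -> parser table (not csv.py's real mapping)
CONVERTERS = {
    "bool_": to_bool,
    "byte": to_int(8),
    "char": to_char,
    "short": to_int(16),
    "int32": to_int(32),
    "long": to_int(64),
    "float32": float,  # Python has no 32-bit float; precision is not modeled
    "double": float,
}

reader = csv.DictReader(io.StringIO(SAMPLE))

# Fail early if a column has no parser, the kind of gap a mis-mapping creates
missing = set(reader.fieldnames) - set(CONVERTERS)
if missing:
    raise KeyError(f"no parser for columns: {sorted(missing)}")

rows = [
    {name: CONVERTERS[name](cell) for name, cell in row.items()}
    for row in reader
]
print(rows[0])
```

Keeping one parser per declared type, with a range check per width, makes it easy to see whether a column came back with the type that was asked for or was silently widened.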

devinrsmith added a commit to devinrsmith/deephaven-core that referenced this issue Jun 11, 2024
devinrsmith added a commit that referenced this issue Jun 11, 2024