pd.read_json converts large floats to inf #21454

Open
mtrudeau314 opened this issue Jun 12, 2018 · 2 comments
Labels
Bug, IO JSON (read_json, to_json, json_normalize)

Comments


mtrudeau314 commented Jun 12, 2018

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd

pd.DataFrame({'A': [np.finfo(np.float64).max]}).to_json(orient='table')
# '{"schema": {"fields":[{"name":"index","type":"integer"},{"name":"A","type":"number"}],"primaryKey":["index"],"pandas_version":"0.20.0"}, "data": [{"index":0,"A":1.797693135e+308}]}'

pd.read_json(pd.DataFrame({'A': [np.finfo(np.float64).max]}).to_json(orient='table'), orient='table')
#      A
# 0  inf

Problem description

to_json with orient='table' appears to work correctly, but read_json with orient='table' converts large floats to inf. I am not sure exactly where the cutoff is, but read_json handles smaller numbers correctly; a quick probe is sketched below.
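
A rough probe on my end (just a sketch, nothing authoritative) suggests the round trip only breaks for values very close to the float64 maximum:

import numpy as np
import pandas as pd

# A value slightly below float64 max comes back unchanged:
pd.read_json(pd.DataFrame({'A': [1.79e+308]}).to_json(orient='table'), orient='table')
# column A round-trips as 1.79e+308

# Right at float64 max the value comes back as inf:
pd.read_json(pd.DataFrame({'A': [np.finfo(np.float64).max]}).to_json(orient='table'), orient='table')
# column A comes back as inf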

Expected Output

pd.read_json(pd.DataFrame({'A': [np.finfo(np.float64).max]}).to_json(orient='table'), orient='table')
                  A
0  1.797693135e+308

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.23.0
pytest: None
pip: 10.0.1
setuptools: 39.0.1
Cython: None
numpy: 1.14.2
scipy: None
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@WillAyd WillAyd added the IO JSON label Jun 12, 2018

WillAyd commented Jun 12, 2018

Thanks for the report. May or may not be related to some of the work being done in #21408


WillAyd commented Jun 13, 2018

On second glance I think this is a matter of precision when getting this close to the upper limit of your platform. Note that to_json by default provides 10 digits of precision, rounding up at the end. That is responsible for the following behavior in Python:

In [7]: np.finfo(np.float64).max
Out[7]: 1.7976931348623157e+308

In [8]: 1.7976931348623157e+308
Out[8]: 1.7976931348623157e+308

In [9]: 1.797693135e+308  # What appeared in to_json
Out[9]: inf

I suppose this could theoretically be avoided on your end by increasing the double_precision on output to 17, though that raises an error in and of itself:

In [23]: pd.DataFrame({'A': [np.finfo(np.float64).max]}).to_json(orient='table', double_precision=17)
~/clones/pandas/pandas/io/json/json.py in _write(self, obj, orient, double_precision, ensure_ascii, date_unit, iso_dates, default_handler)
    110             date_unit=date_unit,
    111             iso_dates=iso_dates,
--> 112             default_handler=default_handler
    113         )
    114 

ValueError: Invalid value '17' for option 'double_precision', max is '15'
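
Note that even the maximum allowed double_precision=15 would not round-trip this value, which is why a 17 digit limit would be needed. A rough check in plain Python (a sketch, assuming the encoder keeps roughly double_precision significant digits for values written in exponent form, as the output above suggests):

import numpy as np

big = np.finfo(np.float64).max     # 1.7976931348623157e+308

# Rounded to 15 significant digits the value already exceeds float64 max,
# so parsing it back overflows:
float('%.14e' % big)               # inf

# 17 significant digits are enough for an exact round trip:
float('%.16e' % big)               # 1.7976931348623157e+308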

Investigation into the feasibility of a 17-digit limit and PRs are certainly welcome!
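
In the meantime, one possible workaround (just a sketch on my side, not something the table schema guarantees) is to ship the column as full-precision strings and cast back after reading:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [np.finfo(np.float64).max]})

# Serialize the floats as their full-precision repr strings so nothing gets
# rounded inside the JSON payload, then cast back to float64 after reading.
payload = df.assign(A=df['A'].map(repr)).to_json(orient='table')
restored = pd.read_json(payload, orient='table')
restored['A'] = restored['A'].astype(np.float64)   # 1.7976931348623157e+308 again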

@WillAyd WillAyd added the Numeric Operations and Effort Low labels Jun 13, 2018
@gfyoung gfyoung added the Bug label Jun 13, 2018
@mroeschke mroeschke removed the Numeric Operations label Jun 20, 2021