Tweak DataFrame formatting to always use same # of digits #395

Closed
wesm opened this issue Nov 21, 2011 · 3 comments

@wesm
Member

wesm commented Nov 21, 2011

No description provided.

@lodagro
Contributor

lodagro commented Jan 8, 2012

Now that formatting always uses the same number of digits, one can wonder what precision actually means.
One gets more or less precision depending on the column's content.

In [1]: import numpy as np

In [2]: import pandas

In [3]: df = pandas.DataFrame({'A': [-1, 1e-2, 123456789, np.pi, np.sqrt(2)]})

In [4]: df
Out[4]: 
   A        
0 -1.0000000
1  0.0100000
2  1.235e+08
3  3.1415927
4  1.4142136

By the way, precision seems to be one digit off target when using the default float formatter, but is OK when using the eng one.

In [5]: df = pandas.DataFrame({'A': [np.pi, np.sqrt(2)]})

In [6]: df
Out[6]: 
   A    
0  3.142
1  1.414

In [7]: pandas.core.common._precision
Out[7]: 4

In [8]: pandas.set_printoptions(precision=3)

In [9]: pandas.core.common._precision
Out[9]: 3

In [10]: df
Out[10]: 
   A   
0  3.14
1  1.41

In [11]: pandas.set_eng_float_format(use_eng_prefix=False, precision=3)

In [12]: df
Out[12]: 
   A        
0  3.142E+00
1  1.414E+00

In [13]: df = pandas.DataFrame({'A': [-1, 1e-2, 123456789, np.pi, np.sqrt(2)]})

In [14]: df
Out[14]: 
   A          
0 -1.000E+00  
1  10.000E-03 
2  123.457E+06
3  3.142E+00  
4  1.414E+00  

In [15]: pandas.core.common._precision
Out[15]: 3
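
For readers who want to see where the eng-format digits come from, here is a minimal sketch of engineering notation in plain Python. It is not the pandas implementation: the function name `eng_format` and its `accuracy` parameter are hypothetical, chosen only for illustration. The exponent is forced to a multiple of 3, and `accuracy` counts digits to the right of the decimal point, which is why the number of significant digits varies from row to row.

```python
import math


def eng_format(value, accuracy=3):
    """Format a float with an exponent that is a multiple of 3.

    Hypothetical helper for illustration only; `accuracy` is the number
    of digits printed to the right of the decimal point.
    """
    if value == 0:
        exponent = 0
    else:
        # Round the base-10 exponent down to the nearest multiple of 3.
        exponent = int(math.floor(math.log10(abs(value)) / 3) * 3)
    mantissa = value / 10.0 ** exponent
    sign = "-" if exponent < 0 else "+"
    return "%.*fE%s%02d" % (accuracy, mantissa, sign, abs(exponent))


for x in [-1, 1e-2, 123456789, math.pi, math.sqrt(2)]:
    print(eng_format(x, accuracy=3))
```

With `accuracy=3` this prints the same strings as the `Out[14]` column above. The sketch does not handle a mantissa that rounds up to 1000, nor NaN or infinity.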

@adamklein
Contributor

I took precision to be the number of significant digits, in which case
it looks to me like the default float formatter shows the right
number of digits here, and the eng one does not. It would be good to
resolve this ambiguity.
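
The two readings can be compared directly with Python's own format codes. This is only a sketch of the distinction, not of how pandas formats floats: `%g` counts significant digits, while `%f` counts digits to the right of the decimal point. The helper names `by_significant_digits` and `by_decimal_places` are made up for this example.

```python
import math


def by_significant_digits(value, precision):
    """Hypothetical helper: precision = total significant digits (%g)."""
    return "%.*g" % (precision, value)


def by_decimal_places(value, precision):
    """Hypothetical helper: precision = digits after the decimal point (%f)."""
    return "%.*f" % (precision, value)


for precision in (4, 3):
    print(precision,
          by_significant_digits(math.pi, precision),
          by_decimal_places(math.pi, precision))
```

For pi, 4 significant digits and 3 decimal places both give `3.142`, so the two interpretations differ by exactly one digit for values between 1 and 10. That matches the one-digit discrepancy between the default and eng outputs in the session above.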

On Jan 8, 2012, at 2:22 PM, Wouter Overmeire wrote:

> Now that formatting always uses same number of digits, one can wonder what does precision mean?
> [...]

@lodagro
Contributor

lodagro commented Jan 8, 2012

Ok, so it's a matter of definition.

I think the definitions are:
precision = the effective number of decimal digits
accuracy = the effective number of these digits which appear to the right of the decimal point
But I often see precision used when accuracy is meant.

The eng float formatter uses its precision argument to indicate the number of digits which appear to the right of the decimal point. So for pandas it may be better to use accuracy instead of precision for the eng formatter (as matplotlib does).

What pandas means by precision is the minimum number of digits, since the actual precision (as defined above) depends on the column content; see my example in the previous comment.
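
To make the two definitions concrete, the following sketch counts both quantities for strings copied from the first `Out[4]` column. The helpers `significant_digits` and `decimal_places` are hypothetical and only handle plain decimal strings without an exponent; trailing zeros after the decimal point are counted as significant.

```python
def significant_digits(text):
    """Count digits from the first nonzero digit onward ("precision" above)."""
    digits = text.lstrip("+-").replace(".", "")
    return len(digits.lstrip("0"))


def decimal_places(text):
    """Count digits to the right of the decimal point ("accuracy" above)."""
    _, _, fraction = text.partition(".")
    return len(fraction)


column = ["-1.0000000", "0.0100000", "3.1415927", "1.4142136"]
for cell in column:
    print(cell, significant_digits(cell), decimal_places(cell))
```

Every cell has 7 decimal places, but the count of significant digits is 8, 6, 8 and 8. In other words the accuracy is fixed per column while the precision depends on the content.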

wesm pushed a commit that referenced this issue Jan 9, 2012
… accuracy

Conflicts:

	pandas/core/common.py
wesm closed this as completed Jan 10, 2012
dan-nadler pushed a commit to dan-nadler/pandas that referenced this issue Sep 23, 2019
Update CHANGES.md and fixtures tests