Don't print decimals if no value in vector has decimals. #62

strengejacke · 2017-11-03T09:15:00Z

When importing data from other software packages into R (e.g. from Stata, SAS or SPSS, using haven), vectors are of type double, even if all their values are whole numbers.

Would you mind checking whether a vector has actual "floating point" values or only "integer-doubles", and omitting the decimals in the latter case? (Something like is.numeric(x) && !all(x %% 1 == 0, na.rm = TRUE) would detect actual fractions.)
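The check proposed above can be sketched as a small helper. This is only an illustration: `is_whole_double()` is a hypothetical name, not a function in tibble or pillar, and it tests the complement of the expression above (no fractional values at all).

```r
# Hypothetical helper (not part of tibble or pillar): TRUE if `x` is stored
# as double but every non-missing value is a whole number.
is_whole_double <- function(x) {
  is.double(x) && all(x %% 1 == 0, na.rm = TRUE)
}

is_whole_double(c(1, 2, 3))     # whole numbers stored as double
is_whole_double(c(1, 2.5, NA))  # contains a real fraction
is_whole_double(c(1L, 2L, 3L))  # already integer, so not a double at all
```

Because `na.rm = TRUE` drops missing comparisons, `NA` values (and `Inf`, for which `%%` yields `NaN`) do not count as fractions in this sketch.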

Current output:

library(tibble)
tibble(a = c(1, 2, 3), b = c(1L, 2L, 3L))
#> # A tibble: 3 x 2
#>       a     b
#>   <dbl> <int>
#> 1  1.00     1
#> 2  2.00     2
#> 3  3.00     3

Since all values in a are whole numbers, the desired output would look like column b. The problem is that this is a guess: the column may really be a double, or it may have been intended as an integer. But I can imagine (new) R users being confused when they see their values as "integers" in the SPSS data sheet and as doubles in the R console.


strengejacke · 2017-11-03T09:30:06Z

Another example: When I print the vector alone, the desired output is shown:

library(tibble)
x <- tibble(a = c(1, 2, 3), b = c(1L, 2L, 3L), c = c(1.23, 1.34, 1.45))
x$a
#> [1] 1 2 3
x$b
#> [1] 1 2 3
x$c
#> [1] 1.23 1.34 1.45

Same behaviour would be nice for tibbles as well - I hope you think this makes sense.

krlmlr · 2018-01-10T22:46:51Z

@hadley: Should we omit the dot and the zeros after the dot if we only see whole numbers? Current output:

pillar::pillar(as.numeric(1:3))
#> <dbl>
#>  1.00
#>  2.00
#>  3.00

Created on 2018-01-10 by the reprex package (v0.1.1.9000)

hadley · 2018-01-10T23:32:32Z

Hmmm, I don't think it's a good idea to do this. For performance reasons, we can only inspect the rows being printed, so this seems potentially misleading to me.
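To make the concern concrete, here is a minimal sketch with made-up data in which inspecting only the printed rows gives a different answer than inspecting the whole column. The vector `x` and the helper `all_whole()` are hypothetical and only serve as an illustration; this is not how pillar is implemented.

```r
# Hypothetical column: the first 10 values are whole numbers,
# the 11th carries a fraction.
x <- c(1:10, 11.5)

# Hypothetical helper: are all non-missing values whole numbers?
all_whole <- function(v) all(v %% 1 == 0, na.rm = TRUE)

shown <- head(x, 10)  # the rows a default print would display

all_whole(shown)  # only looks at the displayed rows
all_whole(x)      # looks at the entire column
```

The first call reports whole numbers only while the second does not, so a display based on the printed rows would hide the fraction in row 11.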

krlmlr · 2018-01-11T08:57:25Z

I've thought about that, too. What's the chance that a column contains fractions if the first 10 entries don't have any? If we assume that only .0 and .5 are present, and a uniform distribution, that's < 0.1%. If we assume .0 through .9, that's 10⁻¹⁰. We have the type indicator, too.
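The two figures above can be reproduced with a short calculation. It assumes, as in the comment, that the 10 displayed values are independent and that each possible fractional part is equally likely; the variable names are made up for this sketch.

```r
n_rows <- 10  # number of displayed rows

# Only .0 and .5 occur: each value is whole with probability 1/2.
p_two_values <- (1 / 2)^n_rows

# .0 through .9 occur: each value is whole with probability 1/10.
p_ten_values <- (1 / 10)^n_rows

p_two_values  # just under 0.1 %
p_ten_values  # about 1e-10
```

Both numbers are the probability of seeing 10 whole numbers in a row under the stated assumptions; real data that is sorted or clustered need not behave this way.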

The digits.secs option also triggers fractional seconds only for the displayed data.

We can really fix this only if the column contains some metadata that describes all values.

Maybe make this an option? Printing only the dot but not the trailing zeros doesn't look appealing to me:

pillar::pillar(as.numeric(1:3))
#> <dbl>
#>    1.
#>    2.
#>    3.

hadley · 2018-01-11T13:03:53Z

Based on readr experience, quite high.

I'd rather not add more options.

ghost · 2018-01-27T20:39:46Z

I'd agree with @hadley, that there are many cases where the first 10 entries don't include any digits to the right of the decimal, while somewhere in the data they do, but I'm not sure that's more common than the other way around. You may be trying to avoid a common but minority misrepresentation by using a method that misrepresents the data the majority of the time.

I understand the performance benefit for only checking the rows that are printed. If that's the way pillar displays data (check the rows you print), why not have that be the data you're representing (the rows you print)?

Trailing zeros have a meaning. They mean somewhere in this data there is an entry with values to the right of the decimal. If printing using pillar is supposed to give you information about ALL the data (instead of just the data it prints) while only checking a portion, you're either going to have to find some magic, cache the checks of the entire data when an object is created (change other packages), or decide between two cases where the wrong meaning is displayed (as a trade off for the performance). In one case the display tells you there are later, unprinted entries with values to the right of the decimal (when there aren't); in the other case the display tells you there are no later unprinted entries with values to the right of the decimal (when there are). I'm not sure it's clear that the second option (the new way tibbles print) is better than the first.

I'd also add that this behavior for data with no values to the right of the decimal is not the way printing that same data was handled by tibbles previously. So, though different things surprise different people, this will be surprising (at least for a while) for most users of the tidyverse.

krlmlr · 2018-02-07T08:48:05Z

Closing in favor of #40: Adding a trailing dot but without decimals in these cases.

Display
-------

- Turned off using subtle style for digits that are considered insignificant. Set the new option `pillar.subtle_num` to `TRUE` to turn it on again (default: `FALSE`).
- The negation sign is printed next to the number again (#91).
- Scientific notation uses regular digits again for exponents (#90).
- Groups of three digits are now underlined, starting with the fourth before/after the decimal point. This gives a better idea of the order of magnitude of the numbers (#78).
- Logical columns are displayed as `TRUE` and `FALSE` again (#95).
- The decimal dot is now always printed for numbers of type `numeric`. Trailing zeros are not displayed anymore if all displayed numbers are whole numbers (#62).
- Decimal values longer than 13 characters always print in scientific notation.

Bug fixes
---------

- Numeric values with a `"class"` attribute (e.g., `Duration` from lubridate) are now formatted using `format()` if the `pillar_shaft()` method is not implemented for that class (#88).
- Very small numbers (like `1e-310`) are now printed correctly (tidyverse/tibble#377).
- Fix representation of right-hand side for `getOption("pillar.sigfig") >= 6` (tidyverse/tibble#380).
- Fix computation of significant figures for numbers with absolute value >= 1 (#98).

New functions
-------------

- New styling helper `style_subtle_num()`, formatting depends on the `pillar.subtle_num` option.

Display
-------

- Turned off using subtle style for digits that are considered insignificant. Negative numbers are shown all red. Set the new option `pillar.subtle_num` to `TRUE` to turn it on again (default: `FALSE`).
- The negation sign is printed next to the number again (#91).
- Scientific notation uses regular digits again for exponents (#90).
- Groups of three digits are now underlined, starting with the fourth before/after the decimal point. This gives a better idea of the order of magnitude of the numbers (#78).
- Logical columns are displayed as `TRUE` and `FALSE` again (#95).
- The decimal dot is now always printed for numbers of type `numeric`. Trailing zeros are not shown anymore if all displayed numbers are whole numbers (#62).
- Decimal values longer than 13 characters always print in scientific notation.

Bug fixes
---------

- Numeric values with a `"class"` attribute (e.g., `Duration` from lubridate) are now formatted using `format()` if the `pillar_shaft()` method is not implemented for that class (#88).
- Very small numbers (like `1e-310`) are now printed correctly (tidyverse/tibble#377).
- Fix representation of right-hand side for `getOption("pillar.sigfig") >= 6` (tidyverse/tibble#380).
- Fix computation of significant figures for numbers with absolute value >= 1 (#98).

New functions
-------------

- New styling helper `style_subtle_num()`, formatting depends on the `pillar.subtle_num` option.

github-actions · 2020-12-09T00:52:44Z

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.

ilarischeinin mentioned this issue Jan 11, 2018

Control significant figures with an option #72

Closed

krlmlr closed this as completed Feb 7, 2018

github-actions bot locked and limited conversation to collaborators Dec 9, 2020
