[Feature Request] Print column specification with write_csv #895

bschneidr · 2018-10-10T21:15:56Z

When a csv created by write_csv() is read into another script, it would be helpful to have the column specification already produced as a side-effect of the write_csv() call that produced it.

Here's one way that could look.

After write_csv() is used, a message is printed with the column specification matching the datatypes of the dataframe that was input to write_csv().

library(tibble)
library(lubridate)
library(readr)

df <- tibble(ID = c("01", "42"),
             Date = ymd(c("2018-10-31", "2010-10-30")))

write_csv(df, path = "Data.csv")

#> To read "Data.csv", you can use the following column specification:
#> cols(
#>      ID = col_character(),
#>      Date = col_date(format = "")
#>      )

This would make it easier to integrate write_csv() into a data-processing pipeline that plays nicely with Git.

mpettis · 2019-03-19T16:37:03Z

I'd like to second this request, but add that it would be nice to not just display the column spec, but return them as first-class objects of some sort. I was thinking of something like:

df <- tibble(a=1L, b=1.0, c="a", d=TRUE, e=ymd_hms("2019-03-19T13:15:18Z"), f=ymd("2019-03-19"))

# Proposed functions:
# gen_spec(df)
#> cols(
#>   a = col_integer(),
#>   b = col_double(),
#>   c = col_character(),
#>   d = col_logical(),
#>   e = col_datetime(),
#>   f = col_date()
#> )
#
# gen_spec_short(df)
#> "idclTD"

Thank you for the work and consideration.

mpettis · 2019-03-19T20:45:07Z

I have also asked this question (as to how others have done this) and posted my local solution here: https://stackoverflow.com/q/55249599/1022967

at062084 · 2019-03-28T16:23:31Z

Proposed function / workflow to migrate from base R data frames to readr

# prepare migration to readr::read_*
# df <- some base R data frame
# new (proposed) method to extract the current col_types
df.col_types <- spec_extract(df)
saveRDS(df.col_types, "df.col_types.rds")
write_delim(df, path = "df.csv")
# read (write_delim() separates fields with a space by default)
df <- read_delim("df.csv", delim = " ", col_types = readRDS("df.col_types.rds"))

jimhester · 2019-05-03T16:52:17Z

So you can now generate a column specification from any data.frame with as.col_spec(df) and also optionally convert it to the concise string representation with as.character(), which should be all you need to do this fairly easily.

Currently I don't think it makes sense to print the spec out by default, but we can revisit it in a separate issue if needed.

bschneidr · 2019-05-03T19:00:56Z

This is fantastic. Thank you!
Since the as.col_spec(df) function call is so simple, there seems to be little marginal value in having write_csv() automatically print the column specification as a side effect.

Dulani · 2019-09-26T03:33:22Z

@jimhester You said:

So you can now generate a column specification from any data.frame with as.col_type(df) and also optionally convert it to the concise string representation with as.character(), which should be all you need to do this fairly easily.

I see that there is a new as.col_spec(df) function, but I can't find an as.col_type(df) anywhere in the current (1.3.1) version or by searching this repository on GitHub. What am I missing?

What I'd really like to do is what you suggest in your post. Take a data frame or its column specification and automatically generate the concise string representation of the column specification.

In other words, something like this: as.col_type(mtcars) %>% as.character(). Is this doable using existing functionality in readr, as you suggest?

jimhester · 2019-09-26T13:02:59Z

It was a typo, I meant as.col_spec(df).

Dulani · 2019-09-28T03:18:04Z

Thanks @jimhester! However, as.col_spec(mtcars) %>% as.character() does not produce a concise string representation. The first part as.col_spec(mtcars) produces the error: Error: col_types must be NULL, a list or a string.

The example in the documentation does convert a concise string representation into a col_spec: as.col_spec("cccnnn"), but I still haven't figured out how to take an existing data frame and generate a concise specification from it. Does that functionality already exist within readr or should it be an "open" feature request?

jimhester · 2019-09-30T12:37:39Z

Yes it does, you need to use the development version of readr.

as.character(readr::as.col_spec(mtcars))
#> [1] "ddddddddddd"
packageVersion("readr")
#> [1] '1.3.1.9000'

Created on 2019-09-30 by the reprex package (v0.3.0)
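
Putting the pieces of this thread together, a round trip might look like the sketch below. It assumes a readr version in which as.col_spec() accepts a data frame (the development version 1.3.1.9000 shown above, or later). The example data, the temporary file path, and the variable names are illustrative only and do not come from the thread.

```r
library(readr)

# Example data (illustrative); any data frame should work the same way.
df <- data.frame(
  id = c("01", "42"),
  n = c(1L, 2L),
  stringsAsFactors = FALSE
)

# Derive the column specification from the data frame itself,
# then collapse it to the concise one-letter-per-column string.
spec <- as.col_spec(df)
compact <- as.character(spec)

# Write the data, then read it back with the saved specification,
# so that "01" stays a character value instead of being guessed as a number.
csv_path <- tempfile(fileext = ".csv")  # illustrative path
write_csv(df, csv_path)
df_back <- read_csv(csv_path, col_types = compact)
```

The concise string is plain text, so it can be stored next to the CSV (for example in a small text file under version control) and passed to col_types by whichever script reads the data later.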

lock · 2020-04-02T14:27:34Z

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

jimhester added the "feature" label (a feature request or enhancement) on Nov 13, 2018

jimhester added this to the backlog milestone Nov 15, 2018

jimhester closed this as completed in fa1d855 May 3, 2019

lock bot locked and limited conversation to collaborators Apr 2, 2020
