Best way to deal with warnings #328

Closed
ldecicco-USGS opened this issue Dec 3, 2015 · 2 comments

Comments

@ldecicco-USGS

I have many data sets where a column contains only integers for the first 1000+ rows, so readr guesses integer, but later rows turn out to hold non-integer numeric values. I'm trying to come up with a nice way to handle that, but I'm stuck on how to use the cols_only feature dynamically (and maybe that's just not the right way to go anyway):

readr.data <- suppressWarnings(read_delim(doc, skip = 2, delim="\t", col_names = FALSE))
badCols <- problems(readr.data)$col
badCols
[1] "X4" "X4"
unique.bad.cols <- unique(badCols)

readr.cols <- read_delim(doc, skip =2, delim="\t",col_names = FALSE, 
                               col_types = cols_only(unique.bad.cols = "n"))
Warning message:
The following named parsers don't match the column names: unique.bad.cols 

#I've also tried:
readr.cols <- read_delim(doc, skip = 2, delim="\t",col_names = FALSE, 
                               col_types = cols_only(setNames("n",unique.bad.cols)))
Error: not compatible with STRSXP
In addition: Warning message:
Unnamed `col_types` should have the same length as `col_names`. Using smaller of the two.

#Other attempts, re-grabbing and parsing all the data instead of just the bad columns:
#a: would like to use the variable unique.bad.cols in col_double
readr.data <- read_delim(doc, skip = 2, delim="\t",col_names = FALSE, 
                                col_types = col_double(unique.bad.cols))
Error in col_double(unique.bad.cols) : unused argument (unique.bad.cols)

#b: use setNames
readr.data <- read_delim(doc, skip = 2, delim="\t",col_names = FALSE, 
                         col_types = cols(setNames("n",unique.bad.cols)))
Error: not compatible with STRSXP
In addition: Warning message:
Unnamed `col_types` should have the same length as `col_names`. Using smaller of the two. 

#This works just fine, but it won't always be "X4":
readr.cols <- read_delim(doc, skip = 2,delim="\t",col_names = FALSE, 
                          col_types = cols_only("X4"="n"))

After more digging, I realize I had assumed:
cols_only("X4"="n")
was the same as
cols_only(c("X4"="n"))
which it is not: the first passes a named argument X4, while the second passes a single unnamed character vector (hence the setNames function doesn't help).

Any advice would be greatly appreciated. Thanks for the fantastic package!
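
Because cols_only() takes each column as its own named argument, one possible way to supply names held in a variable is to build a named list and splice it in with do.call(). This is a minimal sketch, not a confirmed readr recommendation; the temp file and its contents are made up to stand in for doc:

```r
library(readr)

# Hypothetical stand-in for `doc`: two lines to skip, then tab-delimited
# rows whose fourth column only becomes non-integer in a later row.
doc <- tempfile(fileext = ".tsv")
writeLines(c("skipped line 1", "skipped line 2",
             "a\t1\t2\t3", "b\t4\t5\t6.5"), doc)

unique.bad.cols <- c("X4")

# One "n" per bad column, named by column; as.list() turns the named vector
# into a named list so do.call() passes each element as a named argument,
# i.e. the same call as cols_only(X4 = "n").
bad.types <- as.list(setNames(rep("n", length(unique.bad.cols)), unique.bad.cols))
spec <- do.call(cols_only, bad.types)

readr.cols <- read_delim(doc, skip = 2, delim = "\t", col_names = FALSE,
                         col_types = spec)
```

The same named list should also work with cols() in place of cols_only() if the remaining columns are to be kept and guessed as usual.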

@ldecicco-USGS
Author

Here's what I ended up doing. It seems to work, but I don't want to miss something obvious in readr that would do the same thing:

    readr.data <- suppressWarnings(read_delim(doc, skip = 2, delim="\t", col_names = FALSE))
    badCols <- problems(readr.data)$col
    if(length(badCols) > 0){
      unique.bad.cols <- unique(badCols)
      readr.data.char <- read_delim(doc, skip = 2, delim="\t", col_names = FALSE, 
                                    col_types = cols(.default = "c"))
      readr.data[,unique.bad.cols] <- lapply(readr.data.char[,unique.bad.cols], parse_number)
    }
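
A possible variation on the workaround above, sketched under the assumption that type_convert() guesses each character column's type from all of its values rather than only the first rows: read everything as character once, then let readr re-guess. The temp file is made up to stand in for doc:

```r
library(readr)

# Hypothetical stand-in for `doc`, as in the issue: two lines to skip,
# then tab-delimited rows with a late non-integer value in column 4.
doc <- tempfile(fileext = ".tsv")
writeLines(c("skipped line 1", "skipped line 2",
             "a\t1\t2\t3", "b\t4\t5\t6.5"), doc)

# Read every column as character so no value is lost to a wrong guess.
readr.data.char <- read_delim(doc, skip = 2, delim = "\t", col_names = FALSE,
                              col_types = cols(.default = "c"))

# Re-guess the column types from the character data; suppressMessages()
# only hides the column specification that type_convert() prints.
readr.data <- suppressMessages(type_convert(readr.data.char))
```

This avoids the second pass through problems(), at the cost of parsing every column twice.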

@hadley
Member

hadley commented Jun 2, 2016

I think #401 will solve this problem

@hadley hadley closed this as completed Jun 2, 2016
@lock lock bot locked and limited conversation to collaborators Sep 25, 2018