downloads data but gets columns wrong #6

ellisp · 2017-10-26T02:47:06Z

For example, it mixes id and date together, and likewise cause and location. It doesn't recognise them as separate columns:

res <- search_data("name:fire", limit = 20)
res %>% filter(can_use == "yes") %>% slice(3) %>% get_data %>% View

ellisp · 2017-10-26T03:58:27Z

This looks tricky. Basically rio::import() doesn't parse everything as well as it should. To help, I've let the user add arguments to show_data that go through to import(), but in the end, not all data is easy to import...

HughParsonage · 2017-10-26T04:26:33Z

Yes, that particular file is quite strange. Every second row is blank, and the download process has to attempt it multiple times.

Also, data.table complains, not unreasonably, about this line:

856647,2009-05-01 04:34:00,20 ADELAIDE,FIP - NORMAL ON ARRIVAL, LINE FAULT/OPEN LINE

The comma after NORMAL ON ARRIVAL is part of the field, not a separator. But the field isn't quoted. readr::read_csv doesn't error, but also discards information (with stern warnings). base::read.csv gets the closest, but would require manual work.
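
To illustrate the kind of manual work involved, here is a minimal base R sketch. It assumes the file has exactly four columns (id, date, location, cause) and that stray commas only ever occur in the last one; neither assumption has been checked against the full file. `split_first_n` is a hypothetical helper, not part of datagovau, rio or data.table.

```r
# Hypothetical helper: split a line into at most `n_fields` comma-separated
# fields, leaving any further commas inside the last field.
# Caveat: strsplit() drops a trailing empty field, so "a,b," yields two parts.
split_first_n <- function(line, n_fields) {
  parts <- strsplit(line, ",", fixed = TRUE)[[1]]
  if (length(parts) <= n_fields) {
    return(parts)
  }
  head_fields <- parts[seq_len(n_fields - 1)]
  # Re-join everything from field n onwards, restoring the commas.
  tail_field <- paste(parts[n_fields:length(parts)], collapse = ",")
  c(head_fields, tail_field)
}

line <- "856647,2009-05-01 04:34:00,20 ADELAIDE,FIP - NORMAL ON ARRIVAL, LINE FAULT/OPEN LINE"
fields <- split_first_n(line, n_fields = 4)  # assumed column count
print(fields)
```

This only works when the unquoted commas are confined to the final column. If they can occur in an earlier column, the split is ambiguous and the rows need inspecting by hand.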

HughParsonage · 2017-10-26T04:34:07Z

Perhaps the best we can do is gracefully error in cases like this, with a prayer to the end-user wishing the best of luck manually parsing it.
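
A rough sketch of what erroring gracefully could look like, using only base R. `import_or_explain` is a hypothetical wrapper, not an existing datagovau function, and the `importer` argument stands in for whatever the package actually calls.

```r
# Hypothetical wrapper: run an importer and, if it throws an error, re-raise
# it with a message that names the file and suggests manual parsing.
# Warnings (e.g. rows silently dropped) are NOT intercepted here.
import_or_explain <- function(path, importer = utils::read.csv, ...) {
  tryCatch(
    importer(path, ...),
    error = function(e) {
      stop(
        "Could not parse '", path, "' automatically: ", conditionMessage(e),
        "\nThe file may need to be parsed manually.",
        call. = FALSE
      )
    }
  )
}

# A well-formed file is imported as usual.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,cause", "1,fire"), tmp)
ok <- import_or_explain(tmp)

# A failing importer produces the friendlier message instead.
msg <- tryCatch(
  import_or_explain(tmp, importer = function(path, ...) stop("unexpected field count")),
  error = function(e) conditionMessage(e)
)
cat(msg, "\n")
```

Catching only errors is a deliberate simplification. An importer that drops information with warnings would still return a result here, so a stricter version might also promote warnings to errors.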

ellisp · 2017-10-27T00:20:34Z

This happens quite often (unsurprisingly) when the data are in Excel format. For example:

library(datagovau)
library(dplyr)
#----------------------business income------------------------
# an example of a dataset that doesn't import well - probably because it is in Excel


income_md <- search_data("name:income", limit = 1000)

business <- income_md %>%
  filter(name == "Business income by entity, state, industry and size for 2013-14 income year") %>%
  get_data()

business

#----------------------queensland and australian income----
qld <- income_md %>%
  filter(name == "Income of Qld and Aust 1999-2000 to 2013-14") %>%
  get_data()

qld

ellisp added the wontfix label Oct 26, 2017
