Extract fails for EGIR01DT #81

Open
aftollefsen opened this issue Mar 7, 2019 · 2 comments
Labels
parser exceptions odd files, and parser errors

Comments

@aftollefsen

Great package. I have some issues when trying to extract a bunch of IR files with the option add_geo=TRUE. The goal of the script is to download all IR surveys for Africa and merge these.

Here is my code:

# Extract list of DHS countries
countries <- dhs_countries()

# Identify the subregions whose name matches "africa"
uniqueregions <- unique(countries$SubregionName)
africaregions <- uniqueregions[agrep("africa",uniqueregions)]

# Subset countries data frame with only africa
subcountries <- countries[which(countries$SubregionName %in% africaregions),]

# Identify surveys that match Individual Recode (IR) for the selected countries
hrdatasets <- dhs_datasets(fileType = "IR",countryIds = subcountries$DHS_CountryCode,fileFormat = "STATA")

# Download datasets matching the above query
hrdownloads <- get_datasets(dataset_filenames = hrdatasets$FileName)

# Search within downloaded surveys if variables exist
vars <- search_variables(names(hrdownloads), variables = c("v190","v012","v106","v151"))

# Extract DHS surveys with only variables selected in the previous step
extract <- extract_dhs(questions = vars, add_geo = TRUE)

This starts, but then fails:
Starting Survey 1 out of 182 surveys:AOIR51dt
Starting Survey 2 out of 182 surveys:AOIR62DT
.
.
.
Starting Survey 31 out of 182 surveys:CDIR61DT
Starting Survey 32 out of 182 surveys:CIIR35DT
Starting Survey 33 out of 182 surveys:CIIR3ADT
Starting Survey 34 out of 182 surveys:CIIR50DT
Starting Survey 35 out of 182 surveys:CIIR62DT
Starting Survey 36 out of 182 surveys:EGIR01DT
Error in vapply(r, attr, character(1), "label", exact = TRUE) :
values must be length 1,
but FUN(X[[1]]) result is length 0

# Convert list of data.frames into one single data.frame using dplyr
library(dplyr)
dhsdf <- bind_rows(extract, .id = "survey")

@aftollefsen
Author

Same for
Starting Survey 44 out of 170 surveys:GHIR02DT
Error in vapply(r, attr, character(1), "label", exact = TRUE) :
values must be length 1,
but FUN(X[[70]]) result is length 0

@OJWatson
Collaborator

Hi there,

Thanks for flagging this up and sorry for the slow reply. This issue is mainly due to the earliest recodes having slightly odd Stata files, so that when we read the data in we don't correctly grab the labels attached to the data. @jeffeaton and I have been chatting about this and will look into getting it working for the Stata files.
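
To illustrate the failure mode described above, here is a minimal base R sketch, not the rdhs internals. It assumes a toy list `r` in which one column has lost its "label" attribute, and it mirrors the `vapply()` call shown in the error message. The `get_label()` helper is hypothetical and only shows one way such a call could tolerate missing labels.

```r
# Toy stand-in for a survey read from an odd Stata file (assumed data):
# v012 carries a "label" attribute, v190 does not.
r <- list(
  v012 = structure(c(23, 31), label = "Respondent's current age"),
  v190 = c(1, 5)
)

# attr() returns NULL (length 0) for v190, which violates the
# character(1) template, so vapply() stops with an error.
res <- tryCatch(
  vapply(r, attr, character(1), "label", exact = TRUE),
  error = function(e) conditionMessage(e)
)
print(res)

# Hypothetical tolerant helper: fall back to NA when no label is attached.
get_label <- function(x) {
  lab <- attr(x, "label", exact = TRUE)
  if (is.null(lab)) NA_character_ else as.character(lab)
}
labels <- vapply(r, get_label, character(1))
print(labels)
```

In this sketch the unlabelled column simply gets an `NA` label instead of aborting the whole extraction.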

However, if you download those datasets using the flat file format, then those 2 survey datasets work. We would generally recommend always downloading the flat files as they are more reliable. I think there are only 6 surveys that don't work as flat downloads (see #5), and we are working on a workaround for those.

So just adapt the code you provided above with the following, and hopefully you should be all okay (:crossed_fingers:):

# Identify surveys that match Individual Recode (IR) for the selected countries, using the flat file format
hrdatasets <- dhs_datasets(fileType = "IR",countryIds = subcountries$DHS_CountryCode,fileFormat = "flat")

@OJWatson OJWatson added the parser exceptions odd files, and parser errors label Aug 19, 2019