Skip to content

Tools for compiling multiple waves of Understanding Society into one tidy data frame.

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md
Notifications You must be signed in to change notification settings

wklimowicz/tidyusoc

Repository files navigation

tidyusoc

R-CMD-check

Installation

You can install the development version of tidyusoc from GitHub with:

# install.packages("devtools")
devtools::install_github("wklimowicz/tidyusoc")

Setup

After downloading the Understanding Society Data from the UK Data Service website, run:

library(tidyusoc)

# Convert to Rds to save space and time
usoc_convert(usoc_directory = "spss/spss25",
             new_directory = "rds",
             filter_files = "indresp")

# Compile the INDRESP files in the rds folder
usoc <- usoc_compile(directory = "rds")

A small set of core variables is included by default:

Variable Description
pidp Personal ID
age Age
race Race/Ethnicity
sex Sex
country Country
gor_dv Government Office Region
qfhigh_dv Highest Qualification (Detailed, UKHLS Only)
hiqual_dv Highest Qualification
jbstat Job Status
jbsoc00_cc Occupation
fimnlabgrs_dv Labour Income

To add more variables, use the extra_mappings function. For variables that change over time use pick_var. For example, life satisfaction is:

  • lfsato in waves 6-10 and 12-18 of the BHPS.
  • sclfsato in waves 1-11 of the UKHLS.

The variable BMI (bmi_dv) doesn’t change over time, so it’s passed as a string.

custom_mappings <- function(cols) {

    life_sat <- pick_var(c("sclfsato", "lfsato"), cols)

    custom_variables <- tibble::tribble(
        ~usoc_name, ~new_name, ~type,
        life_sat, "life_satisfaction", "factor",
        "bmi_dv", "bmi_dv", "numeric"
    )

    return(custom_variables)

}

usoc <- usoc_compile("rds", extra_mappings = custom_mappings)

A good place to start looking for variables is the Understanding Society Variable Search, and the Documentation.

Compiling different surveys.

By default, these scripts compile the indresp file - you can change this with the file argument. For example, to compile the youth response files:

usoc_convert(usoc_directory = "spss/spss25",
             new_directory = "rds",
             filter_files = "youth")

usoc <- usoc_compile(directory = "rds", file = "youth")

Using the data across multiple projects.

Adding an environment variable called DATA_DIRECTORY pointing at a folder, and using the save_to_folder argument saves an fst of the final dataset, so it can be loaded from anywhere.

library(tidyusoc)

usoc_compile(directory = "rds", save_to_folder = TRUE)

usoc <- usoc_load()

To read any of the other compiled files change the file argument:

usoc <- usoc_load(file = "youth")

About

Tools for compiling multiple waves of Understanding Society into one tidy data frame.

Resources

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md

Stars

Watchers

Forks