kuriwaki/cces_cumulative

CCES Cumulative File

Shiro Kuriwaki

This repository contains the R code that builds the Cooperative Congressional Election Study (CCES) cumulative file (2006-2022).

Please feel free to file questions or requests about the cumulative file as GitHub issues.

Getting Started

Start by downloading the .dta, .Rds, or .feather file from the Dataverse page to your computer. This repository does not track the data due to size constraints, but feel free to contact me if you need the newest version not yet on Dataverse. The .Rds file can be read into R with readRDS().

dat <- readRDS("cumulative_2006-2022.Rds")

Load the tidyverse package before working with the data. The .Rds file can be handled as a base-R data.frame, but it was built entirely in the tidyverse, so treating it as a tibble gives the full set of features.

library(tidyverse)
dat
## # A tibble: 617,455 × 103
##     year case_id weight weight_cumulative state            st      cong  cong_up
##  * <int> <chr>    <dbl>             <dbl> <int+lbl>        <int+l> <fct> <fct>  
##  1  2006 439219   1.85              1.67  37 [North Carol… 37 [NC] 109   110    
##  2  2006 439224   0.968             0.872 39 [Ohio]        39 [OH] 109   110    
##  3  2006 439228   1.59              1.44  34 [New Jersey]  34 [NJ] 109   110    
##  4  2006 439237   1.40              1.26  17 [Illinois]    17 [IL] 109   110    
##  5  2006 439238   0.903             0.813 36 [New York]    36 [NY] 109   110    
##  6  2006 439242   0.839             0.756 48 [Texas]       48 [TX] 109   110    
##  7  2006 439251   0.777             0.700 27 [Minnesota]   27 [MN] 109   110    
##  8  2006 439254   0.839             0.756 32 [Nevada]      32 [NV] 109   110    
##  9  2006 439255   0.331             0.299 48 [Texas]       48 [TX] 109   110    
## 10  2006 439263   1.10              0.993 24 [Maryland]    24 [MD] 109   110    
## # ℹ 617,445 more rows
## # ℹ 95 more variables: state_post <int+lbl>, st_post <int+lbl>, dist <int>,
## #   dist_up <int>, cd <chr>, cd_up <chr>, dist_post <int>, dist_up_post <int>,
## #   cd_post <chr>, cd_up_post <chr>, zipcode <chr>, county_fips <chr>,
## #   tookpost <int+lbl>, weight_post <dbl>, rvweight <dbl>, rvweight_post <dbl>,
## #   starttime <dttm>, pid3 <int+lbl>, pid3_leaner <int+lbl>, pid7 <int+lbl>,
## #   ideo5 <fct>, gender <int+lbl>, sex <int+lbl>, gender4 <int+lbl>, …

The Stata .dta file can be opened in Stata, or read into R with haven::read_dta(). This requires the haven package to be installed.
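The following is a minimal sketch of how labelled columns (shown as `<int+lbl>` in the printout above) behave when read with haven. It does not use the real cumulative file: `toy` is a hypothetical two-row stand-in written to a temporary .dta file, with codes and labels copied from the printout.

```r
library(haven)

# Hypothetical stand-in for the cumulative file; values copied from the printout
toy <- data.frame(year = c(2006L, 2006L), case_id = c("439219", "439224"))
toy$st <- labelled(c(37L, 39L), labels = c(NC = 37L, OH = 39L))

# Round-trip through a temporary .dta file
path <- tempfile(fileext = ".dta")
write_dta(toy, path)
dat_dta <- read_dta(path)

# Labelled columns keep the numeric codes; as_factor() swaps in the labels
st_labels <- as.character(as_factor(dat_dta$st))
```

With the real data, the same `as_factor()` call should apply to any labelled column, such as `state` or `pid3`.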

The .feather files can be loaded with arrow::read_feather(). They are currently structured to give the same output as reading the .dta file.

Each row is a respondent, and each variable is information associated with that respondent. Note that this cumulative dataset extracts only a selection of key variables (103 columns in the printout above) from each year’s CCES, which has hundreds of columns.
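Because each row is one respondent, tabulations are typically sums of weights rather than row counts. The sketch below illustrates this on a hypothetical toy tibble built from the ten (rounded) rows printed above. Using `weight` here is an assumption for illustration; which weight column is appropriate depends on the analysis.

```r
library(dplyr)

# Toy data: the ten respondents from the printout above (rounded weights)
toy <- tibble(
  year = 2006L,
  st = c("NC", "OH", "NJ", "IL", "NY", "TX", "MN", "NV", "TX", "MD"),
  weight = c(1.85, 0.968, 1.59, 1.40, 0.903, 0.839, 0.777, 0.839, 0.331, 1.10)
)

# Weighted number of respondents per year and state
by_state <- toy %>%
  count(year, st, wt = weight, name = "weighted_n") %>%
  arrange(desc(weighted_n))
```

With the full file, the same `count()` call would run on `dat` directly.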

What’s New

Unified Variable Names

Most variables in this dataset come straight from each year’s CCES. However, the cumulative file renames and standardizes them, making them accessible in one place. Please see the guide or the Crunch dataset for a full list and description of these variables.
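As a rough illustration of what standardizing names across years involves, the sketch below renames two per-year extracts to a common set of names and stacks them. This is not the repository's actual build code: the raw column names (`caseid_a`, `party_a`, and so on) and the lookup vectors are invented for the example.

```r
library(dplyr)

# Hypothetical per-year extracts; the raw column names are invented
raw_2006 <- tibble(caseid_a = c("1", "2"), party_a = c(1L, 2L))
raw_2008 <- tibble(caseid_b = "3", party_b = 3L)

# Lookup vectors: standardized name = raw name (also hypothetical)
key_2006 <- c(case_id = "caseid_a", pid3 = "party_a")
key_2008 <- c(case_id = "caseid_b", pid3 = "party_b")

# Rename each year to the common names and record the year
std_2006 <- raw_2006 %>% rename(all_of(key_2006)) %>% mutate(year = 2006L)
std_2008 <- raw_2008 %>% rename(all_of(key_2008)) %>% mutate(year = 2008L)

# Stack into one respondent-level table with unified columns
cumulative <- bind_rows(std_2006, std_2008) %>%
  select(year, case_id, pid3)
```

Keeping the name mapping in a lookup vector per year means a renamed source column only needs one change to its lookup entry.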

Candidate Names and Identifiers