Shiro Kuriwaki
This repository contains the R code that builds the Cooperative Congressional Election Study (CCES) cumulative file (2006-2022).
Please feel free to file any questions or requests about the cumulative file as GitHub issues.
Start by downloading the .dta, .Rds, or .feather file from the Dataverse page to your computer. This repository does not track the data due to size constraints, but feel free to contact me if you need a newer version that is not yet on Dataverse. The .Rds format can be read directly into R.
dat <- readRDS("cumulative_2006-2022.Rds")
Make sure to load the tidyverse package first. The .Rds file can be handled as a base-R data.frame, but it was built entirely in the tidyverse environment, so using it as a tibble gives full features.
library(tidyverse)
dat
## # A tibble: 617,455 × 103
## year case_id weight weight_cumulative state st cong cong_up
## * <int> <chr> <dbl> <dbl> <int+lbl> <int+l> <fct> <fct>
## 1 2006 439219 1.85 1.67 37 [North Carol… 37 [NC] 109 110
## 2 2006 439224 0.968 0.872 39 [Ohio] 39 [OH] 109 110
## 3 2006 439228 1.59 1.44 34 [New Jersey] 34 [NJ] 109 110
## 4 2006 439237 1.40 1.26 17 [Illinois] 17 [IL] 109 110
## 5 2006 439238 0.903 0.813 36 [New York] 36 [NY] 109 110
## 6 2006 439242 0.839 0.756 48 [Texas] 48 [TX] 109 110
## 7 2006 439251 0.777 0.700 27 [Minnesota] 27 [MN] 109 110
## 8 2006 439254 0.839 0.756 32 [Nevada] 32 [NV] 109 110
## 9 2006 439255 0.331 0.299 48 [Texas] 48 [TX] 109 110
## 10 2006 439263 1.10 0.993 24 [Maryland] 24 [MD] 109 110
## # ℹ 617,445 more rows
## # ℹ 95 more variables: state_post <int+lbl>, st_post <int+lbl>, dist <int>,
## # dist_up <int>, cd <chr>, cd_up <chr>, dist_post <int>, dist_up_post <int>,
## # cd_post <chr>, cd_up_post <chr>, zipcode <chr>, county_fips <chr>,
## # tookpost <int+lbl>, weight_post <dbl>, rvweight <dbl>, rvweight_post <dbl>,
## # starttime <dttm>, pid3 <int+lbl>, pid3_leaner <int+lbl>, pid7 <int+lbl>,
## # ideo5 <fct>, gender <int+lbl>, sex <int+lbl>, gender4 <int+lbl>, …
The Stata .dta file can be read in Stata, or in R through haven::read_dta(). Reading it in R requires the haven package.
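Columns printed as <int+lbl> in the output above are haven labelled vectors: integer codes with value labels attached. The following is a minimal sketch of how such a column can be converted, using a small toy vector rather than the real data. The codes and labels are copied from the printout above; the object names are illustrative only.

```r
library(haven)

# Toy stand-in for a labelled column such as `state`
# (codes and labels taken from the printout above).
state <- labelled(
  c(37L, 39L, 48L),
  labels = c("North Carolina" = 37L, "Ohio" = 39L, "Texas" = 48L)
)

# as_factor() replaces the integer codes with their value labels.
state_fct <- as_factor(state)

# zap_labels() drops the labels and keeps the underlying integer codes.
state_int <- zap_labels(state)
```

Which form is more convenient depends on the task: factors tend to be easier for tables and plots, while the integer codes are easier for merging with other data keyed on the same codes.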
The .feather (Arrow) file can be loaded with arrow::read_feather(). It is currently structured to give the same output as reading the .dta file.
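If only a few columns are needed, arrow::read_feather() accepts a col_select argument that limits which columns are read. The sketch below demonstrates this on a small toy file written to a temporary path; the toy data frame and path are assumptions for illustration, not the real cumulative file.

```r
library(arrow)

# Toy stand-in for the cumulative file, written to a temporary path.
toy <- data.frame(
  year    = c(2006L, 2008L),
  case_id = c("439219", "439224"),
  weight  = c(1.85, 0.968)
)
path <- tempfile(fileext = ".feather")
write_feather(toy, path)

# col_select reads only the named columns from the file.
sub <- read_feather(path, col_select = c("year", "weight"))
```

With the real file, the same call would take the path of the downloaded .feather file in place of the temporary path.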
Each row is a respondent, and each variable is information associated with that respondent. Note that this cumulative dataset extracts only a subset of key variables (103 columns in the output above) from each year’s CCES, which has hundreds of columns.
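Because each row is one respondent, per-year summaries reduce to grouped counts. As a rough sketch, the code below computes weighted shares of a categorical variable by year with dplyr. It uses a toy tibble whose column names (year, pid3, weight) mirror the printout above; the values are made up, and pid3 is simplified to a character column, whereas in the real data it is a labelled integer.

```r
library(dplyr)

# Toy data: one row per respondent, with a survey weight.
toy <- tibble(
  year   = c(2006L, 2006L, 2008L, 2008L),
  pid3   = c("Democrat", "Republican", "Democrat", "Democrat"),
  weight = c(1.5, 0.5, 1.0, 1.0)
)

# Sum the weights within each year-by-category cell, then divide by
# the yearly total to get the weighted share of each category.
shares <- toy %>%
  count(year, pid3, wt = weight, name = "wt_n") %>%
  group_by(year) %>%
  mutate(share = wt_n / sum(wt_n)) %>%
  ungroup()
```

Which weight variable is appropriate for a given analysis is documented in the guide; this sketch only shows the mechanics.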
Most variables in this dataset come straight from each year’s CCES. However, the cumulative file renames and standardizes them so that they are accessible in one place. Please see the guide or the Crunch dataset for a full list and description of these variables.