- Loading MEPS data
- Automating file download
- Saving SAS data (.sas7bdat)
- SAS SURVEY procedures
- SAS examples
For data years 2017 and later (and also for the 2016 Medical Conditions file), .zip files for multiple file formats are available, including ASCII (.dat), SAS V9 (.sas7bdat), Stata (.dta), and Excel (.xlsx). Prior to 2017, ASCII (.dat) and SAS transport (.ssp) files are provided for all datasets.
The recommended file formats are the SAS V9 data files (.sas7bdat) for data years 2017 and later, and the SAS transport (.ssp) format for data years 1996-2016.
The SAS V9 (.sas7bdat) format is the recommended format for loading MEPS data files from 2017 and later (and also for the 2016 Medical Conditions file). For the following example, the 2018 Dental Visits file (h206b.sas7bdat) has been downloaded from the MEPS website, unzipped, and saved in the local directory C:/MEPS:
DATA work.h206b;
SET "C:/MEPS/h206b.sas7bdat";
RUN;
/* View first 10 rows of data */
PROC PRINT data = h206b (obs=10);
RUN;
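As an alternative sketch, the folder can be assigned as a SAS library with a LIBNAME statement, so that the dataset is referenced by a two-level name rather than a quoted path. The libref `meps` is an arbitrary name chosen for illustration; the folder and file are the ones assumed in the example above.

```sas
/* Assign the folder holding h206b.sas7bdat as a library
   (the libref 'meps' is illustrative, not required by MEPS) */
LIBNAME meps "C:/MEPS";

/* List variable names, types, and labels to confirm the file loaded as expected */
PROC CONTENTS data = meps.h206b;
RUN;
```

Reading through a libref avoids copying the file into WORK, which may be convenient for the larger MEPS files.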
For data years prior to 2017, ASCII and SAS transport (XPORT) file formats were released for the MEPS public use files. The SAS transport (.ssp) format is the recommended file type for loading MEPS data files from 1996-2016 (excluding the 2016 Medical Conditions file).
SAS transport (.ssp) files can be read into SAS using PROC XCOPY. In the following example, the SAS transport file for the 2016 Dental Visits file (h188b.ssp) has been downloaded from the MEPS website, unzipped, and saved in the local directory C:\MEPS:
FILENAME in_h188b "C:\MEPS\h188b.ssp";
PROC XCOPY in = in_h188b out = WORK IMPORT;
RUN;
/* View first 10 rows of data */
PROC PRINT data = h188b (obs=10);
RUN;
Instead of having to manually download, unzip, and store MEPS data files in a local directory, it may be beneficial to automatically download MEPS data directly from the MEPS website.
The following code downloads and unzips the 2018 Dental Visits file (h206b) directly from the MEPS website and stores it in the "C:/MEPS/sas_data" folder. This code is adapted from SAS blogs by Chris Hemedinger and a macro created by Pradip Muhuri:
/* You must assign macro variables: MEPS file name, URL, and local directory where files will be stored */
%let meps_file = h206b;
%let meps_url = https://meps.ahrq.gov/mepsweb/data_files/pufs/h206b/h206bv9.zip;
%let meps_dir = C:/MEPS/sas_data;
/* DO NOT EDIT this section *******************************/
/* Download zip file from MEPS website to specified directory (meps_dir) */
filename zipfile "&meps_dir/&meps_file.v9.zip";
proc http
url = "&meps_url"
out = zipfile;
run;
/* Unzip SAS dataset and save in specified directory */
filename inzip ZIP "&meps_dir/&meps_file.v9.zip";
filename sasfile "&meps_dir/&meps_file..sas7bdat" ;
data _null_;
infile inzip(&meps_file..sas7bdat)
lrecl=256 recfm=F length=length eof=eof unbuf;
file sasfile lrecl=256 recfm=N;
input;
put _infile_ $varying256. length;
return;
eof:
stop;
run;
/* End of DO NOT EDIT section ***************************/
/* Read in the saved SAS V9 dataset */
data dn2018;
set "&meps_dir/&meps_file..sas7bdat";
run;
/* View first 5 rows of dataset */
proc print data = dn2018 (obs = 5);
run;
To download additional files programmatically, replace 'h206b' in the above code with the desired file name (see meps_files_names.csv for a list of MEPS file names by data type and year). The full URL for the meps_url macro variable can be found by right-clicking the 'ZIP' hyperlink on the web page for the data file, selecting 'Copy link address', then pasting into a text editor or code editor.
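When several files are needed, the download-and-unzip steps can be wrapped in a macro so that only the file name, URL, and folder change between calls. The following is a minimal sketch, not part of the original code: the macro name `get_meps` and its parameters are hypothetical, the body repeats the steps shown above, and it assumes the target folder already exists and that the zip archive contains a member named after the file (as with h206b).

```sas
/* Hypothetical helper macro (name and parameters are illustrative):
   file = MEPS file name, url = full ZIP link, dir = existing local folder */
%macro get_meps(file=, url=, dir=);

  /* Download the zip file from the MEPS website to the local folder */
  filename zipfile "&dir/&file.v9.zip";
  proc http
    url = "&url"
    out = zipfile;
  run;

  /* Copy the .sas7bdat member out of the zip archive, byte for byte */
  filename inzip ZIP "&dir/&file.v9.zip";
  filename sasfile "&dir/&file..sas7bdat";
  data _null_;
    infile inzip(&file..sas7bdat)
      lrecl=256 recfm=F length=length eof=eof unbuf;
    file sasfile lrecl=256 recfm=N;
    input;
    put _infile_ $varying256. length;
    return;
    eof:
      stop;
  run;

  /* Release the file references so the macro can be called again */
  filename zipfile clear;
  filename inzip clear;
  filename sasfile clear;

%mend get_meps;

/* Example call, using the same file name, URL, and folder as above */
%get_meps(
  file = h206b,
  url  = https://meps.ahrq.gov/mepsweb/data_files/pufs/h206b/h206bv9.zip,
  dir  = C:/MEPS/sas_data
);
```

The URL is passed in explicitly rather than built from the file name, since the link for each file should be copied from its web page as described above.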
Once the MEPS data has been loaded into SAS, it can be saved as a permanent SAS dataset (.sas7bdat). In the following code, the h206b dataset is saved in the 'SAS\data' folder (first create the 'MEPS\SAS\data' folder if needed):
LIBNAME sasdata 'C:\MEPS\SAS\data';
data sasdata.dn2018;
set WORK.h206b;
run;
To analyze MEPS data using SAS, the following steps are recommended to ensure unbiased estimates and proper standard errors (from SAS Global Forum Paper 4113-2020 by David R. Nelson and Siew Wong-Jacobson):
- Always use the SAS SURVEY procedures (e.g. SURVEYMEANS, SURVEYREG)
- Always use the cluster (e.g. VARPSU), strata (e.g. VARSTR), and appropriate weights (e.g. PERWT18F)
- Do not delete observations or use BY or WHERE statements to restrict the analysis to a subgroup. Instead, create an indicator variable for the analytical subset and use it in a DOMAIN statement; analyzing the subgroup alone may affect the standard errors.
As an example, the following code estimates total expenditures for dental visits in 2018, using the 2018 Dental Visits file (h206b) loaded earlier:
proc surveymeans data = h206b sum;
stratum VARSTR;
cluster VARPSU;
weight PERWT18F;
var DVXP18X;
run;
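The DOMAIN recommendation in the list above can be sketched as follows. Rather than subsetting the data, an indicator variable marks the subgroup and is passed to the DOMAIN statement, so the full sample design is retained for variance estimation. This sketch makes several assumptions that are not in the original text: the 2018 Full-Year Consolidated file has already been loaded as WORK.fyc2018 (an illustrative dataset name), it contains the variables AGELAST and TOTEXP18, and the indicator `age65plus` is a made-up name.

```sas
/* Assumes the 2018 Full-Year Consolidated file has been loaded as WORK.fyc2018
   (illustrative dataset name) and contains AGELAST and TOTEXP18 */
data fyc2018_dom;
  set fyc2018;
  /* Flag the subgroup instead of deleting the other records:
     1 = age 65 or older, 0 = everyone else */
  age65plus = (AGELAST >= 65);
run;

proc surveymeans data = fyc2018_dom mean sum;
  stratum VARSTR;
  cluster VARPSU;
  weight PERWT18F;
  var TOTEXP18;
  domain age65plus;  /* estimates are reported for each level of age65plus */
run;
```

All records stay in the input dataset, and the domain output gives the estimates for each level of the indicator.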
To run the example code, you must first download the relevant MEPS files from the MEPS website and save them to your local computer.
Example code from previous MEPS workshops and webinars is provided in the workshop_exercises folder. Each exercise contains three files: SAS code (e.g. Exercise1.sas), a SAS log file (e.g. Exercise1_log.TXT), and a SAS output file (e.g. Exercise1_OUTPUT.TXT):
exercise_1a: National health care expenses by age group, 2016
exercise_1b: National health care expenses by age group and type of service, 2015
exercise_1c: National health care expenses by age group, 2018
exercise_2a: Trends in antipsychotics purchases and expenses, 2015
exercise_2b: Purchases and expenses for narcotic analgesics or narcotic analgesic combos, 2016
exercise_2c: Purchases and expenses for narcotic analgesics or narcotic analgesic combos, 2018
exercise_3a: Use and expenditures for persons with diabetes, 2015
exercise_3b: Expenditures for all events associated with diabetes, 2015
exercise_4a: Pooling MEPS FYC files, 2015 and 2016: Out-of-pocket expenditures for uninsured persons ages 26-30 with high income
exercise_4b: Pooling longitudinal files, panels 17-19
exercise_4c: Pooling MEPS FYC files, 2017 and 2018: People with joint pain, using JTPAIN31 for 2017 and JTPAIN31_M18 for 2018
exercise_4d: Pooling MEPS FYC files, 2017-2019: People with joint pain, using Pooled Linkage Variance file for correct standard error calculation (required when pooling before and after 2019)
exercise_5a: Constructing family-level variables from person-level data, 2015
exercise_5b: Constructing insurance status from monthly insurance variables, 2015
exercise_6a: Logistic regression to identify demographic factors associated with receiving a flu shot in 2018 (using SAQ population)
exercise_6b: Logistic regression for persons that delayed medical care because of COVID, 2020
cond_pmed_2020.sas: Utilization and expenditures for prescribed medicine purchases for hyperlipidemia, 2020
cond_mv_2020.sas: Utilization and expenditures for office-based visits for mental health, 2020
The following programs, provided in the summary_tables_examples folder, re-create selected statistics from the MEPS-HC Data Tools. These programs are written under the assumption that the .ssp files are saved in the local directory "C:/MEPS/". However, you can customize the programs to point to an alternate directory.
care_access_2017.sas: Reasons for difficulty receiving needed care, by poverty status, 2017
care_access_2019.sas: Number and percent of people who did not receive treatment because they couldn't afford it, by poverty status, 2019
care_diabetes_a1c_2016.sas: Adults with diabetes receiving hemoglobin A1c blood test, by race/ethnicity, 2016
care_quality_2016.sas: Ability to schedule a routine appointment, by insurance coverage, 2016
cond_expenditures_2015.sas: Utilization and expenditures by medical condition, 2015 -- Conditions defined by collapsed ICD-9/CCS codes
cond_expenditures_2018.sas: Utilization and expenditures by medical condition, 2018 -- Conditions defined by collapsed ICD-10/CCSR codes
ins_age_2016.sas: Health insurance coverage by age group, 2016
pmed_prescribed_drug_2016.sas: Purchases and expenditures by generic drug name, 2016
pmed_therapeutic_class_2016.sas: Purchases and expenditures by Multum therapeutic class, 2016
use_events_2016.sas: Number of events and mean expenditure per event, for office-based and outpatient events, by source of payment, 2016
use_expenditures_2016.sas: Expenditures for office-based and outpatient visits, by source of payment, 2016
use_expenditures_2019.sas: Mean expenditure per person, by event type and source of payment, 2019
use_race_sex_2016.sas: Utilization and expenditures by race and sex, 2016
The older_exercises_1996_to_2006 folder contains older SAS programs for analyzing earlier years of MEPS data. Each folder includes a SAS program (.sas) and SAS output (.pdf):
E1: Person-level estimates (means, proportions, and totals) for healthcare expenditures, 2001
E2: Average total healthcare expenditures for children ages 0-5, 1996-1999
E3: Longitudinal estimates of insurance coverage and expenditures, 1999-2000
E4: Family-level estimates for healthcare expenditures, 2001
E5: Event-level expenditure estimates for hospital inpatient stays and office-based medical provider visits, 2001
E6: National health care expenditures by type of service, 2005 (Statistical Brief #193)
E7: Colonoscopy screening estimates, 2005 (Statistical Brief #188)
E8: Expenditures for inpatient stays by source of payment, per stay, per diem, with and without surgery, 2005
EM1: Relationship between health status and current main job weekly earnings, 2002
EM2: Determine how many people working at the beginning of the year changed jobs, 2002
L1: Merge the 2001 MEPS full-year file and the 2001 MEPS Jobs file
L1A: Combine the 2000 and 2001 MEPS Jobs files
L2: Link 2001 MEPS data with 1999 and 2000 NHIS data
L3: Merge 2001 MEPS Office-based Medical Provider Visits file with full-year file
L4: Merge 2001 MEPS Medical Conditions file with full-year file
L5: Merge 2001 MEPS Medical Conditions file with full-year file and various event files
M1: Demonstrates need for weight variables when analyzing MEPS data, 2005
M2: Demonstrates need for using the STRATUM and PSU variables when analyzing MEPS data, 2005
M3: Using ID variables to merge MEPS files, 2005
M4: Illustrates two ways to calculate the number of events associated with conditions: (1) using the evNUM variables on the CONDITIONS file, and (2) using the number of matches between the CONDITIONS file and the CLINK file, 2003
M5: Demonstrates the difference between two uses of the term "priority condition" in MEPS, 2005
M6: Demonstrates use of the Diabetes Care Supplement (DCS) weight variable, 2006
M7: Person-level prescribed medicine expenditures for persons with at least one PMED event, 2003
M8: Prescribed medicine expenditures associated with cancer conditions, 2005
M9: Descriptive statistics of health insurance status and healthcare utilization, 2005
M10: Compares hospital inpatient expenditures (facility, physician, total) for stays that do and do not include facility expenditures for the preceding emergency room visit, 2003
M11: Merge parents' employment status variable to children's records, 2006