Code to reproduce the analyses presented in Goudarzi et al.
Most of the TAD and meQTL data used in the provided Jupyter Notebooks are in the TAD_Data and meQTL_Data directories, respectively. However, the large meQTL data set of 1,236,142 meQTLs with unique rsids across all tumor types (referred to in the code as unique_meqtls.csv or tad_meqtls.csv) exceeded storage limits and is not included. The underlying data are available from Gong et al., "Pancan-meQTL: a database to systematically evaluate the effects of genetic variants on methylation in human cancer". The data set was built by aggregating all cis-meQTLs across all tumor types and then filtering to unique rsids: duplicates were removed, and only the first meQTL with a given rsid was retained. Our analysis used the location, CpG probe, and affected gene for each of these 1,236,142 meQTLs.
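The aggregation and filtering described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code used for the paper. The column names (`rsid`, `probe`, `gene`), the added `tumor_type` field, the function names, and the toy rows are all assumptions made for the example; the actual Pancan-meQTL downloads may use different headers and file formats.

```python
import csv


def read_meqtl_csv(path):
    """Read one per-tumor cis-meQTL table into a list of dict rows.

    Assumes a comma-separated file with a header row (hypothetical layout).
    """
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def aggregate_unique_meqtls(rows_by_tumor):
    """Concatenate per-tumor rows and keep only the first row seen per rsid.

    rows_by_tumor: iterable of (tumor_type, rows) pairs, where rows is a
    list of dicts that each contain an "rsid" key (assumed column name).
    """
    seen = set()
    unique_rows = []
    for tumor_type, rows in rows_by_tumor:
        for row in rows:
            rsid = row["rsid"]
            if rsid in seen:
                continue  # a meQTL with this rsid was already retained
            seen.add(rsid)
            unique_rows.append({**row, "tumor_type": tumor_type})
    return unique_rows


# Toy input with made-up values: rs1 appears in both tumor types.
example = [
    ("BRCA", [{"rsid": "rs1", "probe": "cg01", "gene": "GENE_A"},
              {"rsid": "rs2", "probe": "cg02", "gene": "GENE_B"}]),
    ("LUAD", [{"rsid": "rs1", "probe": "cg03", "gene": "GENE_C"},
              {"rsid": "rs3", "probe": "cg04", "gene": "GENE_D"}]),
]
unique = aggregate_unique_meqtls(example)
```

With real data, each per-tumor file could be loaded with `read_meqtl_csv` and passed in the same way. Note that "first" depends on the order in which the tumor types are concatenated, so the order would need to match the original aggregate to reproduce the same retained rows.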