Update CI testing data to include methyl matrices and gatk cn subsets #310
@ewafula I downloaded the testing files and all md5sums check out. Merging dev back in - could have been a wonky thing with GA, so let's see what happens with a rerun.
…to methyl-subset-matrices
I realized I did not push these comments..
analyses/create-subset-files/00-enrich-methyl-rnaseq-examples.Rmd
CN modules are down to ~5 minutes 🎉, but some other checks failed.
Co-authored-by: Jo Lynne Rokita <jharenza@gmail.com>
🚀
Purpose/implementation Section
What scientific question is your analysis addressing?
Update the create-subset-files module to create the methylation subset beta-values (methyl-beta-values.rds), m-values (methyl-m-values.rds), cn-values (methyl-cn-values.rds), and cnv-gatk (cnv-gatk.seg.gz) files for the CI testing data.
What was your approach?
1). Created a script to select a list of sample IDs from all three methylation matrices to create the CI testing subset datasets
2). Using the histologies file and the independent samples lists, randomly selected 5 methylation sample IDs each from the 850K (CBTN) and 450K (TARGET) arrays, plus the corresponding RNA-Seq samples for patients who have both data types
3). Included the selected 10 methylation sample IDs in the subset datasets to enrich for methylation samples
4). Included the selected 10 RNA-Seq sample IDs in the subset datasets to enrich for RNA-Seq samples with methylation data
5). Updated the copy_number_consensus_call and methylation-summary modules to read input data files from the release data/ directory
6). Uploaded the updated CI testing subset files to the s3 bucket: s3://d3b-openaccess-us-east-1-prd-pbta/open-targets/testing/
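The selection in step 2 can be sketched roughly as follows. This is a minimal Python illustration of the idea only, not the code in 00-enrich-methyl-rnaseq-examples.Rmd (which is an R Markdown notebook). The field names (patient_id, sample_id, cohort, experimental_strategy), the select_paired_samples helper, the seed, and the toy table are all assumptions made for the example, and the independent samples lists are approximated by keeping one sample per patient.

```python
import random


def select_paired_samples(histologies, cohort, n=5, seed=2023):
    """Pick up to n patients in a cohort who have both methylation and RNA-Seq samples.

    Returns (methylation sample IDs, matching RNA-Seq sample IDs).
    Field names are hypothetical, not the real histologies columns.
    """
    methyl, rnaseq = {}, {}
    for row in histologies:
        if row["cohort"] != cohort:
            continue
        if row["experimental_strategy"] == "Methylation":
            methyl.setdefault(row["patient_id"], []).append(row["sample_id"])
        elif row["experimental_strategy"] == "RNA-Seq":
            rnaseq.setdefault(row["patient_id"], []).append(row["sample_id"])

    # Only patients with both data types are eligible.
    paired = sorted(set(methyl) & set(rnaseq))

    # A seeded generator keeps the random pick reproducible across runs.
    rng = random.Random(seed)
    patients = rng.sample(paired, min(n, len(paired)))

    # Keep one sample per patient (stand-in for the independent samples lists).
    methyl_ids = [min(methyl[p]) for p in patients]
    rnaseq_ids = [min(rnaseq[p]) for p in patients]
    return methyl_ids, rnaseq_ids


# Toy histologies table: 8 patients per cohort, two of which lack RNA-Seq.
histologies = []
for cohort in ("CBTN", "TARGET"):
    for i in range(8):
        pid = f"{cohort}-P{i}"
        histologies.append({"patient_id": pid, "sample_id": f"{pid}-methyl",
                            "cohort": cohort, "experimental_strategy": "Methylation"})
        if i % 4 != 3:
            histologies.append({"patient_id": pid, "sample_id": f"{pid}-rna",
                                "cohort": cohort, "experimental_strategy": "RNA-Seq"})

methyl_ids, rnaseq_ids = [], []
for cohort in ("CBTN", "TARGET"):
    m, r = select_paired_samples(histologies, cohort, n=5)
    methyl_ids += m
    rnaseq_ids += r
```

With 5 patients drawn per cohort, this yields the 10 methylation IDs and 10 RNA-Seq IDs that steps 3 and 4 add to the subset datasets.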
What GitHub issue does your pull request address?
Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.
Is there anything that you want to discuss further?
NA
Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?
YES
Results
What types of results are included (e.g., table, figure)?
CI test subset data files
What is your summary of the results?
d3b-center/ticket-tracker-OPC#493
Reproducibility Checklist
Documentation Checklist
README and it is up to date.
analyses/README.md and the entry is up to date.