Add two samples' genomic data back to HOPE cohort #104

Open
jharenza opened this issue Jun 18, 2024 · 3 comments · May be fixed by #116

@jharenza
Member

Add 7316-1106 and 7316-3000 back to the HOPE subcohort, except for proteomics. Use the attached file from Pei.
master_11032023 (1).txt

@komalsrathi
Collaborator

komalsrathi commented Jun 20, 2024

I have added the two missing samples to the histology file (see attached). I can upload this file to s3://d3b-openaccess-us-east-1-prd-pbta/hope-aya/v3/Hope-GBM-histologies-base.tsv. I think this is the only file I need to create the merged matrices with this module: https://github.com/d3b-center/hope-cohort-analysis/tree/master/analyses/merge-files. GitHub does not allow TSV uploads, so I zipped it:
Hope-GBM-histologies-base.tsv.zip

@jharenza
Member Author

This looks good. Yes, you can add it to the v3 folder, thank you.

@komalsrathi
Collaborator

I have updated the following merged files and uploaded them to S3:

results
├── Hope-cnv-controlfreec-tumor-only.rds
├── Hope-cnv-controlfreec.rds
├── Hope-fusion-putative-oncogenic.rds
├── Hope-gene-counts-rsem-expected_count-collapsed.rds
├── Hope-gene-counts-rsem-expected_count.rds
├── Hope-gene-expression-rsem-tpm-collapsed.rds
├── Hope-gene-expression-rsem-tpm.rds
├── Hope-snv-consensus-plus-hotspots.maf.tsv.gz
├── Hope-tumor-only-snv-mutect2.maf.tsv.gz
└── md5sum.txt

In md5sum.txt, I have only updated the md5sums for the above files (those generated by my merge script).
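
For anyone who wants to double-check the checksums locally, here is a minimal sketch of how they could be recomputed in R. The helper name `md5_table` and the file pattern are assumptions made for illustration; this is not the code in the merge script, and the layout of md5sum.txt itself is not assumed here.

```r
# Recompute md5sums for the merged outputs in one directory.
# `md5_table` and `pattern` are hypothetical, not part of the merge module.
md5_table <- function(dir, pattern = "\\.(rds|gz)$") {
  # Only pick up the merged outputs (.rds and .gz), not md5sum.txt itself
  files <- sort(list.files(dir, pattern = pattern, full.names = TRUE))
  data.frame(
    md5 = unname(tools::md5sum(files)),  # tools::md5sum ships with base R
    file = basename(files),
    stringsAsFactors = FALSE
  )
}

# Example call (path as used in the comparison in this comment):
# md5_table("analyses/merge-files/results")
```

The resulting table can then be compared by eye, or with `merge()`, against the entries in md5sum.txt.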

Here is the comparison of sample size between v2 and the above merged files (i.e. v3) - each file's sample size has increased by 2:

> # Counts
> counts_file = readRDS("data/Hope-gene-counts-rsem-expected_count-collapsed.rds")
> length(colnames(counts_file))
[1] 85

> counts_file = readRDS("analyses/merge-files/results/Hope-gene-counts-rsem-expected_count-collapsed.rds")
> length(colnames(counts_file))
[1] 87

> # TPM
> tpm_file = readRDS("data/Hope-gene-expression-rsem-tpm-collapsed.rds")
> length(colnames(tpm_file))
[1] 85

> tpm_file = readRDS("analyses/merge-files/results/Hope-gene-expression-rsem-tpm-collapsed.rds")
> length(colnames(tpm_file))
[1] 87

> # SNV
> snv_file <- data.table::fread("data/Hope-snv-consensus-plus-hotspots.maf.tsv.gz")
> length(unique(snv_file$Tumor_Sample_Barcode))
[1] 71

> snv_file <- data.table::fread("analyses/merge-files/results/Hope-snv-consensus-plus-hotspots.maf.tsv.gz")
> length(unique(snv_file$Tumor_Sample_Barcode))
[1] 73

> # SNV tumor-only 
> snv_tumor_only_file <- data.table::fread("data/Hope-tumor-only-snv-mutect2.maf.tsv.gz")
> length(unique(snv_tumor_only_file$Tumor_Sample_Barcode))
[1] 88

> snv_tumor_only_file <- data.table::fread("analyses/merge-files/results/Hope-tumor-only-snv-mutect2.maf.tsv.gz")
> length(unique(snv_tumor_only_file$Tumor_Sample_Barcode))
[1] 90

> # CNV
> cnv_file <- readRDS("data/Hope-cnv-controlfreec.rds")
> length(unique(cnv_file$Kids_First_Biospecimen_ID))
[1] 71

> cnv_file <- readRDS("analyses/merge-files/results/Hope-cnv-controlfreec.rds")
> length(unique(cnv_file$Kids_First_Biospecimen_ID))
[1] 73

> # CNV tumor-only
> cnv_tumor_only_file <- readRDS("data/Hope-cnv-controlfreec-tumor-only.rds")
> length(unique(cnv_tumor_only_file$Kids_First_Biospecimen_ID))
[1] 88

> cnv_tumor_only_file <- readRDS("analyses/merge-files/results/Hope-cnv-controlfreec-tumor-only.rds")
> length(unique(cnv_tumor_only_file$Kids_First_Biospecimen_ID))
[1] 90

> # Fusions
> fusion_file <- readRDS("data/Hope-fusion-putative-oncogenic.rds")
> length(unique(fusion_file$Sample))
[1] 85

> fusion_file <- readRDS("analyses/merge-files/results/Hope-fusion-putative-oncogenic.rds")
> length(unique(fusion_file$Sample))
[1] 87
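
The same checks could be wrapped in a small helper so they are easy to rerun for future versions. This is only a sketch: `count_samples` and `compare_versions` are hypothetical names, and the default directories simply mirror the paths used in the session above.

```r
# Number of samples in a merged file.
# Matrices (counts, TPM) have one column per sample; long-format tables
# (SNV, CNV, fusions) carry the sample ID in a column, passed as `id_col`.
count_samples <- function(path, id_col = NULL) {
  dat <- if (grepl("\\.rds$", path)) readRDS(path) else data.table::fread(path)
  if (is.null(id_col)) length(colnames(dat)) else length(unique(dat[[id_col]]))
}

# Compare one file between the old (v2) and new (v3) locations.
compare_versions <- function(file, id_col = NULL,
                             old_dir = "data",
                             new_dir = "analyses/merge-files/results") {
  old_n <- count_samples(file.path(old_dir, file), id_col)
  new_n <- count_samples(file.path(new_dir, file), id_col)
  data.frame(file = file, v2 = old_n, v3 = new_n, added = new_n - old_n)
}

# Example calls, using file and column names from the session above:
# compare_versions("Hope-gene-counts-rsem-expected_count-collapsed.rds")
# compare_versions("Hope-cnv-controlfreec.rds", id_col = "Kids_First_Biospecimen_ID")
# compare_versions("Hope-snv-consensus-plus-hotspots.maf.tsv.gz", id_col = "Tumor_Sample_Barcode")
```

Each call returns a one-row data frame, so the results for several files can be stacked with `rbind()` to get the whole comparison in one table.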

@komalsrathi komalsrathi linked a pull request Aug 1, 2024 that will close this issue