Update WGS & WXS independent specimens #597

Open
rjcorb opened this issue Jul 10, 2024 · 5 comments

@rjcorb

rjcorb commented Jul 10, 2024

What data file(s) does this issue pertain to?

independent-specimens.wgswxspanel.primary-plus.prefer.wgs.tsv

What release are you using?

v15

Put your question or report your issue here.

The following DNA-seq BS IDs should be added to patients, since they are the only DNA-seq samples available:

PT_YCMH9SQP: BS_A70G7S2W
PT_EV71W1JW: BS_YETTZ1NC

@komalsrathi

Quick question: why only those two samples? There appear to be 11 samples from 9 participants who have only metastatic samples (4 Targeted Sequencing and 7 WGS samples):

# Keep DNA tumor samples that are not "Derived Cell Line" or "PDX" and are not metastatic
tumor_samples <- histology_df %>%
  dplyr::filter(sample_type == "Tumor",
                !composition %in% c("Derived Cell Line", "PDX"),
                is.na(RNA_library),
                experimental_strategy %in% c("WGS", "WXS", "Targeted Sequencing"),
                !grepl("Metastatic secondary tumors", pathology_diagnosis))

# Keep metastatic DNA tumor samples, using the same criteria as above
tumor_samples_only_met <- histology_df %>%
  dplyr::filter(sample_type == "Tumor",
                !composition %in% c("Derived Cell Line", "PDX"),
                is.na(RNA_library),
                experimental_strategy %in% c("WGS", "WXS", "Targeted Sequencing"),
                grepl("Metastatic secondary tumors", pathology_diagnosis)) %>%
  # Drop participants who also have at least one non-metastatic tumor sample
  dplyr::filter(!Kids_First_Participant_ID %in% tumor_samples$Kids_First_Participant_ID)

> unique(tumor_samples_only_met$Kids_First_Participant_ID) %>% length()
[1] 9

> unique(tumor_samples_only_met$Kids_First_Participant_ID)
[1] "PT_EV71W1JW" "PT_S9M3JJVB" "PT_E6BGSP51" "PT_XN1P30ZC" "PT_087EW14F" "PT_QH6X1C3A" "PT_YCMH9SQP" "PT_GKHDNKMW" "PT_KXQR3GS4"

# check the tumor_samples_only_met biospecimens in histology file
histology_df %>%
  filter(Kids_First_Biospecimen_ID %in% tumor_samples_only_met$Kids_First_Biospecimen_ID) %>%
  dplyr::select(Kids_First_Participant_ID, Kids_First_Biospecimen_ID, sample_type, composition, experimental_strategy, RNA_library, pathology_diagnosis) %>%
  arrange(experimental_strategy)

# there are 4 Targeted Sequencing and 7 WGS samples

   Kids_First_Participant_ID Kids_First_Biospecimen_ID sample_type composition   experimental_strategy RNA_library pathology_diagnosis        
   <chr>                     <chr>                     <chr>       <chr>         <chr>                 <chr>       <chr>                      
 1 PT_E6BGSP51               BS_NW8WV3D1               Tumor       Solid Tissue  Targeted Sequencing   NA          Metastatic secondary tumors
 2 PT_087EW14F               BS_X91E07CQ               Tumor       Not Available Targeted Sequencing   NA          Metastatic secondary tumors
 3 PT_S9M3JJVB               BS_BH45SCWY               Tumor       Not Available Targeted Sequencing   NA          Metastatic secondary tumors
 4 PT_GKHDNKMW               BS_H1E7ZSYG               Tumor       Not Available Targeted Sequencing   NA          Metastatic secondary tumors
 5 PT_EV71W1JW               BS_YETTZ1NC               Tumor       Solid Tissue  WGS                   NA          Metastatic secondary tumors
 6 PT_S9M3JJVB               BS_233JPDBD               Tumor       Solid Tissue  WGS                   NA          Metastatic secondary tumors
 7 PT_E6BGSP51               BS_7F3V5AKH               Tumor       Solid Tissue  WGS                   NA          Metastatic secondary tumors
 8 PT_XN1P30ZC               BS_5VEEG4JT               Tumor       Solid Tissue  WGS                   NA          Metastatic secondary tumors
 9 PT_QH6X1C3A               BS_083RF2ZE               Tumor       Solid Tissue  WGS                   NA          Metastatic secondary tumors
10 PT_YCMH9SQP               BS_A70G7S2W               Tumor       Solid Tissue  WGS                   NA          Metastatic secondary tumors
11 PT_KXQR3GS4               BS_551ZH7EV               Tumor       Solid Tissue  WGS                   NA          Metastatic secondary tumors
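
Because some participants above have more than one metastatic sample, the participant count and the sample count differ. A minimal sketch of tallying both in one grouped pass is shown here. It runs on a made-up toy data frame (`toy_histology`, with invented IDs and diagnoses), not on the real histology file, and it assumes the rows have already been restricted to DNA tumor samples.

```r
library(dplyr)

# Hypothetical stand-in for the pre-filtered histology table; all values are invented.
toy_histology <- data.frame(
  Kids_First_Participant_ID = c("PT_A", "PT_A", "PT_B", "PT_B", "PT_C"),
  Kids_First_Biospecimen_ID = c("BS_1", "BS_2", "BS_3", "BS_4", "BS_5"),
  experimental_strategy     = c("WGS", "Targeted Sequencing", "WGS", "WGS", "WGS"),
  pathology_diagnosis       = c("Metastatic secondary tumors",
                                "Metastatic secondary tumors",
                                "Metastatic secondary tumors",
                                "Medulloblastoma",
                                "Metastatic secondary tumors")
)

# Keep only participants whose samples are all metastatic.
only_met <- toy_histology %>%
  group_by(Kids_First_Participant_ID) %>%
  filter(all(grepl("Metastatic secondary tumors", pathology_diagnosis, fixed = TRUE))) %>%
  ungroup()

# Participants and samples are counted separately.
n_participants  <- n_distinct(only_met$Kids_First_Participant_ID)
n_samples       <- nrow(only_met)
strategy_counts <- count(only_met, experimental_strategy)

print(n_participants)
print(n_samples)
print(strategy_counts)
```

In this toy example PT_B is excluded because one of its samples is not metastatic, which mirrors the `%in%` exclusion step used in the two-step filter above.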

@rjcorb
Author

rjcorb commented Jul 30, 2024

@komalsrathi I think only those two samples were included in a germline cohort we are working with, so I didn't realize there were others in the same category.

@komalsrathi

Oh okay, then I think it is acceptable to include the 11 samples listed above? If you agree with it then I'll create a PR with the changes.

@jharenza
Member

> Oh okay, then I think it is acceptable to include the 11 samples listed above? If you agree with it then I'll create a PR with the changes.

Hey @komalsrathi, @rjcorb and I were chatting about this, since we did not realize these were mets. I would like Jenn to look into these before we add them. I am not sure it makes sense to add them if the initial tumors may not have been in the brain. E.g., some are osteo or nbl, so I want to hear back from her about the initial diagnoses first. If they are initial solid tumors, we could possibly shift them out of the PBTA cohort and into the appropriate cohorts, or figure out some other way to handle them, while making sure they are still in the independent specimen list. So let's just pause on this one.

@komalsrathi

OK, sure. I reran and pushed the results to a new branch, but I will hold off on creating a PR.
