Proposed Analysis: Annotate SNV table of mutation frequencies #64
Comments
Hi @jharenza. Thank you for the analysis description. I have a few questions about how to generate the columns in the "Example Somatic Mutation Table" sheet of OT_SomaticTables_SNV_CNV.xlsx.
I will work on the frequencies first.
Hi @logstar Let's tackle
Ah, good question. I hadn't recalled this field, but the way we would create this is within Then, the
I think you have this almost right; there would be no end position in the above identifier.
Yes, let me update the Excel file; give me a few minutes.
Yes.
@jharenza Thank you for the detailed reply. I will work on I agree it is more informative to have the revised columns. If I understand correctly,
Yes, where total mutations is the total N of that specific mutation in the dataset.
Got it. Thank you for the quick reply. I will work on the analysis accordingly.
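For reference, the "total mutations" definition confirmed above could be sketched roughly as follows. This is only an illustration, not the module's code: the input shape (pairs of variant ID and patient ID), the choice to count each patient once per variant, and the function names `count_total_mutations` and `mutation_frequency` are all assumptions.

```python
from collections import defaultdict


def count_total_mutations(calls):
    """Count, per variant, how many distinct patients carry it.

    `calls` is a hypothetical iterable of (variant_id, patient_id) pairs.
    Counting distinct patients rather than raw rows is an assumption.
    """
    patients_by_variant = defaultdict(set)
    for variant_id, patient_id in calls:
        patients_by_variant[variant_id].add(patient_id)
    return {variant: len(patients) for variant, patients in patients_by_variant.items()}


def mutation_frequency(total_mutations, n_patients):
    """Divide each variant's total by the number of patients in the dataset."""
    if n_patients <= 0:
        raise ValueError("n_patients must be positive")
    return {variant: n / n_patients for variant, n in total_mutations.items()}
```

The same two steps could be repeated per cohort, cancer group, and primary/relapse subset by filtering `calls` before counting.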
Just wanted to make a note from the June 30th call (feel free to update/edit)
Thank you for the notes. I wonder where I can get the CosmicMutantExportCensus.tsv and RMTL table for annotation. Are they going to be available in future data releases?
RMTL will be (soon) provided in v6.
Got it. Thank you for the quick reply.
I am having problems downloading the CosmicMutantExportCensus.tsv file. I cannot access it via the Qiagen website as they suggest, so I emailed them asking for help. For now, proceed without this annotation.
Squashed commit of the following:

commit d87986b7ce1a517f4807430ce6beaac5950b50ca
Author: logstar <y.will.zhang@gmail.com>
Date: Wed Jun 30 17:21:59 2021 -0400

Rename mutation-frequencies to snv-frequencies

Rename module.

commit b2d2fd5c391b43214825e7b458d0edcb5ac22f1a
Author: logstar <y.will.zhang@gmail.com>
Date: Wed Jun 30 17:13:18 2021 -0400

Annotate SNV table with mutation frequencies

Issues addressed:
- <d3b-center/ticket-tracker-OPC#64>
- <d3b-center/ticket-tracker-OPC#8>. This issue is no longer compatible with the purpose of this module. This module intends to compute mutation frequencies for each variant, but this issue intends to compute the mutation frequencies for each gene. This issue is listed here for future reference.

commit 84cacf28927121037f4b9ba895e5baa5d12c7b31
Author: logstar <y.will.zhang@gmail.com>
Date: Wed Jun 30 16:23:20 2021 -0400

[WIP] Update run-mutation-frequencies.sh

commit 29ae8ef19f2339ae08f78c26ab42e6cf75d3556e
Author: logstar <y.will.zhang@gmail.com>
Date: Wed Jun 30 16:14:50 2021 -0400

[WIP] Generate annotated SNV frequency table

commit 2cb06741ca192f77a3043d03574649a184459b11
Author: logstar <y.will.zhang@gmail.com>
Date: Wed Jun 30 14:54:39 2021 -0400

[WIP] Replace NA with blank string

Also replaced HotSpot value 1 with Y and 0 with N.

commit 57776f61e576a5e3e2672370370fd1090f3aa478
Author: logstar <y.will.zhang@gmail.com>
Date: Wed Jun 30 14:02:13 2021 -0400

[WIP] Use mygene.info to query gene IDs

mygene.info seems to be actively maintained. The query results are more comprehensive than [biomaRt](https://bioconductor.org/packages/release/bioc/html/biomaRt.html).

Relevant URLs:
- <http://mygene.info/about>
- <https://bioconductor.org/packages/release/bioc/html/mygene.html>

mygene.info is suggested by @taylordm and @jharenza.

commit 76bb0f5236378648adce429e45d3827009735b58
Author: logstar <y.will.zhang@gmail.com>
Date: Wed Jun 30 10:55:30 2021 -0400

[WIP] Generate SNV frequency tables

Issue addressed: d3b-center/ticket-tracker-OPC#64
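As a rough illustration of the cleanup described in the "Replace NA with blank string" commit, a row-level version might look like the sketch below. It is not the committed code: the function name `clean_row`, the dict-per-row representation, and the assumption that missing values arrive as `None` or the literal string `NA` are all hypothetical; only the `HotSpot` column name and the 1/0 to Y/N mapping come from the commit message.

```python
def clean_row(row):
    """Return a copy of `row` with NA blanked and HotSpot recoded.

    Assumes missing values appear as None or the string "NA".
    """
    hotspot_codes = {"1": "Y", "0": "N"}
    cleaned = {}
    for column, value in row.items():
        if value is None or value == "NA":
            # Missing values become blank strings in the output table.
            value = ""
        elif column == "HotSpot":
            # Recode 1/0 (numeric or string) to Y/N; leave other values alone.
            value = hotspot_codes.get(str(value), value)
        cleaned[column] = value
    return cleaned
```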
@jharenza Is the following unfinished task included in the
If not, I could add the required annotation for the v7 annotator and
No, this was to use the COSMIC mutation evidence rather than the genes. I never heard back from them, so we can add it as a future ticket and enhancement if we hear back.
Got it. I think we could leave this ticket open as a reference. I will also submit two tickets for adding COSMIC mutation evidence to snv-frequencies and annotator, label them as blocked, and refer to this issue.
Closing with PR45 merged.
What are the scientific goals of the analysis?
Annotate the SNV TSV table of mutation frequencies per cohort+cancer group+primary/relapse as will be created in #8 for conversion to JSON format.
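A minimal sketch of the TSV to JSON step mentioned here, assuming the target is a JSON array with one object per table row and all values kept as strings. The required JSON layout is not specified in this issue, and the function name `tsv_to_json` is hypothetical.

```python
import csv
import io
import json


def tsv_to_json(tsv_text):
    """Convert tab-separated text with a header row into a JSON array string.

    Each data row becomes one object keyed by the header names.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return json.dumps(list(reader), indent=2)
```

If one record per line is needed instead, each row could be passed to `json.dumps` separately and the results joined with newlines.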
What methods do you plan to use to accomplish the scientific goals?
Annotate the table with headings as below:
OT_SomaticTables_SNV_CNV.xlsx
Much of this can be achieved by leveraging MAF fields corresponding to the exact variant calls. For ClinVar, we may need to download a version of the database.
Update June 30th
What input data are required for this analysis?
How long do you expect is needed to complete the analysis? Will it be a multi-step analysis?
2-3 days
Who will complete the analysis (please add a GitHub handle here if relevant)?
@logstar ?
What relevant scientific literature relates to this analysis?