update mutational signatures module #300

Merged: 25 commits, Jun 16, 2023


@rjcorb rjcorb commented Dec 19, 2022

Purpose/implementation Section

What scientific question is your analysis addressing?

The mutational signatures analysis module was updated to generate signature weight matrices from the latest versions of multiple mutational signature databases.

What was your approach?

  • 01-known_signatures.Rmd was updated to generate per-sample mutational signature weight matrices of COSMIC (v2 and v3.3) and Nature single-base substitution (SBS) signatures.
  • 02-cosmic_dbs_signatures.Rmd generates signature weight matrices of COSMIC v3.3 double-base substitution (DBS) signatures.
  • 03-fit_cns_signatures.R determines the relative contributions of 8 known (adult) CNS signatures in the OpenPedCan data (considers both WXS and WGS) using deconstructSigs::whichSignatures().
  • 04-explore_hypermutators.Rmd investigates whether hypermutant and/or ultra-hypermutant samples are enriched for signatures 3, 18, and/or MMR.
    • We also asked whether these samples show dysregulated TP53.
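
To illustrate what a per-sample signature weight is, here is a minimal sketch in Python (the module itself is written in R). It fits non-negative weights with simple multiplicative updates, which is a stand-in technique and not the algorithm used by deconstructSigs::whichSignatures(). The function name, the two toy signatures, and the three-context profile are hypothetical; real SBS profiles have 96 trinucleotide contexts.

```python
def fit_signature_weights(profile, signatures, n_iter=500):
    """Estimate non-negative weights w so that sum_k w[k] * signatures[k] ~ profile.

    profile: list of context fractions for one sample.
    signatures: {name: list of context fractions}, same length as profile.
    Uses multiplicative updates (a generic non-negative least-squares
    heuristic), NOT the deconstructSigs fitting procedure.
    """
    names = list(signatures)
    n_ctx = len(profile)
    weights = {name: 1.0 / len(names) for name in names}
    for _ in range(n_iter):
        # Reconstruct the profile from the current weights.
        recon = [sum(weights[n] * signatures[n][i] for n in names)
                 for i in range(n_ctx)]
        for n in names:
            num = sum(signatures[n][i] * profile[i] for i in range(n_ctx))
            den = sum(signatures[n][i] * recon[i] for i in range(n_ctx))
            if den > 0:
                weights[n] *= num / den
    return weights


# Toy data: two made-up signatures over three contexts.
signatures = {"sig_A": [0.6, 0.3, 0.1], "sig_B": [0.1, 0.2, 0.7]}
# Build a sample that is an exact 70/30 mixture of the two signatures.
profile = [0.7 * a + 0.3 * b
           for a, b in zip(signatures["sig_A"], signatures["sig_B"])]
weights = fit_signature_weights(profile, signatures)
print({name: round(w, 3) for name, w in weights.items()})
```

On this toy mixture the fitted weights come back close to 0.7 and 0.3; stacking one such weight vector per sample gives a sample x signature matrix like the ones the scripts write out.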

What GitHub issue does your pull request address?

Ticket #453

Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.

Which areas should receive a particularly close look?

  • Confirm that mutational signature weight matrices are generated as expected and in proper format.
  • Confirm that plots are generated as expected.

Is there anything that you want to discuss further?

  • This module currently does not run successfully due to dependencies not yet being installed in the Docker image. This will need to be resolved before the module can be run in full; for now, scripts 01, 02, and 04 can be run without errors.
  • Certain heatmaps generated as part of the OpenPBTA module (e.g., hypermutant signature heatmaps) were not generated here due to color palette files not being included in the OpenPedCan repo. However, this code is still included, commented out, and can be run once the palettes are added.

Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?

  • Yes

Results

What types of results are included (e.g., table, figure)?

  • {DATABASE}_signature_exposures.tsv files store a sample x signature weight matrix of mutational signatures from each database.
  • {DATABASE}_signature_results.tsv files store number of signature mutations per Mb for each sample and signature.
  • fitted_exposure_signal-cns-deconstructSigs.rds and deconstructSigs_cns_exposure_merged.tsv store CNS mutational signature results and results merged with histology data, respectively.
  • hypermutator_sig_matrix.tsv includes CNS mutational signature weights for samples defined as hypermutants.
  • sig_matrix_by_molecular_subtype.tsv stores CNS mutational signature weights and includes molecular subtype to assess TP53 mutant mutational signatures.
  • bubble_matrix_{DATABASE}_mutation_sig.png; plots showing prevalence of each mutational signature in each histology group by database queried.
  • {HISTOLOGY_GROUP}_{DATABASE}_mutation_sig.png; grouped bar charts showing mutational signature weights by histology group and database.
  • {BS_ID}_{DATABASE}_mutation_sig.png; per-sample barplots showing contributions of top 5 mutational signatures and error, by database queried.
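
As a rough illustration of how the exposures and results tables relate, the sketch below converts one sample's signature weights into signature mutations per Mb. It assumes that per-signature counts are the weight times the sample's total mutation count, divided by the size of the surveyed region in Mb; that formula, the function name, and the numbers are assumptions for illustration and are not taken from the module's code.

```python
def signature_mutations_per_mb(exposures, total_mutations, region_size_mb):
    """Convert signature weights for one sample into mutations per Mb.

    exposures: {signature: weight}, weights are fractions of the sample's mutations.
    total_mutations: total mutation count for the sample.
    region_size_mb: size of the surveyed region in megabases (assumed input).
    """
    if region_size_mb <= 0:
        raise ValueError("region_size_mb must be positive")
    return {
        sig: weight * total_mutations / region_size_mb
        for sig, weight in exposures.items()
    }


# Toy values only; not results from this pull request.
example = signature_mutations_per_mb(
    {"SBS1": 0.5, "SBS3": 0.25}, total_mutations=120, region_size_mb=30.0
)
print(example)
```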

What is your summary of the results?

  • Top COSMIC SBS mutational signatures among all tumors include those associated with clock-like signatures (SBS1+SBS5), ROS Damage (SBS18), Aflatoxin (SBS24), and Defective HR (SBS3).
  • Top COSMIC DBS mutational signatures among all tumors include those associated with POLE mutations (DBS3), Defective MMR (DBS7), and tobacco (DBS2).
  • Top CNS signatures among hypermutant samples include those associated with MMR (MMR2, 18).

Reproducibility Checklist

  • The dependencies required to run the code in this pull request have been added to the project Dockerfile.
  • This analysis has been added to continuous integration.

Documentation Checklist

  • This analysis module has a README and it is up to date.
  • This analysis is recorded in the table in analyses/README.md and the entry is up to date.
  • The analytical code is documented and contains comments.

@jharenza
Member

jharenza commented Jan 13, 2023

Hi @rjcorb!

Based on some offline discussions with the Shlien lab, Sharon, and you - I think we should try the following approach for subsetting v3 mutational signatures pan-cancer:

  • Remove sequencing artifact signatures
  • Remove unknown signatures
  • Remove environmental exposure signatures
  • Remove Signature 39 (masks some small, but real effects of SBS3)

Please see the code here for the mapping file and example script for commands for signature inclusion.
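
The exclusion rules listed above could be sketched roughly as follows in Python. This is not the linked mapping file or example script: the category labels, the function name, and the small mapping are hypothetical placeholders, and a real run would read categories from the mapping file.

```python
# Category labels are hypothetical; a real mapping file may use other names.
EXCLUDED_CATEGORIES = {"sequencing_artifact", "unknown", "environmental_exposure"}
# Signature 39 is dropped explicitly, regardless of its category.
EXCLUDED_SIGNATURES = {"SBS39"}


def select_signatures(signature_map):
    """Return the signature names that survive the exclusion rules.

    signature_map: {signature name: category label}.
    """
    return [
        name
        for name, category in signature_map.items()
        if category not in EXCLUDED_CATEGORIES and name not in EXCLUDED_SIGNATURES
    ]


# Illustrative mapping only; category assignments here are placeholders.
toy_map = {
    "SBS1": "clock_like",
    "SBS3": "defective_hr",
    "SBS17a": "unknown",
    "SBS18": "ros_damage",
    "SBS24": "environmental_exposure",
    "SBS39": "other",
    "SBS45": "sequencing_artifact",
}
print(select_signatures(toy_map))
```

The retained names would then be used to subset the reference signature matrix before fitting.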

@rjcorb
Author

rjcorb commented Jan 17, 2023

@jharenza the analysis has been re-run with the filtered v3 signature subset. As expected, average SBS3 exposure levels are higher in the filtered analysis than in the unfiltered one (attached). Let me know if there are any other comparisons you'd like to see between runs.

cosmic_v3.3_exposures_filtered_vs_unfiltered.txt

@jharenza
Member

jharenza commented May 4, 2023

@rjcorb let's add this as an agenda item to discuss next week

@rjcorb
Author

rjcorb commented Jun 7, 2023

@jharenza the module has been re-run on v12 data and is ready for review. The deconstructSigs package needs to be updated to run the module in full, and a PR has been submitted to update the Dockerfile accordingly.

Member

@jharenza jharenza left a comment


Hi @rjcorb the docker image is merged and this is ready for a rerun.

Two immediate things.

  1. analyses/mutational-signatures/plots/nature/bubble_matrix_nature_mutation_sig.png has non-signature rows
  2. I am wondering if we should save the images within a dataframe/rds file instead of committing them. I saw somewhere, but cannot find, where there is a way now to put images into a dataframe... just thinking of how to save some space. For now, perhaps you can put the broad histology plots in scratch?

analyses/mutational-signatures/01-known_signatures.Rmd (outdated, resolved)
@rjcorb
Author

rjcorb commented Jun 16, 2023

@jharenza I have updated the module with requested changes

@jharenza jharenza self-requested a review June 16, 2023 14:32
@jharenza
Member

going to merge this without a second GA, since the only merge since was docs

@jharenza jharenza merged commit 660cf72 into dev Jun 16, 2023
@jharenza jharenza deleted the update-mutational-signatures branch June 16, 2023 17:17