<?xml version="1.0"?>
<tool id="mutSpecnmf" name="MutSpec NMF" version="0.0.1">
<description>Extract mutation signatures with the non-negative matrix factorization (NMF) algorithm</description>
<requirements>
<requirement type="set_environment">SCRIPT_PATH</requirement>
<requirement type="package" version="5.18.1">perl</requirement>
<requirement type="package" version="3.2.1">R</requirement>
<requirement type="package" version="2.11.1">fontconfig</requirement>
<requirement type="package" version="1.7.1">numpy</requirement>
<requirement type="package" version="0.1">mutspec</requirement>
</requirements>
<command interpreter="Rscript --no-save">
R/somaticSignature_Galaxy.r --nbSign $nbsign --cpu 8 --output ${html_file.files_path} --html $html_file
#if $refGenomeSource.source == "html":
--input ${refGenomeSource.reportHTML.extra_files_path}/Mutational_Analysis/Figures/Input_NMF/Input_NMF_Count.txt
#else
--input ${refGenomeSource.matrix}
#end if
</command>
<inputs>
<conditional name="refGenomeSource">
<param name="source" type="select" label="Input a MutSpec Stats report or a matrix" help="You may select either a report generated by MutSpec-Stats or a tab-delimited text matrix">
<option value="html">Dataset generated by the tool MutSpec-Stats</option>
<option value="tab">Tab-delimited matrix</option>
</param>
<when value="html">
<param name="reportHTML" type="data" format="html" label="Input dataset" help="Select a report generated by the MutSpec-Stats tool"/>
</when>
<when value="tab">
<param name="matrix" type="data" format="tabular" label="Input matrix" help="Select a matrix formatted as shown further below"/>
</when>
</conditional>
<param name="nbsign" type="integer" value="2" min="2" label="Number of expected signatures" help="Minimum: 2" />
</inputs>
<outputs>
<data name="html_file" format="html" label="NMF result on ${on_string} ($nbsign signatures)" />
</outputs>
<stdio>
<regex match="Error" source="stderr" level="fatal" description="Read error message for more details" />
<regex match="Unable to create the directory" source="stderr" level="fatal" description="Unable to create the directory" />
<regex match="Permission denied" source="stderr" level="fatal" description="Read error message for more details" />
</stdio>
<help>
**What it does**
Extracts mutation signatures composed of 96 single base substitution (SBS) types (6 SBS types in their trinucleotide sequence context) using the non-negative matrix factorisation (`NMF`__) algorithm of Brunet with the Kullback-Leibler divergence penalty, as implemented in an `R package`__.
.. __: http://www.nature.com/nature/journal/v401/n6755/full/401788a0.html
.. __: http://www.biomedcentral.com/1471-2105/11/367
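For readers unfamiliar with the method, the following is a minimal NumPy sketch of multiplicative-update NMF under the generalised Kullback-Leibler divergence. It is an illustration only and not the code run by this tool, which relies on the R package cited above; the function names ``nmf_kl`` and ``kl_divergence``, the iteration count and the random initialisation are all assumptions made for the example.

```python
import numpy as np


def nmf_kl(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (96 SBS types x samples) by W @ H.

    W holds the signature profiles (types x rank) and H the contribution
    of each sample to each signature (rank x samples).
    Hypothetical helper for illustration; not the tool's implementation.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    for _ in range(n_iter):
        # Multiplicative update of H, normalised by the column sums of W.
        H *= (W.T @ (V / (W @ H + eps))) / (W.sum(axis=0)[:, None] + eps)
        # Multiplicative update of W, normalised by the row sums of H.
        W *= ((V / (W @ H + eps)) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H


def kl_divergence(V, WH, eps=1e-9):
    """Generalised Kullback-Leibler divergence between V and W @ H."""
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))
```

Calling ``nmf_kl(V, rank=2)`` on a 96-row count matrix returns two non-negative factors; the divergence between ``V`` and ``W @ H`` gives a rough measure of how well the chosen number of signatures fits the data.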
--------------------------------------------------------------------------------------------------------------------------------------------------
**Input formats**
The tool accepts an HTML report produced by the MutSpec-Stats tool or a matrix of mutation counts in tab-delimited text format (see the example below).
.. class:: warningmark
If the input is a matrix of mutation counts, the sum of the counts in each row must be non-zero.
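A matrix can be checked against this constraint before upload. The sketch below uses only the Python standard library and is not part of the tool; the helper name ``zero_sum_rows`` is hypothetical, and it assumes the first line is a header of sample names and the first column holds the row labels.

```python
import csv
import io


def zero_sum_rows(handle):
    """Return the labels of rows whose mutation counts sum to zero."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header line holding the sample names
    bad = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        label, counts = row[0], [float(x) for x in row[1:]]
        if sum(counts) == 0:
            bad.append(label)
    return bad


# Small in-memory example; a real file would be opened with open(path, newline="").
example = io.StringIO(
    "\tSample_1\tSample_2\tSample_3\n"
    "A[C>A]A\t4\t3\t1\n"
    "A[C>T]A\t0\t0\t0\n"
)
print(zero_sum_rows(example))
```

Any label returned by the helper identifies a row that should be fixed or removed before running the tool.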
--------------------------------------------------------------------------------------------------------------------------------------------------
**Output**
The tool produces matrices and graphs representing the composition of the mutation signatures found by NMF (matrix W) and the contribution of each sample to the signatures (matrix H). It also produces a matrix that can be used with the MutSpec-compare tool to compare the identified signatures with known signatures.
--------------------------------------------------------------------------------------------------------------------------------------------------
**Example: matrix of mutation counts (96 rows plus a header with the sample names)**
+--------+----------+----------+----------+
|        | Sample_1 | Sample_2 | Sample_3 |
+========+==========+==========+==========+
|A[C>A]A | 4        | 3        | 1        |
+--------+----------+----------+----------+
|A[C>T]A | 2        | 1        | 0        |
+--------+----------+----------+----------+
|A[C>G]A | 13       | 2        | 1        |
+--------+----------+----------+----------+
|A[T>A]A | 10       | 3        | 6        |
+--------+----------+----------+----------+
|A[T>C]A | 9        | 6        | 1        |
+--------+----------+----------+----------+
|A[T>G]A | 2        | 1        | 0        |
+--------+----------+----------+----------+
| ...                                     |
+--------+----------+----------+----------+
|T[C>A]T | 5        | 2        | 2        |
+--------+----------+----------+----------+
|T[C>G]T | 5        | 2        | 0        |
+--------+----------+----------+----------+
|T[C>T]T | 11       | 4        | 2        |
+--------+----------+----------+----------+
|T[T>A]T | 3        | 0        | 5        |
+--------+----------+----------+----------+
|T[T>C]T | 39       | 17       | 1        |
+--------+----------+----------+----------+
|T[T>G]T | 12       | 8        | 1        |
+--------+----------+----------+----------+
--------------------------------------------------------------------------------------------------------------------------------------------------
**Contact**
ardinm@fellows.iarc.fr; cahaisv@iarc.fr
--------------------------------------------------------------------------------------------------------------------------------------------------
**Code**
The source code is available on `GitHub`__
.. __: https://github.com/IARCbioinfo/mutspec.git
</help>
<citations>
<citation type="bibtex">
@article{ardin_mutspec:_2016,
title = {{MutSpec}: a Galaxy toolbox for streamlined analyses of somatic mutation spectra in human and mouse cancer genomes},
volume = {17},
issn = {1471-2105},
doi = {10.1186/s12859-016-1011-z},
shorttitle = {{MutSpec}},
abstract = {{BACKGROUND}: The nature of somatic mutations observed in human tumors at single gene or genome-wide levels can reveal information on past carcinogenic exposures and mutational processes contributing to tumor development. While large amounts of sequencing data are being generated, the associated analysis and interpretation of mutation patterns that may reveal clues about the natural history of cancer present complex and challenging tasks that require advanced bioinformatics skills. To make such analyses accessible to a wider community of researchers with no programming expertise, we have developed within the web-based user-friendly platform Galaxy a first-of-its-kind package called {MutSpec}.
{RESULTS}: {MutSpec} includes a set of tools that perform variant annotation and use advanced statistics for the identification of mutation signatures present in cancer genomes and for comparing the obtained signatures with those published in the {COSMIC} database and other sources. {MutSpec} offers an accessible framework for building reproducible analysis pipelines, integrating existing methods and scripts developed in-house with publicly available R packages. {MutSpec} may be used to analyse data from whole-exome, whole-genome or targeted sequencing experiments performed on human or mouse genomes. Results are provided in various formats including rich graphical outputs. An example is presented to illustrate the package functionalities, the straightforward workflow analysis and the richness of the statistics and publication-grade graphics produced by the tool.
{CONCLUSIONS}: {MutSpec} offers an easy-to-use graphical interface embedded in the popular Galaxy platform that can be used by researchers with limited programming or bioinformatics expertise to analyse mutation signatures present in cancer genomes. {MutSpec} can thus effectively assist in the discovery of complex mutational processes resulting from exogenous and endogenous carcinogenic insults.},
pages = {170},
number = {1},
journaltitle = {{BMC} Bioinformatics},
author = {Ardin, Maude and Cahais, Vincent and Castells, Xavier and Bouaoun, Liacine and Byrnes, Graham and Herceg, Zdenko and Zavadil, Jiri and Olivier, Magali},
date = {2016},
pmid = {27091472},
keywords = {Galaxy, Mutation signatures, Mutation spectra, Single base substitutions}
}
</citation>
</citations>
</tool>