A simple module that provides parsers for diagnostic variant information as part of the Molecular Tumor Board data provisioning in Tübingen.
File formats for
Python class info
- SnvParser
Parsing an SNV file that follows the formats described above is fairly simple. Create an SnvParser object with the path to the TSV file and specify its type by providing the matching header (SSnvHeader, GSnvHeader, ...).
from mtbparser.snv_parser import SnvParser
from mtbparser.snv_utils import SSnvHeader
# Path to a valid SNV tsv file, as specified in
# the file format documentation.
somatic_snv_file = "/path/to/mySSnv.tsv"
# Create parser object for somatic SNVs
parser = SnvParser(somatic_snv_file, SSnvHeader)
# Iterate through parsed SNV items and get the gene name
for snv_item in parser.getSNVs():
    print(snv_item.get_snv_info(SSnvHeader.GENE.name))
This code was implemented at the Quantitative Biology Center. Please contact the author, @sven1103, for more information.
Renamed enums so that their names and values are equal.
You can now extract a copy of the complete SNV item information as a dictionary.
Bugfix: parsing of empty first columns failed because strip() was used. Columns can be empty if they contain no information, so trimming leading whitespace produces a wrong total column count. We now use rstrip() to remove trailing newline characters and whitespace.
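The difference between the two calls can be reproduced with plain Python strings. The following is a minimal sketch, not the parser's actual code; the example row and variable names are made up for illustration.

```python
# Hypothetical tab-separated row whose first column is empty.
line = "\tBRAF\tp.V600E\n"

# strip() also removes the leading tab, so the empty first column is lost.
columns_strip = line.strip().split("\t")

# rstrip() only removes trailing whitespace, so the leading empty column survives.
columns_rstrip = line.rstrip().split("\t")

print(len(columns_strip), columns_strip)
print(len(columns_rstrip), columns_rstrip)
```

Note that rstrip() without arguments also removes trailing tabs, so empty columns at the end of a row would still be dropped in this sketch.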
Fixed typo in 'tumor_content' definition in the metadata section.
Same as v0.2.2, but the version had to be renamed on PyPI.
Reading of files failed because Python expected ASCII encoding by default. Files are now explicitly opened with utf-8 encoding.
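Opening a file with an explicit encoding can look like the sketch below. It is an illustration with a temporary file and made-up content, not the module's own reading code. Without the encoding argument, Python falls back to a platform- and locale-dependent default, which may be ASCII in some environments.

```python
import os
import tempfile

# Hypothetical file content with a non-ASCII character in a tab-separated row.
content = "GENE\tCOMMENT\nBRAF\tTübingen\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.tsv")

    # Write the example file as UTF-8.
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(content)

    # Read it back with an explicit encoding instead of the platform default.
    with open(path, "r", encoding="utf-8") as handle:
        rows = [line.rstrip("\n").split("\t") for line in handle]

print(rows)
```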
Installation with pip failed because the DESCRIPTION.rst for the module description was not provided in the sdist package.
The first production-ready version of mtbparser, which is also deployed on PyPI. Only updated documentation and proper formatting for PyPI deployment.
A first development version. It should work, the tests pass, and coverage is > 90%.