From 84f800fbcff6a36cef8700621967993b35c47aba Mon Sep 17 00:00:00 2001
From: Emanuel Burgos
Date: Wed, 16 Jun 2021 11:42:21 -0500
Subject: [PATCH] Final changes for v0.2.1 release (#79)

* Final changes for 0.2.1 release

* Increase version number: 0.2.0 -> 0.2.1

* Add new args of barcode_length and transposon_seq

* Adding barcode_length and transposon_seq arguments

Now the user can define the barcode index length from 4bp - 16bp and provide their own transposon sequence for regex searches as an argument to pyinseq

* Add tests for new args

* Disclaimer of future snakemake version
---
 CHANGELOG.md      | 23 +++++++++++++++++------
 README.md         |  7 +++++++
 mkdocs.yml        |  2 +-
 pyinseq/runner.py |  4 ++--
 4 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 806cfab..7c25e21 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,15 +1,26 @@
 # Changelog
 
-## [Unreleased]
+## [0.2.1] - 2021-06-02
 ### Fixed
 - `pyinseq` alone brings up the help documentation
-- Small fix to the `three_primeness` calculation.
+- Small fix to the `three_primeness` calculation.
+ A minimum of 3 reads is now required per site, and a Left:Right max read ratio of 10-fold to be tallied.
+
 ### Changed
-- Only Python 3.6+ supported
+- Only Python 3.6 and 3.7 are supported.
+- `screed` module is used for opening/writing fastq files.
+
 ### Added
-- `pyinseq genomeprep` command
-- Added T50 calculation
-- Added progress bar for `demultiplex` function
+- `pyinseq genomeprep` subcommand will prepare genome files for pyinseq run. Also checks GenBank files before running.
+- Added T50 calculation for sites files.
+- Added progress bar for `demultiplex` function and for `writing` reads to sample files.
+- `test_script.py` now compares directories and files from `pyinseq` runs to the expected output.
+- Parameter `--min_counts`: minimum number of reads at a site required to be tallied. Default is 3
+- Parameter `--max_ratio`: max ratio allowed between left and right reads around a TA insertion site. Default is 10-fold.
+- Parameter `--transposon_seq`: define transposon sequence that is found at end of reads to help in demultiplexing. Default is ACAGGTTG
+- Parameter `--barcode_length`: length of barcode index used to demultiplex reads into samples, allows for 4 - 16 nt. Default is 4.
+- Parameter `--gff3`: enables `pyinseq` to write gff3 version of genome files.
+
 
 ## [0.2.0] - 2017-07-16
 ### Added
diff --git a/README.md b/README.md
index 38993a9..93e4cf9 100644
--- a/README.md
+++ b/README.md
@@ -2,8 +2,15 @@
 ![Python 3.6](https://img.shields.io/badge/python-3.6-blue.svg)
 ![Python 3.7](https://img.shields.io/badge/python-3.7-blue.svg)
 
+
 # pyinseq
 
+### Disclaimer
+
+A new version of `pyinseq` that works with `snakemake` will be released on `bioconda` in the Summer of 2021
+
+### About
+
 Lightweight package to map transposon insertion sequencing (INSeq) data in
 bacteria.
 
diff --git a/mkdocs.yml b/mkdocs.yml
index a62aba0..d96e3cd 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -1,6 +1,6 @@
 site_name: pyinseq
 theme: cinder
-copyright: 2015-2018 Mark Mandel and Contributors
+copyright: 2015-2021 Mark Mandel and Contributors
 
 nav:
 - Home: index.md
diff --git a/pyinseq/runner.py b/pyinseq/runner.py
index d0869eb..c2b11a1 100644
--- a/pyinseq/runner.py
+++ b/pyinseq/runner.py
@@ -109,12 +109,12 @@ def demultiplex_parse_args(args):
     )
     parser.add_argument(
         "--barcode_length",
-        help="Length of the barcode which is used to demultiplex samples (4 - 16)",
+        help="Length of the barcode which is used to demultiplex samples (4 - 16 nt). Default is 4.",
         default=4,
     )
     parser.add_argument(
         "--transposon_seq",
-        help="Sequence for the transposon that flanks reads",
+        help="Sequence for the transposon that flanks reads. Default is ACAGGTTG.",
         default="ACAGGTTG",
     )
     return parser.parse_args(args)
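
Note on the new `--barcode_length` argument: the help text documents a 4 - 16 nt range, but the `add_argument` call shown in this patch sets no `type`, so argparse itself would hand back a command-line value as a string and would not check the range at parse time (the check may well happen elsewhere in pyinseq). The following is only a minimal sketch of how the documented range could be enforced in the parser. The validator `barcode_length_type` and the stand-in `build_parser` are hypothetical names for illustration and are not part of pyinseq.

```python
import argparse


def barcode_length_type(value: str) -> int:
    """Hypothetical validator: accept only integers in the documented 4-16 nt range."""
    length = int(value)  # a ValueError here is reported by argparse as a usage error
    if not 4 <= length <= 16:
        raise argparse.ArgumentTypeError(
            f"barcode length must be between 4 and 16 nt, got {length}"
        )
    return length


def build_parser() -> argparse.ArgumentParser:
    """Stand-in parser carrying only the two arguments touched by this patch."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument(
        "--barcode_length",
        help="Length of the barcode which is used to demultiplex samples (4 - 16 nt). Default is 4.",
        type=barcode_length_type,  # assumption: not present in the patched code
        default=4,
    )
    parser.add_argument(
        "--transposon_seq",
        help="Sequence for the transposon that flanks reads. Default is ACAGGTTG.",
        default="ACAGGTTG",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--barcode_length", "8"])
    print(args.barcode_length, args.transposon_seq)
```

With a `type` callable in place, an out-of-range value such as `--barcode_length 20` is rejected by argparse with a usage error instead of reaching the demultiplexing step.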