Cenote-Taker 2 Version 2.1.2

@mtisza1 mtisza1 released this 21 Apr 19:23
· 91 commits to master since this release

If you haven't already installed Cenote-Taker 2, please follow the installation instructions in the README. If you have already installed it, please run:

conda activate cenote-taker2_env
conda install -c bioconda biopython bedtools
cd Cenote-Taker2
git pull

Then update the HMM database.
Thank you.

This release improves the annotation and outputs of Cenote-Taker 2. Here is a fairly comprehensive list of changes:

  1. BLASTN can be used to determine if your sequence belongs to an extant virus species based on 95% Average Nucleotide Identity (ANI) and 85% Alignment Fraction (AF), per community standards. This module requires the GenBank nt database, the GenBank virus nucleotide database, or some subset thereof. If a sequence has at least 95% ANI and 85% AF to a virus, the taxonomy/organism name will be changed to match the GenBank entry. This module uses anicalc.py from CheckV; see the license and copyright in the anicalc directory.
  2. ORFs that overlap tRNAs are now removed to comply with GenBank guidelines. ORFs that are cut off by the end of a contig are now properly formatted per GenBank guidelines.
  3. "Messy" gene names are largely improved to comply with GenBank guidelines.
  4. Organism/Taxonomy and BLASTN info are now included in the summary .tsv file.
  5. Cenote-Taker 2 uses more refined gene content searches to identify putative conjugative transposons. Also, genes that Cenote-Taker 2 flags as conjugative machinery are output as a .gtf file in the sequin_and_genome_maps directory.
  6. Cenote-Taker 2 will now take a CRISPR spacer hit table as an optional input, and will put CRISPR spacer hit info in the note of the genome output files. The format required is a tab-separated table:
    CONTIG_NAME HOST_NAME NUMBER_OF_HITS
    e.g.
    my_contig_1 bacteroides 9
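
As a rough illustration of items 1 and 6, the sketch below checks the species-level cutoffs and reads a spacer hit table in the three-column layout shown above. It is not Cenote-Taker 2's own code: the function names `is_same_species` and `read_spacer_hits` are hypothetical, ANI and AF are assumed to be percentages on a 0-100 scale, and rows whose third column is not an integer (for example, a header line, if your table has one) are simply skipped.

```python
import csv
import io

# Cutoffs quoted in item 1 (assumed here to be on a 0-100 percent scale).
MIN_ANI = 95.0
MIN_AF = 85.0


def is_same_species(ani, af):
    """Return True when a hit has at least 95% ANI and at least 85% AF."""
    return ani >= MIN_ANI and af >= MIN_AF


def read_spacer_hits(handle):
    """Read CONTIG_NAME<TAB>HOST_NAME<TAB>NUMBER_OF_HITS rows.

    Returns a dict mapping each contig name to a list of (host, hits) tuples.
    """
    hits = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines, malformed rows, and any non-numeric header row.
        if len(row) != 3 or not row[2].strip().isdigit():
            continue
        contig, host, count = row
        hits.setdefault(contig, []).append((host, int(count)))
    return hits


# The example row from the release notes, with explicit tab characters.
example = io.StringIO("my_contig_1\tbacteroides\t9\n")
print(read_spacer_hits(example))
print(is_same_species(96.2, 90.0))
```

Running a table through a check like this before passing it to Cenote-Taker 2 can catch rows that were separated with spaces instead of tabs, since those rows will not split into three columns.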

Best,

Mike