with notes on assembling with Trinity
Mikhail Matz, matz@utexas.edu
UT Austin, April 2015
This is a loose collection of scripts to annotate a de novo assembled transcriptome with:
- gene names
- GO terms
- KOG terms
- KEGG terms
In addition, the script CDS_extractor_v2.pl uses blastx results to extract CDS regions and encoded protein sequences while correcting occasional frameshifts.
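To illustrate the basic idea of using a blastx hit to pull out a coding region, here is a minimal Python sketch. It is not the logic of CDS_extractor_v2.pl and it does not attempt frameshift correction. The function names are hypothetical, and it assumes tabular blastx output where query coordinates are 1-based and inclusive, with qstart greater than qend for hits on the minus strand.

```python
# Standard genetic code, laid out in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (N stays N)."""
    return seq.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def translate(seq):
    """Translate complete codons; codons with ambiguous bases become X."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def extract_cds(contig, qstart, qend):
    """Cut the hit region out of a contig and translate it.

    Assumption: qstart/qend are 1-based inclusive blastx query coordinates,
    and qstart > qend signals a minus-strand hit.
    """
    low, high = sorted((qstart, qend))
    region = contig[low - 1:high]
    if qstart > qend:
        region = reverse_complement(region)
    # Drop any trailing partial codon so the region is a whole number of codons.
    region = region[: len(region) - len(region) % 3]
    return region, translate(region)


if __name__ == "__main__":
    cds, protein = extract_cds("GGATGAAAGGCTAACC", 3, 14)
    print(cds, protein)
```

A real extractor also has to merge multiple hit segments per contig and repair frameshifts between them, which this sketch leaves out.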
Also included are solutions for assessing two quality metrics of the transcriptome: contiguity and completeness.
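The way contiguity and completeness are measured by the included scripts is described in the walkthrough. As a separate point of comparison, a length-based summary such as N50 is often reported alongside such metrics. The sketch below is an assumption-laden illustration, not one of the included scripts; the function names are hypothetical.

```python
def fasta_lengths(lines):
    """Return a list of sequence lengths from an iterable of FASTA lines."""
    lengths = []
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contig lengths given")


if __name__ == "__main__":
    demo = [">c1", "ACGTAC", ">c2", "ACG", "TT", ">c3", "AC"]
    print(n50(fasta_lengths(demo)))
```

N50 describes only the contig length distribution, so it does not replace metrics that compare contigs against reference proteins.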
Please see the "annotating transcriptome.txt" file for a detailed walkthrough.