VERY DEPRECATED

Use eggNOG emapper.py scripts instead

This is a process for sorting a whole proteome into eggNOG HMM profiles using hmmscan. It produces the following outputs:

A flat file of the best hit protein - HMM profile matches with annotations (tophit)
A flat file of all protein - HMM profile matches with annotations (annotated)
A flat file of all protein - HMM matches (all)
A flat file of proteins with no HMM hit (nohit)
The input FASTA annotated with the eggNOG HMM group annotations, written to the proteomes/[species] folder
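
As a rough illustration of how the tophit and nohit outputs relate to the full set of matches, here is a minimal sketch. It assumes the matches come from hmmscan's `--tblout` table (target name in column 1, query name in column 3, full-sequence E-value and score in columns 5 and 6); whether this repository's scripts use that format is not stated here, and the names `Hit`, `parse_tblout`, `top_hits`, and `no_hits` are hypothetical.

```python
from typing import Dict, Iterable, List, NamedTuple


class Hit(NamedTuple):
    protein: str   # query name: the protein sequence ID
    profile: str   # target name: the HMM profile
    evalue: float  # full-sequence E-value
    score: float   # full-sequence bit score


def parse_tblout(text: str) -> List[Hit]:
    """Parse hmmscan --tblout text into one Hit per data row."""
    hits = []
    for line in text.splitlines():
        # Skip blank lines and the '#' comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        # 18 whitespace-separated columns, then a free-text description.
        fields = line.split(maxsplit=18)
        hits.append(Hit(protein=fields[2], profile=fields[0],
                        evalue=float(fields[4]), score=float(fields[5])))
    return hits


def top_hits(hits: Iterable[Hit]) -> Dict[str, Hit]:
    """Keep the best hit per protein: lowest E-value, then highest score."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        current = best.get(hit.protein)
        if current is None or (hit.evalue, -hit.score) < (current.evalue, -current.score):
            best[hit.protein] = hit
    return best


def no_hits(all_proteins: Iterable[str], hits: Iterable[Hit]) -> List[str]:
    """Return proteins from the input proteome that matched no profile."""
    matched = {hit.protein for hit in hits}
    return [p for p in all_proteins if p not in matched]
```

Ranking by E-value first and bit score second is one reasonable choice for "best hit"; the original scripts may rank differently.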

Instructions

1. Place the annotation and HMM files for a phylogenetic level in hmms/

ex. euNOG_hmm.tar.gz and euNOG.annotations.tsv.gz

2. From the main directory run: bash master_scripts/startPress.sh [level]

ex. bash master_scripts/startPress.sh euNOG

This step takes about 10 minutes on TACC

3. Make a directory for the species that you want to run in proteomes/ ex. mkdir proteomes/arath
4. Place the species' FASTA in its folder in proteomes/ ex. proteomes/arath/uniprot-proteome%3AUP000006548.fasta
5. After the HMMs are pressed, from the main directory run: bash master_scripts/startHmmscan.sh [species] [proteome] [level]

ex. bash master_scripts/startHmmscan.sh arath proteomes/arath/uniprot-proteome%3AUP000006548.fasta euNOG

This step takes up to 20 hours on TACC, depending on the number of proteins and HMM profiles
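
For orientation, the press and scan steps presumably wrap the HMMER commands `hmmpress` and `hmmscan`. The sketch below only builds the argument lists for those two commands; it is not the content of startPress.sh or startHmmscan.sh, and the function names and the example file paths are hypothetical.

```python
from typing import List


def press_command(hmm_db: str) -> List[str]:
    """Arguments for hmmpress, which indexes an HMM database for hmmscan."""
    return ["hmmpress", hmm_db]


def scan_command(hmm_db: str, fasta: str, tblout: str, cpus: int = 1) -> List[str]:
    """Arguments for hmmscan: search each protein in `fasta` against `hmm_db`.

    --tblout writes one row per protein/profile match to `tblout`;
    --cpu sets the number of worker threads.
    """
    return ["hmmscan", "--cpu", str(cpus), "--tblout", tblout, hmm_db, fasta]


# Hypothetical paths, for illustration only.
press = press_command("hmms/euNOG.hmm")
scan = scan_command("hmms/euNOG.hmm", "proteomes/arath/proteome.fasta",
                    "output_data/arath.tbl", cpus=4)
```

Each list can be passed to `subprocess.run(cmd, check=True)` on a machine where HMMER is installed.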

The tophit and nohit files are combined to create lookups for the orthology mass spec analysis
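
One way such a lookup could be assembled is sketched below. It assumes the tophit data has been reduced to a protein-to-group mapping and that each nohit protein is kept as its own single-member group; that second choice is an assumption made for illustration, not something this repository confirms. The names `build_lookup` and `group_members` are hypothetical.

```python
from typing import Dict, Iterable, List


def build_lookup(tophit: Dict[str, str], nohit: Iterable[str]) -> Dict[str, str]:
    """Map every protein to a group label.

    Proteins with a best hit map to that hit's group ID. Proteins with no
    hit map to their own ID (assumption: ungrouped proteins stay separate).
    """
    lookup = dict(tophit)
    for protein in nohit:
        lookup.setdefault(protein, protein)
    return lookup


def group_members(lookup: Dict[str, str]) -> Dict[str, List[str]]:
    """Invert the lookup: group label -> list of member proteins."""
    groups: Dict[str, List[str]] = {}
    for protein, group in lookup.items():
        groups.setdefault(group, []).append(protein)
    return groups
```

Keeping nohit proteins in the lookup means every protein in the proteome has a label, so downstream identifications are not dropped just because a protein has no orthologous group.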
