5 Scripts help page
Matteopaluh edited this page Nov 15, 2023
usage: set_kemet_working-directory.py [-h] [-k] [-u] [-G]
Base command for setting KEMET package working directory.
Create folders and instruction files; helper function to manage KEGG MODULE .kk database.
optional arguments:
-h, --help show this help message and exit
-k, --set_kk_DB
Choose this option to generate KEGG Module DB (.kk files),
in order to perform KEGG Modules Completeness evaluation.
Default: already generated
-u, --update_kk_DB
Choose this option to update already existing KEGG Module DB (.kk files).
-G, --gapfill_usage
Choose this option to create required folders for the GSMM Gapfilling,
follow-up of the HMM search procedures.
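A first-time setup could be assembled from the flags above as follows. This is a minimal sketch: the helper `build_setup_command` and its keyword arguments are hypothetical names used only for illustration, and it assumes the script is invoked through `python` from the KEMET directory.

```python
import shlex

# Flags come from the help text above; the helper itself is hypothetical
# and is not part of KEMET.
def build_setup_command(set_kk_db=False, update_kk_db=False, gapfill_usage=False):
    """Return the argv list for a set_kemet_working-directory.py run."""
    argv = ["python", "set_kemet_working-directory.py"]
    if set_kk_db:
        argv.append("--set_kk_DB")       # generate the KEGG Module DB (.kk files)
    if update_kk_db:
        argv.append("--update_kk_DB")    # update an already existing .kk DB
    if gapfill_usage:
        argv.append("--gapfill_usage")   # create the folders for GSMM gapfilling
    return argv


# Generate the .kk database and the gapfilling folders in one run.
cmd = build_setup_command(set_kk_db=True, gapfill_usage=True)
print(shlex.join(cmd))
```

The list form can be passed to `subprocess.run(cmd, check=True)` when the command should actually be executed.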
usage: add_taxonomy_from_gtdb-tk.py [-h] -i ADD_GENOMES_INSTRUCTION_FILE -t
ADD_GTDB_TO_NCBI_OUTPUT -f
{.fa,.fna,.fasta} [-v]
Add the necessary taxonomy information for the MAGs/Genomes of interest for KEMET HMM and GSMM analyses.
Use this after the GTDB-tk "gtdb_to_ncbi_majority_vote.py" script (that converts from GTDB taxonomy to NCBI),
to further convert to KEGG BRITE taxonomy.
This script will include info on the MAGs/Genomes indicated in the output file from the aforementioned script.
IMPORTANT:
The automatic taxonomy conversion has notable exceptions, such as "Candidate" phyla,
as well as other phyla lacking sufficient (3+) KEGG Organism representatives.
optional arguments:
-h, --help show this help message and exit
-i ADD_GENOMES_INSTRUCTION_FILE, --add_genomes_instruction_file ADD_GENOMES_INSTRUCTION_FILE
Include the relative path to KEMET "genomes.instruction" file.
-t ADD_GTDB_TO_NCBI_OUTPUT, --add_gtdb_to_ncbi_output ADD_GTDB_TO_NCBI_OUTPUT
Include the relative path to "gtdb_to_ncbi_majority_vote.py" output file.
-f {.fa,.fna,.fasta}, --fasta_extension {.fa,.fna,.fasta}
Complete "genomes.instruction" file names with the indicated extension.
-v, --verbose Print more information - for debug or log.
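The three required options can be combined as sketched below. The helper `build_taxonomy_command` is a hypothetical name, and the file name `gtdb_to_ncbi_output.tsv` is a placeholder rather than a name documented here; the only check performed is that the extension is one of the listed choices.

```python
import shlex

# Choices listed for -f/--fasta_extension in the help text above.
FASTA_EXTENSIONS = (".fa", ".fna", ".fasta")


def build_taxonomy_command(instruction_file, gtdb_to_ncbi_output,
                           fasta_extension, verbose=False):
    """Return the argv list for add_taxonomy_from_gtdb-tk.py (hypothetical helper)."""
    if fasta_extension not in FASTA_EXTENSIONS:
        raise ValueError(f"unsupported FASTA extension: {fasta_extension!r}")
    argv = [
        "python", "add_taxonomy_from_gtdb-tk.py",
        "-i", instruction_file,       # relative path to "genomes.instruction"
        "-t", gtdb_to_ncbi_output,    # output of gtdb_to_ncbi_majority_vote.py
        "-f", fasta_extension,
    ]
    if verbose:
        argv.append("-v")
    return argv


# Placeholder paths; replace them with the real relative paths.
cmd = build_taxonomy_command("genomes.instruction", "gtdb_to_ncbi_output.tsv", ".fna")
print(shlex.join(cmd))
```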
usage: kemet.py [-h] -a {eggnog,kaas,kofamkoala} [--update_taxonomy_codes]
[-I PATH_INPUT] [-k] [-n] [--skip_hmm]
[--hmm_mode {onebm,modules,kos}]
[--threshold_value THRESHOLD_VALUE] [--skip_nt_download]
[--skip_msa_and_hmmbuild] [--retry_nhmmer] [--skip_gsmm]
[--gsmm_mode {existing,denovo}] [-O PATH_OUTPUT] [-v] [-q]
[--log]
FASTA_file
KEMET - KEGG Module Evaluation Tool:
1) Evaluate KEGG Modules Completeness for given genomes.
2) HMM-based check for ortholog genes (KO) of interest after KEGG Module Completeness evaluation.
3) Genome-scale model gapfill with nucleotide HMM-derived evidence, for KOs of interest.
positional arguments:
FASTA_file Genome/MAG FASTA file as indicated in the "genomes.instruction" -
points to files (in "KEGG_annotations") comprising KO annotations, associated with each gene.
optional arguments:
-h, --help show this help message and exit
-a {eggnog,kaas,kofamkoala}, --annotation_format {eggnog,kaas,kofamkoala}
Format of KO_list.
eggnog: 1 gene | many possible annotations;
kaas: 1 gene | 1 annotation at most;
kofamkoala: 1 gene | many possible annotations
--update_taxonomy_codes
Update taxonomy filter codes - WHEN TO USE: after downloading a new BRITE taxonomy with "set_kemet_working-directory.py".
-I PATH_INPUT, --path_input PATH_INPUT
Absolute path to input file(s) FOLDER.
-k, --as_kegg Return KEGG-Mapper output for the Module Completeness evaluation.
-n, --no_genome Avoid checking for MAG/genome FASTA file and only use annotations for Modules Completeness evaluation.
--skip_hmm Skip HMM-driven search for KOs & stop after KEGG Modules Completeness evaluation.
--hmm_mode {onebm,modules,kos}
Choose the subset of KOs of interest for HMM-based check.
By default, the KOs already present in the functional annotation are not checked further.
onebm: search for KOs from KEGG Modules missing 1 block;
modules: search for KOs from the KEGG Modules indicated in the "module_file.instruction" file, 1 per line
(e.g. Mxxxxx);
kos: search for KOs indicated in the "ko_file.instruction" file, 1 per line
(e.g. Kxxxxx)
--threshold_value THRESHOLD_VALUE
Define a threshold for the corrected score resulting from HMM-hits, which is indicative of good quality.
--skip_nt_download Skip downloading KEGG KOs nt sequences.
--skip_msa_and_hmmbuild
Skip MAFFT and HMMER hmmbuild commands.
--retry_nhmmer Move HMM-files and re-run nHMMER command.
--skip_gsmm Skip GSMM operations, gapfill or de-novo model creation, & stop after HMM-driven search for KOs.
--gsmm_mode {existing,denovo}
Choose the methods of GSMM operation.
(This method won't be performed if "--hmm_mode kos" was chosen)
existing: use pre-existing CarveMe GSMM to add reactions content connected to HMM-derived KOs;
denovo: generate a new CarveMe GSMM, performing gene prediction and adding HMM-derived hits from the chosen HMM-mode.
-O PATH_OUTPUT, --path_output PATH_OUTPUT
Absolute path to output file(s) FOLDER.
-v, --verbose Print more information - for debug and progress.
-q, --quiet Silence soft-errors (for MAFFT and HMMER commands).
--log Store KEMET commands and progress during the execution in a log file.
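The options above can be combined into a single run as in the following sketch. The helper `build_kemet_command` is hypothetical, and `genome1.fasta` is a placeholder file name. Rejecting `--gsmm_mode` together with `--hmm_mode kos` is a choice made in this sketch, based on the note that the GSMM step is not performed in that case; it is not a claim about how kemet.py itself reacts.

```python
import shlex

# Choice lists copied from the kemet.py usage text above.
ANNOTATION_FORMATS = ("eggnog", "kaas", "kofamkoala")
HMM_MODES = ("onebm", "modules", "kos")
GSMM_MODES = ("existing", "denovo")


def build_kemet_command(fasta_file, annotation_format, hmm_mode=None,
                        gsmm_mode=None, skip_hmm=False, skip_gsmm=False,
                        verbose=False, log=False):
    """Return the argv list for a kemet.py run (hypothetical helper, not part of KEMET)."""
    if annotation_format not in ANNOTATION_FORMATS:
        raise ValueError(f"unknown annotation format: {annotation_format!r}")
    if hmm_mode is not None and hmm_mode not in HMM_MODES:
        raise ValueError(f"unknown HMM mode: {hmm_mode!r}")
    if gsmm_mode is not None and gsmm_mode not in GSMM_MODES:
        raise ValueError(f"unknown GSMM mode: {gsmm_mode!r}")
    # Sketch-level choice: fail early, since the help text states that the
    # GSMM step is not performed when "--hmm_mode kos" is chosen.
    if hmm_mode == "kos" and gsmm_mode is not None:
        raise ValueError("--gsmm_mode is not performed with --hmm_mode kos")

    argv = ["python", "kemet.py", fasta_file, "-a", annotation_format]
    if skip_hmm:
        argv.append("--skip_hmm")
    if hmm_mode is not None:
        argv += ["--hmm_mode", hmm_mode]
    if skip_gsmm:
        argv.append("--skip_gsmm")
    if gsmm_mode is not None:
        argv += ["--gsmm_mode", gsmm_mode]
    if verbose:
        argv.append("-v")
    if log:
        argv.append("--log")
    return argv


# Module completeness plus an HMM check on Modules missing one block,
# stopping before the GSMM step and writing a log file.
cmd = build_kemet_command("genome1.fasta", "eggnog", hmm_mode="onebm",
                          skip_gsmm=True, log=True)
print(shlex.join(cmd))
```

Building the command as a list keeps each argument separate, so paths containing spaces need no manual quoting when the list is handed to `subprocess.run`.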