4. Running MetaCarvel

Jay Ghurye edited this page Nov 13, 2018 · 1 revision

Once you have prepared the input data, you can run MetaCarvel. It accepts the following options:

python run.py -h
usage: run.py [-h] -a ASSEMBLY -m MAPPING -d DIR [-r REPEATS] [-k KEEP]
              [-l LENGTH] [-b BSIZE] [-v VISUALIZATION]

MetaCarvel: A scaffolding tool for metagenomic assemblies

optional arguments:
  -h, --help            show this help message and exit
  -a ASSEMBLY, --assembly ASSEMBLY
                        assembled contigs
  -m MAPPING, --mapping MAPPING
                        mapping of read to contigs in bam format
  -d DIR, --dir DIR     output directory for results
  -r REPEATS, --repeats REPEATS
                        To turn repeat detection on
  -k KEEP, --keep KEEP  Set this to keep temporary files in output directory
  -l LENGTH, --length LENGTH
                        Minimum length of contigs to consider for scaffolding
                        in base pairs (bp)
  -b BSIZE, --bsize BSIZE
                        Minimum mate pair support between contigs to consider
                        for scaffolding
  -v VISUALIZATION, --visualization VISUALIZATION
                        To generate .db file for AsmViz visualization program

In the simplest mode, you provide only the assembled contigs, the read mapping, and an output directory:

python run.py -a contigs.fasta -m alignment.bam -d output 

There are multiple options you can tune for your data. For example, if you want MetaCarvel to find repeats while scaffolding, run it as follows:

python run.py -a contigs.fasta -m alignment.bam -d output -r true

To test whether the software is installed correctly, you can download the test data. Use its contigs.fasta and alignment.bam files to run MetaCarvel and check that it works. Running with these files should not take more than 5 minutes. Before running the test dataset, please make sure that samtools and bedtools are accessible and that you have an appropriate version of NetworkX. If the code fails to run for any reason, we recommend removing the output folder and rerunning from scratch.
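The prerequisites above can be checked before a test run with a small script. This is a minimal sketch, not part of MetaCarvel: the function name `missing_prerequisites` is hypothetical, and the check only confirms that samtools and bedtools are on the PATH and that NetworkX is importable. It does not verify that the installed NetworkX version is an appropriate one.

```python
import importlib.util
import shutil


def missing_prerequisites(tools=("samtools", "bedtools"), modules=("networkx",)):
    """Return the names of required tools and modules that cannot be found.

    Hypothetical helper for illustration; it is not shipped with MetaCarvel.
    """
    missing = []
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            missing.append(tool)
    for module in modules:
        # find_spec returns None when a top-level module is not importable.
        if importlib.util.find_spec(module) is None:
            missing.append(module)
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    if problems:
        print("Missing prerequisites: " + ", ".join(problems))
    else:
        print("samtools, bedtools and NetworkX were found.")
```

If the script reports anything as missing, install it or fix your PATH before running the test dataset.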

To set the minimum mate pair support required for each link in the graph, use the -b option. By default, we use 3 mate pairs to construct bundles of links. To keep the intermediate files generated by MetaCarvel, set -k true. To set the minimum contig size (in base pairs) considered for scaffolding, use the -l option. To generate the visualization file from the MetaCarvel graphs, run the following command:

python run.py -a contigs.fasta -m alignment.bam -d output -r true -k true -v true

The -v true option runs MetagenomeScope after MetaCarvel has finished generating scaffolds.
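If you launch MetaCarvel from a script, the options described on this page can be assembled programmatically. The sketch below is one possible way to do that, under some assumptions: `build_command` and `run_from_scratch` are hypothetical helper names, run.py is assumed to be in the current working directory, and `python` is assumed to be the interpreter you want to use. Only flags listed in the help output above are emitted.

```python
import shutil
import subprocess


def build_command(assembly, mapping, out_dir, repeats=False, keep=False,
                  min_length=None, min_support=None, visualization=False):
    """Assemble the argument list for run.py (hypothetical helper).

    Assumes run.py is in the current working directory.
    """
    cmd = ["python", "run.py",
           "-a", str(assembly), "-m", str(mapping), "-d", str(out_dir)]
    if repeats:
        cmd += ["-r", "true"]            # turn repeat detection on
    if keep:
        cmd += ["-k", "true"]            # keep temporary files
    if min_length is not None:
        cmd += ["-l", str(min_length)]   # minimum contig length in bp
    if min_support is not None:
        cmd += ["-b", str(min_support)]  # minimum mate pair support
    if visualization:
        cmd += ["-v", "true"]            # generate the visualization file
    return cmd


def run_from_scratch(cmd, out_dir):
    """Remove a leftover output folder, then run the command."""
    shutil.rmtree(out_dir, ignore_errors=True)
    return subprocess.run(cmd, check=True)
```

For example, `build_command("contigs.fasta", "alignment.bam", "output", repeats=True, keep=True, visualization=True)` produces the same arguments as the command shown above, and `run_from_scratch` follows the recommendation to delete the output folder before rerunning after a failure.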