Phelim Bradley edited this page Jan 20, 2016 · 2 revisions

McCortex should never output a graph with "dangling edges" (edges to kmers that do not exist in the graph).

List kmers in FASTA format

mccortex31 view -q -k kmers.ctx | awk 'BEGIN{k=0}{print ">k"k; print $1; k+=1}'

Example output:

>k0
ACAGGATCTCAACCCACAGACTGCGGAGGCT
>k1
AGTTCGGTAGCTCCAATCATTGCGAGGTTAG
>k2
CCTTTCAGGGCGGCAAGCTACGGTTACCTGA
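The awk step can be tried without a graph file. The sketch below assumes, as the command above implies, that each line printed by `mccortex31 view -q -k` starts with the kmer as its first whitespace-separated field. The function name `kmers_to_fasta` and the sample input lines are hypothetical, made up for illustration.

```shell
# Hypothetical helper: wrap the first field of each input line in a FASTA record
# named >k0, >k1, ... (NR is awk's 1-based line counter).
kmers_to_fasta() {
  awk '{ print ">k" (NR-1); print $1 }'
}

# Stand-in for the output of `mccortex31 view -q -k kmers.ctx`:
# kmer in column one, the trailing numbers are made-up extra fields.
printf 'ACAGGATCTCAACCCACAGACTGCGGAGGCT 3\nAGTTCGGTAGCTCCAATCATTGCGAGGTTAG 1\n' | kmers_to_fasta
```

Using `NR-1` gives the same numbering as the explicit counter `k` in the one-liner above.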

Select a subgraph

Get all kmers within 10 kmers of the kmers specified in kmers.to.remove.fa:

mccortex31 subgraph --out out.ctx --dist 10 --seq kmers.to.remove.fa in.ctx

Remove given kmers from a graph

Remove the kmers listed in kmers.to.remove.fa from in.ctx and write the result to out.ctx:

mccortex31 subgraph --out out.ctx --invert --seq kmers.to.remove.fa in.ctx

If the kmers to remove are themselves in a graph rather than FASTA/FASTQ/BAM format, you can extract them like so:

mccortex31 view -q -k kmers.ctx | awk '{print $1}' | mccortex31 subgraph --out out.ctx --invert --seq - in.ctx
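At the level of kmer sets, removal is a set difference. The sketch below mimics that on plain-text kmer lists (one kmer per line), which may help when cross-checking a result by hand. It is not how McCortex works internally: it ignores edges and coverage, and it assumes both lists spell each kmer in the same orientation. The helper name `remove_kmers` and the toy 3-mers are hypothetical.

```shell
# Hypothetical helper: print kmers from list $1 that do not appear in list $2.
# -F fixed strings, -x whole-line match, -v invert, -f read patterns from file.
remove_kmers() {
  grep -Fxv -f "$2" "$1"
}

tmp=$(mktemp -d)
printf 'AAA\nCCC\nGGG\n' > "$tmp/in.txt"      # toy kmers standing in for in.ctx
printf 'CCC\n' > "$tmp/remove.txt"            # toy removal list
remove_kmers "$tmp/in.txt" "$tmp/remove.txt"  # prints AAA then GGG
rm -r "$tmp"
```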

Get the intersection of multiple graphs

Get the kmers in in.ctx that are also in both a.ctx and b.ctx, and write them to out.ctx:

mccortex31 join --out out.ctx --intersect a.ctx --intersect b.ctx in.ctx

Coverage/colour information from a.ctx and b.ctx will not be merged into the output file out.ctx.
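The same selection can be mimicked on plain-text kmer lists (one kmer per line, for example the first column of `mccortex31 view -q -k`) as a rough sanity check. This is only a sketch under those assumptions: the helper name `intersect_kmers` and the toy 3-mers are hypothetical, and the comparison assumes all three lists spell each kmer in the same orientation.

```shell
# Hypothetical helper: print kmers from list $1 that also appear in $2 and in $3.
# Only lines of $1 are ever printed, so nothing from $2 or $3 is merged in.
intersect_kmers() {
  grep -Fx -f "$2" "$1" | grep -Fx -f "$3"
}

tmp=$(mktemp -d)
printf 'AAA\nCCC\nGGG\n' > "$tmp/in.txt"  # toy kmers standing in for in.ctx
printf 'AAA\nCCC\n' > "$tmp/a.txt"        # standing in for a.ctx
printf 'CCC\nGGG\n' > "$tmp/b.txt"        # standing in for b.ctx
intersect_kmers "$tmp/in.txt" "$tmp/a.txt" "$tmp/b.txt"  # prints CCC
rm -r "$tmp"
```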
