Skip to content

Latest commit

 

History

History
27 lines (20 loc) · 1.22 KB

0.02_PGGB_wfmash.md

File metadata and controls

27 lines (20 loc) · 1.22 KB

Pangenome: PGGB wfmash

Tags: #pangenome #pggb #wfmash 🏠 home


[! info] Purpose All the genomes will be aligned against each other using wfmash. The resulting all-vs-all alignments are used afterwards by seqwish to build a pangenome. Since we have phased genomes presenting a high level of heterozygosity, the haplotypes are separated in distinct genomes.

https://github.com/pangenome/pggb https://github.com/waveygang/wfmash

The query is split by chromosome while the reference is an entire haplotype.

# prepare commands
for query in $(ls seq/query*fasta); do q_name=$(basename $query .fasta); for ref in $(ls seq/ref*fasta); do r_name=$(basename $ref .fasta); if [ "$q_name" == "$r_name" ]; then echo ""; else echo "/usr/bin/time -v -o logs/${q_name}.on.${r_name}.wfmash_s10000p85n1.timelog wfmash -t 1 -p 85 -s 10000 -n 1 $ref $query > wfmash/${q_name}.on.${r_name}.wfmash_s10000p85n1.paf 2> logs/${q_name}.on.${r_name}.wfmash_s10000p85n1.err"; fi; done; done > wfmash_commands.sh

# remove empty lines
sed -i '/^$/d' wfmash_commands.sh

# run
parallel -j 16 :::: wfmash_commands.sh

# merge
cat all*paf > all.on.all.wfmash_s10000p85n1.paf

home | Next Step -> seqwish