Tags: #pangenome #pggb #wfmash 🏠 home
[! info] Purpose All the genomes will be aligned against each other using wfmash. The resulting all-vs-all alignments are used afterwards by seqwish to build a pangenome. Since we have phased genomes presenting a high level of heterozygosity, the haplotypes are separated in distinct genomes.
https://github.com/pangenome/pggb https://github.com/waveygang/wfmash
The query is split by chromosome while the reference is an entire haplotype.
# prepare commands
for query in $(ls seq/query*fasta); do q_name=$(basename $query .fasta); for ref in $(ls seq/ref*fasta); do r_name=$(basename $ref .fasta); if [ "$q_name" == "$r_name" ]; then echo ""; else echo "/usr/bin/time -v -o logs/${q_name}.on.${r_name}.wfmash_s10000p85n1.timelog wfmash -t 1 -p 85 -s 10000 -n 1 $ref $query > wfmash/${q_name}.on.${r_name}.wfmash_s10000p85n1.paf 2> logs/${q_name}.on.${r_name}.wfmash_s10000p85n1.err"; fi; done; done > wfmash_commands.sh
# remove empty lines
sed -i '/^$/d' wfmash_commands.sh
# run
parallel -j 16 :::: wfmash_commands.sh
# merge
cat all*paf > all.on.all.wfmash_s10000p85n1.paf