This version can now handle all ChewBBACA outputs.
- Fixed a bug where `INF-xxx` became `-xxx`, but subsequent alleles didn't include `INF-` and were called as `xxx`, not `-xxx`.
- Set all non-numeric alleles to `0`.

The above is implemented by:
- specifically replacing `PLOT3` and `PLOT5` with spaces
- removing all other `A-Z` chars from the input line
- taking the `abs()` of all inferred alleles

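
The three steps above can be sketched roughly as follows. This is an illustrative Python sketch, not the tool's actual code: the function names `normalise_alleles`, `parse_allele` and `parse_profile` are hypothetical, and it assumes a tab-separated row whose first column is the sample ID. Replacing `PLOT3` and `PLOT5` before stripping letters presumably avoids leaving a stray `3` or `5` that would look like an allele number.

```python
import re


def normalise_alleles(line: str) -> str:
    """Apply the line-level clean-up to the allele columns of one row."""
    # Step 1: blank out PLOT3/PLOT5 entirely, so their digits do not survive.
    line = line.replace("PLOT3", " " * 5).replace("PLOT5", " " * 5)
    # Step 2: drop every remaining upper-case letter, e.g. "INF-12" -> "-12".
    return re.sub(r"[A-Z]", "", line)


def parse_allele(field: str) -> int:
    """Convert one cleaned field to an allele number (0 if non-numeric)."""
    try:
        # Step 3: abs() maps the "-12" left over from "INF-12" to 12.
        return abs(int(field.strip()))
    except ValueError:
        # Empty or otherwise non-numeric fields become allele 0.
        return 0


def parse_profile(row: str):
    """Split a row into (sample_id, allele_numbers).

    Assumption: tab-separated columns, sample ID first. The ID is split
    off before cleaning so that its letters are left untouched.
    """
    sample, *fields = row.rstrip("\n").split("\t")
    cleaned = normalise_alleles("\t".join(fields))
    return sample, [parse_allele(f) for f in cleaned.split("\t")]


if __name__ == "__main__":
    print(parse_profile("S1\tINF-12\t12\tLNF\tPLOT3\tNIPH\t7"))
    # -> ('S1', [12, 12, 0, 0, 0, 7])
```

In this example `INF-12` and `12` end up as the same allele, while the non-numeric calls (`LNF`, `PLOT3`, `NIPH`, used here only as sample labels) all become `0`.
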
For a data set of 14319 samples x 3016 alleles it runs in 12 minutes on a single thread of a six-year-old Xeon.
This is ~120,000 profile/vector comparisons per second!