I periodically need to compare two FASTA files, which always turns out to be much harder than it should be. Neither the order of sequences in a FASTA file nor the formatted line length matters, so that rules out using 'cmp' and 'diff'.
If only the identifiers matter, one can do something like this:
cmp <(grep '>' file1.fa | sort) <(grep '>' file2.fa | sort)
I don't know of an easy way to also compare the sequences without getting a ton of spurious differences. People often suggest Perl or Python hackery, but I haven't found a tool that does what I want out of the box.
I'm mostly interested in finding out whether there are differences and which sequences they occur in. Seeing the actual differences is less important for my application.
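To make the request concrete, here is a minimal sketch of the kind of comparison I mean, using only the Python standard library. It is not an existing seqmagick feature, and the names `read_fasta` and `compare_fasta` are hypothetical. It assumes the identifier is the first whitespace-delimited token of each header line, that identifiers are unique within a file (a repeated identifier overwrites the earlier record), and that sequences are compared exactly, including case.

```python
def read_fasta(handle):
    """Return {identifier: sequence}, with line wrapping removed."""
    records = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Store the record that just ended before starting a new one.
            if name is not None:
                records[name] = "".join(chunks)
            # Assumption: the identifier is the first token of the header.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            # Joining the chunks later makes the line length irrelevant.
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def compare_fasta(handle_a, handle_b):
    """Return (only_in_a, only_in_b, changed) as sorted identifier lists."""
    a = read_fasta(handle_a)
    b = read_fasta(handle_b)
    # Dictionary lookups make the record order in each file irrelevant.
    only_a = sorted(a.keys() - b.keys())
    only_b = sorted(b.keys() - a.keys())
    changed = sorted(k for k in a.keys() & b.keys() if a[k] != b[k])
    return only_a, only_b, changed
```

Called on two open files, e.g. `compare_fasta(open("file1.fa"), open("file2.fa"))`, all three lists come back empty when the files hold the same records, and otherwise they name the sequences that are missing or differ without printing the differences themselves.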
seqmagick seems like it might be a good place to put such a comparison tool. What do others think?
I'm happy to implement it if this makes sense.