id2taxid mapping file format #766

Answered by bbuchfink
amirkarger asked this question in Q&A
Does diamond care about the accession column, the accession.version column, or both?

Only accession.version

If I have IDs that have periods in them that aren't related to a version, do I need to remove the period? What if the stuff after the period isn't a number? What if the IDs for two proteins are the same before the period, like abc.a1 and abc.a2?

It will always ignore everything after the last dot unless you use --no-parse-seqids.
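To make the rule above concrete, here is a minimal Python sketch of "ignore everything after the last dot". It only illustrates the behavior described in this answer and is not DIAMOND's actual code; the function name `strip_after_last_dot` and the sample IDs (including the accession-like `WP_012345678.1`) are hypothetical.

```python
from collections import defaultdict


def strip_after_last_dot(seq_id: str) -> str:
    """Drop the last '.' and everything after it; IDs without a dot are unchanged."""
    head, sep, _tail = seq_id.rpartition(".")
    return head if sep else seq_id


# Hypothetical sample IDs, chosen only to illustrate the rule.
ids = ["WP_012345678.1", "abc.a1", "abc.a2", "no_dot_id", "a.b.3"]
keys = {seq_id: strip_after_last_dot(seq_id) for seq_id in ids}

# Group the original IDs by their truncated key to spot collisions.
groups = defaultdict(list)
for seq_id, key in keys.items():
    groups[key].append(seq_id)
collisions = {key: members for key, members in groups.items() if len(members) > 1}

for seq_id, key in keys.items():
    print(f"{seq_id} -> {key}")
print("collisions:", collisions)
```

Under this rule, `abc.a1` and `abc.a2` both reduce to `abc`, so they would be indistinguishable in the mapping unless --no-parse-seqids is used.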

Are there changes I need to make to FASTA headers too?

Probably not. For your use case, it should be easier with --no-parse-seqids.

I forgot to add that using --no-parse-seqids didn't help, and in fact stopped the one single species that was mapping (…
