Possible bug with taxonkit filter --black-list #37

Closed
4 tasks done
standage opened this issue Jan 22, 2021 · 12 comments

Comments

@standage

When a user specifies a rank or a (comma-separated?) list of ranks for --black-list, these should be excluded from the output, correct? I have tried the following example several times with different ranks, and I get the same error message every time.

$ echo 349741 | taxonkit lineage -t | cut -f 3 | sed 's/;/\n/g' > taxids2.txt
$ cat taxids2.txt | taxonkit filter -B Family 
23:40:47.905 [ERRO] rank order not defined in rank file: no rank

Is this a bug, or am I misunderstanding this flag?


Prerequisites

  • make sure you're using the latest version by taxonkit version
  • read the usage

Describe your issue

  • describe the problem
  • provide a reproducible example
@shenwei356
Owner

You have to keep the default values in the black list. I should have made that clear.

cat taxids2.txt | taxonkit filter -B Family -B "no rank,clade"
2
74201
203494
48461
1647988
239934
239935
349741

@standage
Author

standage commented Jan 22, 2021

For now, the Python bindings add "no rank" and "clade" to the blacklist automatically, and it works ok.

But if it is required to leave the default values in the blacklist, maybe the --black-list flag should append to the default list instead of replace it?
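
The difference between the two behaviours can be sketched in a few lines of Python. This is an illustration only, not taxonkit's code: `DEFAULT_BLACK_LIST`, `replace_black_list`, and `append_black_list` are hypothetical names, and the default values are assumed from the workaround command above (`-B "no rank,clade"`).

```python
# Illustrative sketch only; not taxonkit's actual implementation.
# Default values assumed from the workaround: -B "no rank,clade".
DEFAULT_BLACK_LIST = ("no rank", "clade")


def replace_black_list(user_ranks):
    """Behaviour described in this issue: user values replace the defaults."""
    return list(user_ranks)


def append_black_list(user_ranks):
    """Proposed behaviour: user values are added to the defaults."""
    merged = list(DEFAULT_BLACK_LIST)
    for rank in user_ranks:
        if rank not in merged:  # skip values already present, e.g. "clade"
            merged.append(rank)
    return merged


print(replace_black_list(["family"]))  # ['family']
print(append_black_list(["family"]))   # ['no rank', 'clade', 'family']
```

With the replace behaviour, the caller has to repeat the defaults on every call, which is what the Python bindings currently do.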

@shenwei356
Owner

shenwei356 commented Jan 22, 2021

I read the code again and fixed the logic.

"no rank" and "clade" are already defined as ranks with no order in ranks.txt, and they can optionally be removed via -N/--discard-noranks. -B/--black-list can be used to add more ranks to discard; it can also include "no rank".

  -B, --black-list strings   black list of ranks to discard, e.g., '"no rank", "clade"'
  -N, --discard-noranks      discard ranks without order, type "taxonkit filter --help" for details

The above command should be:

cat taxids2.txt | taxonkit filter -N -B Family

  1. Flag -L/--lower-than and -H/--higher-than are exclusive, and can be
     used along with -E/--equal-to which values can be different.
  2. A list of pre-ordered ranks is in ~/.taxonkit/ranks.txt, you can use
     your list by -r/--rank-file, the format specification is below.
  3. All ranks in taxonomy database should be defined in rank file.
  4. TaxIDs with no rank can be optionally discarded by -N/--discard-noranks.
  5. Further ranks can be removed with black list via -B/--black-list.
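
Notes 4 and 5 can be read as two independent, optional filters. Below is a minimal Python sketch of that reading, not taxonkit's implementation. The TaxIDs, `RANK_ORDER`, `TAXID_RANK`, and `filter_taxids` are all hypothetical, and the assumption that unordered ranks pass through when -N is not given is mine.

```python
# Illustrative sketch only; not taxonkit's implementation.
# Ranks with a defined order (toy subset); a higher number is a higher rank.
RANK_ORDER = {"species": 1, "genus": 2, "family": 3, "order": 4}

# Hypothetical TaxIDs mapped to ranks; "no rank" and "clade" have no order.
TAXID_RANK = {
    101: "no rank",
    102: "order",
    103: "family",
    104: "clade",
    105: "genus",
    106: "species",
}


def filter_taxids(taxids, discard_noranks=False, black_list=()):
    """Keep TaxIDs whose rank survives both optional filters."""
    kept = []
    for taxid in taxids:
        rank = TAXID_RANK[taxid]
        if rank in black_list:
            continue  # analogous to -B/--black-list
        if rank not in RANK_ORDER and discard_noranks:
            continue  # analogous to -N/--discard-noranks
        # Assumption: without -N, ranks that have no order are kept.
        kept.append(taxid)
    return kept


taxids = list(TAXID_RANK)
print(filter_taxids(taxids, black_list=["family"]))
# [101, 102, 104, 105, 106]
print(filter_taxids(taxids, discard_noranks=True, black_list=["family"]))
# [102, 105, 106]
```

In this reading, -B family on its own never needs to look up an order for "no rank", so the two flags do not depend on each other.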

@shenwei356
Owner

@standage
Author

Oh, ok. So if I want to specify -B family then the -N flag is required?

@shenwei356
Owner

Not required, it's optional. -N is just for removing "no rank" and "clade".

@standage
Author

If it's optional, why did my original command fail?

@shenwei356
Owner

shenwei356 commented Jan 22, 2021

There was a bug, it's fixed now.

72d438b#diff-8e2def025044548b1e3afb01de909c09078d4e6e4b84b9efc8c56a73b6434b34L210

@standage
Author

standage commented Jan 22, 2021

Oh ok. 🤓

I will test with the latest binary you posted.

@standage
Author

Is it easy for you to create a Darwin AMD binary? Don't worry if it's inconvenient.

@shenwei356
Owner

Oh, I wrongly uploaded arm64 binaries...

@standage
Author

Ok, I understand now. And I confirmed that my original command works. Thank you!
