Releases: shenwei356/taxonkit

TaxonKit v0.12.0-alpha

31 May 14:13
Pre-release

Changes

  • taxonkit create-taxdump:
    • accepts arbitrary ranks #60
    • better handling of taxa with the same name.
    • many flags changed.

TaxonKit v0.11.1

20 May 06:42

Changes

TaxonKit v0.11.0

16 May 02:45
  • TaxonKit v0.11.0 Github Releases (by Release)
    • new command taxonkit create-taxdump: Create NCBI-style taxdump files for custom taxonomy, e.g., GTDB and ICTV. #56

v0.11.0-alpha

21 Apr 05:53
Pre-release

Changes

  • new command taxonkit create-taxdump: Create NCBI-style taxdump files for custom taxonomy, e.g., GTDB. #56

Usage:

Create NCBI-style taxdump files for custom taxonomy, e.g., GTDB

Input format: 
  0. For GTDB taxonomy file, just use --gtdb
  1. The input file should be tab-delimited
  2. At least one column is needed, please specify the field index:
     1) Kingdom/Superkingdom/Domain,     -K/--field-kingdom
     2) Phylum,                          -P/--field-phylum
     3) Class,                           -C/--field-class
     4) Order,                           -O/--field-order
     5) Family,                          -F/--field-family
     6) Genus,                           -G/--field-genus
     7) Species (needed),                -S/--field-species
     8) Subspecies,                      -T/--field-subspecies
        For GTDB, we use the assembly accession (without version number).
  3. The column containing the genome/assembly accession is recommended to
     generate TaxId mapping file (taxid.map, id -> taxid).
     -A/--field-accession,    field containing genome/assembly accession
     --field-accession-re,    regular expression to extract the accession 
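
The input layout described above can be illustrated with a minimal Python sketch. It is not TaxonKit's implementation: the function name `parse_gtdb_line` and the prefix table are assumptions made for illustration, and the sample line is the GTDB record quoted in this help text. It only shows how a tab-delimited GTDB line splits into an accession column and per-rank names.

```python
# Rank prefixes used in GTDB lineage strings (d__, p__, ..., s__).
# The rank names follow the defaults listed for --rank-names.
GTDB_PREFIXES = {
    "d": "superkingdom", "p": "phylum", "c": "class", "o": "order",
    "f": "family", "g": "genus", "s": "species",
}

def parse_gtdb_line(line):
    """Split one GTDB taxonomy line into (accession, {rank: name}).

    Hypothetical helper, not part of TaxonKit.
    """
    accession, lineage = line.rstrip("\n").split("\t", 1)
    ranks = {}
    for item in lineage.split(";"):
        prefix, _, name = item.partition("__")
        ranks[GTDB_PREFIXES[prefix]] = name
    return accession, ranks

line = ("GB_GCA_018897955.1\t"
        "d__Archaea;p__EX4484-52;c__EX4484-52;o__EX4484-52;"
        "f__LFW-46;g__LFW-46;s__LFW-46 sp018897155")
accession, ranks = parse_gtdb_line(line)
print(accession, ranks["species"])
```

For non-GTDB input, the same idea applies with one column per rank, each selected by its field-index flag.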

Attention:
  1. Names should be distinct among taxa of different ranks.
     But for lineages missing some taxon nodes, using the names of parent nodes is allowed:

       GB_GCA_018897955.1      d__Archaea;p__EX4484-52;c__EX4484-52;o__EX4484-52;f__LFW-46;g__LFW-46;s__LFW-46 sp018897155

     It can also detect duplicate names at different ranks. In the example below,
     the Class and Genus have the same name B47-G6, and the Order and Family between them have different names.
     In this case, a new TaxId is assigned by incrementing the TaxId until it is distinct.

       GB_GCA_003663585.1      d__Archaea;p__Thermoplasmatota;c__B47-G6;o__B47-G6B;f__47-G6;g__B47-G6;s__B47-G6 sp003663585
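
The "increase the TaxId until it is distinct" rule can be sketched as follows. The release notes do not say how the initial TaxId is derived, so `initial_taxid` (a CRC32 of the name) is purely a hypothetical stand-in, as are the function names; only the increment-on-collision step reflects the text above.

```python
import zlib

def initial_taxid(name):
    # Hypothetical stand-in: derive a candidate TaxId from the name alone.
    # The real derivation used by TaxonKit is not described here.
    return zlib.crc32(name.encode("utf-8"))

def assign_taxid(rank, name, assigned):
    """Return a TaxId for (rank, name), incrementing it until it is unused."""
    taxid = initial_taxid(name)
    # The same name at a different rank yields the same candidate, so step
    # forward until the slot is free or already belongs to this (rank, name).
    while taxid in assigned and assigned[taxid] != (rank, name):
        taxid += 1
    assigned[taxid] = (rank, name)
    return taxid

assigned = {}
class_id = assign_taxid("class", "B47-G6", assigned)
genus_id = assign_taxid("genus", "B47-G6", assigned)
again = assign_taxid("genus", "B47-G6", assigned)  # stable on repeat
print(class_id, genus_id, again)
```

Looking up the same (rank, name) pair again returns the TaxId it was given earlier, so the assignment stays consistent across lines of input.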

Usage:
  taxonkit create-taxdump [flags] 

Flags:
  -A, --field-accession int         field index of assembly accession (genome ID), for outputting taxid.map
      --field-accession-re string   regular expression to extract assembly accession (default
                                    "^\\w\\w_(.+)$")
  -C, --field-class int             field index of class
  -F, --field-family int            field index of family
  -G, --field-genus int             field index of genus
  -K, --field-kingdom int           field index of kingdom
  -O, --field-order int             field index of order
  -P, --field-phylum int            field index of phylum
  -S, --field-species int           field index of species (needed)
  -T, --field-subspecies int        field index of subspecies
      --force                       overwrite existing output directory
      --gtdb                        input files are GTDB taxonomy files
      --gtdb-re-subs string         regular expression to extract assembly accession as the subspecies
                                    (default "^\\w\\w_GC[AF]_(.+)\\.\\d+$")
  -h, --help                        help for create-taxdump
      --line-chunk-size int         number of lines to process for each thread, and 4 threads is fast
                                    enough. (default 5000)
      --null strings                null value of taxa (default [,NULL,NA])
  -x, --old-taxdump-dir string      taxdump directory of older version
      --out-dir string              output directory
      --rank-names strings          names of the 8 ranks, order matters (default
                                    [superkingdom,phylum,class,order,family,genus,species,no rank])
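
To see what the two default regular expressions listed above capture, here is a small Python sketch. The doubled backslashes in the help text are string escaping, so the patterns themselves are `^\w\w_(.+)$` and `^\w\w_GC[AF]_(.+)\.\d+$`. TaxonKit is written in Go, so using Python's `re` module is an assumption; for patterns this simple the two engines should behave the same. The variable names are illustrative only.

```python
import re

# Default of --field-accession-re: drop the two-letter source prefix.
ACCESSION_RE = re.compile(r"^\w\w_(.+)$")
# Default of --gtdb-re-subs: keep only the numeric part, without the version.
SUBSPECIES_RE = re.compile(r"^\w\w_GC[AF]_(.+)\.\d+$")

genome_id = "GB_GCA_018897955.1"
accession = ACCESSION_RE.match(genome_id).group(1)
subspecies = SUBSPECIES_RE.match(genome_id).group(1)
print(accession)   # GCA_018897955.1
print(subspecies)  # 018897955
```

The first capture is what would be written to taxid.map as the genome ID, and the second is the assembly accession without version number that is used at the subspecies level for GTDB.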

TaxonKit v0.10.1

25 Feb 08:35

Changes

  • TaxonKit v0.10.1 Github Releases (by Release)
    • taxonkit cami2-filter: fix option --show-rank which did not work in v0.10.0.

TaxonKit v0.10.0

22 Feb 02:49

Changes

  • TaxonKit v0.10.0 Github Releases (by Release)

    • new command taxonkit cami2-filter: Remove taxa of given TaxIds and their descendants in CAMI metagenomic profile
    • taxonkit reformat: fix panic for deleted TaxIds when using -F/--fill-miss-rank. #55
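
The idea behind removing given TaxIds and their descendants can be sketched in Python as below. This is a simplified illustration, not the cami2-filter implementation: the child-to-parent map, the toy TaxIds, the abundance table, and the function name `descendants_and_self` are all hypothetical, and a real CAMI profile carries more columns than a TaxId and a percentage.

```python
def descendants_and_self(taxids, parent_of):
    """Return the given TaxIds plus every TaxId below them in the tree."""
    # Invert the child -> parent map so the tree can be walked downwards.
    children = {}
    for child, parent in parent_of.items():
        children.setdefault(parent, []).append(child)
    removed, stack = set(), list(taxids)
    while stack:
        node = stack.pop()
        if node in removed:
            continue
        removed.add(node)
        stack.extend(children.get(node, []))
    return removed

# Toy tree (hypothetical TaxIds): 1 -> 10 -> {100, 101}, 1 -> 20 -> 200
parent_of = {10: 1, 20: 1, 100: 10, 101: 10, 200: 20}
profile = {100: 40.0, 101: 10.0, 200: 50.0}  # TaxId -> abundance (%)

removed = descendants_and_self([10], parent_of)
filtered = {t: a for t, a in profile.items() if t not in removed}
print(sorted(removed), filtered)
```

The sketch leaves the remaining abundances untouched; whether and how the real command rescales them is not stated in these notes.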

TaxonKit v0.9.0

01 Dec 05:32

Changes

  • TaxonKit v0.9.0 Github Releases (by Release)
    • new command taxonkit profile2cami: converting metagenomic profile table to CAMI format

TaxonKit v0.8.0

09 Apr 12:07

Changes

  • TaxonKit v0.8.0 Github Releases (by Release)
    • taxonkit reformat:
      • accept input of TaxIds via flag -I/--taxid-field.
      • accept single taxonomy names.
      • show warning message for TaxIds with the same lineage. #42
      • better flag checking. #40
    • taxonkit lca:
      • slight speedup.
    • taxonkit genautocomplete:
      • support bash|zsh|fish|powershell

TaxonKit v0.7.2

26 Jan 14:47

Changelog

  • TaxonKit v0.7.2 Github Releases (by Release)
    • taxonkit lineage:
      • new flag -R/--show-lineage-ranks for appending ranks of all levels.
      • reduce memory usage and slightly speed up.
    • taxonkit filter:
      • flag -E/--equal-to supports multiple values.
      • new flag -n/--save-predictable-norank: when using -L, do not discard some special ranks without order if the rank of the closest higher node is still lower than the rank cutoff.
    • taxonkit reformat:
      • new placeholders: {t} for subspecies/strain, {T} for strain. Thanks @wqssf102 for feedback.
      • new flag -S/--pseudo-strain for using the node with the lowest rank as the strain name, only if its rank is lower than "species" and is neither "subspecies" nor "strain".
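
The rule stated for -S/--pseudo-strain can be sketched in Python as follows. This is an interpretation, not TaxonKit's code: the function name `pseudo_strain`, the lineage representation (a root-to-leaf list of rank and name pairs), and the toy taxon names are assumptions, and "lower than species" is taken here to mean "appears below the species node in the lineage".

```python
def pseudo_strain(lineage):
    """Return a pseudo-strain name, or "" when the rule does not apply.

    lineage: list of (rank, name) pairs from the root to the lowest node.
    """
    ranks = [rank for rank, _ in lineage]
    if "species" not in ranks:
        return ""
    lowest_rank, lowest_name = lineage[-1]
    # The lowest node must sit below the species node in the lineage...
    below_species = len(lineage) - 1 > ranks.index("species")
    # ...and must not already be a real subspecies or strain.
    if below_species and lowest_rank not in ("subspecies", "strain"):
        return lowest_name
    return ""

# Hypothetical lineages for illustration.
with_norank = [("genus", "Genus A"), ("species", "Genus A sp1"),
               ("no rank", "Genus A sp1 isolate X")]
with_subspecies = [("genus", "Genus A"), ("species", "Genus A sp1"),
                   ("subspecies", "Genus A sp1 subsp. one")]
species_only = [("genus", "Genus A"), ("species", "Genus A sp1")]

print(pseudo_strain(with_norank))
```

An empty result means no pseudo-strain is produced; in the subspecies and strain cases the regular {t} and {T} placeholders already have a name to report.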

TaxonKit v0.7.1

24 Jan 01:50

Changelog

  • TaxonKit v0.7.1 Github Releases (by Release)
    • taxonkit filter:
      • disable unnecessary stdin check when using flag --list-order or --list-ranks. #36
      • better handling of the black list: its default value is now empty, where it was previously "no rank" and "clade". Use -N/--discard-noranks to explicitly filter out "no rank" and "clade". #37
      • update help message. Thanks @standage for improving this command! #38