Releases: Jyonn/UnifiedTokenizer

3.0.12 Released

28 Mar 10:02

Fantastic features for UniTok 3.0!

UniDep Cache (from 2.4.3.2)

Accessing a UniDep can be inefficient after other depots have been unioned into it. With the depot cache, all samples are generated at once.
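The idea behind the cache can be sketched as follows. This is a minimal illustration only, and it assumes that unioning means resolving a shared id column against another depot. SketchDepot, union, start_caching and the column names are hypothetical stand-ins and are not UniTok's actual API.

```python
class SketchDepot:
    """Hypothetical stand-in for a depot; not the UniDep class."""

    def __init__(self, rows):
        self.rows = rows      # list of dict samples
        self.unions = []      # (key column, other depot rows indexed by id)
        self._cache = None    # filled by start_caching()

    def union(self, key, other_rows):
        # Register another depot to be joined on `key`; drop any stale cache.
        self.unions.append((key, other_rows))
        self._cache = None

    def _build(self, index):
        # Assemble one sample by looking up every unioned depot.
        sample = dict(self.rows[index])
        for key, other_rows in self.unions:
            sample.update(other_rows[sample[key]])
        return sample

    def start_caching(self):
        # Generate every sample once, up front.
        self._cache = [self._build(i) for i in range(len(self.rows))]

    def __getitem__(self, index):
        if self._cache is not None:
            return self._cache[index]
        return self._build(index)  # without a cache, the join runs on each access


news = {0: {"title": [1, 2]}, 1: {"title": [3]}}
depot = SketchDepot([{"uid": 0, "nid": 1}, {"uid": 1, "nid": 0}])
depot.union("nid", news)

lazy = depot[0]         # assembled on access
depot.start_caching()
cached = depot[0]       # served from the prebuilt list
```

Without the cache, every access repeats the lookups in the unioned depots; with it, the cost is paid once when the cache is built.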

UniDep Export (from 3.0.11)

Unioned or filtered depots can now be exported easily.

Easier-to-use Vocab

  • support len(vocab) to get the vocab size
  • support iterating over a vocab with for obj in vocab
  • support list(vocab) to get the token list
  • support vocab.i2o(index) to get the object by index, and vocab.o2i(obj) to get the index by object
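
The Vocab conveniences listed above can be illustrated with a small stand-in class. This is a sketch under assumptions, not UniTok's implementation: SketchVocab and its append method are hypothetical, and only len(), iteration, list(), i2o and o2i mirror the behaviour named in the notes.

```python
class SketchVocab:
    """Hypothetical stand-in that mimics the interface described above."""

    def __init__(self, name):
        self.name = name
        self._objects = []   # index -> object
        self._indices = {}   # object -> index

    def append(self, obj):
        # Hypothetical helper: add obj if unseen and return its index.
        if obj not in self._indices:
            self._indices[obj] = len(self._objects)
            self._objects.append(obj)
        return self._indices[obj]

    def __len__(self):
        # Enables len(vocab).
        return len(self._objects)

    def __iter__(self):
        # Enables `for obj in vocab` and list(vocab).
        return iter(self._objects)

    def i2o(self, index):
        # Index to object.
        return self._objects[index]

    def o2i(self, obj):
        # Object to index.
        return self._indices[obj]


vocab = SketchVocab("word")
for token in ["hello", "world", "hello"]:
    vocab.append(token)

size = len(vocab)           # number of distinct tokens
tokens = list(vocab)        # tokens in insertion order
first = vocab.i2o(0)        # object stored at index 0
index = vocab.o2i("world")  # index assigned to "world"
```

Keeping a list for index-to-object and a dict for object-to-index makes both lookups constant time, at the cost of storing each token twice.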

Two New Tokenizers

  • NumberTok
  • SeqTok

Compatible Meta

  • support print(depot) to get a detailed description of the depot
  • support meta upgrading

2.3.1.2 LTS Released

14 Sep 04:09

New features for UniTok 2.3.x Series:

  • optimize the Classify class, which now returns NoneClassify when the target dict path does not exist
  • provide a pre-handler for tokenizers
  • provide GlobalSetting (silent mode only for now)