New features:
- Retrain the tokenization model on a much larger dataset (F1 score = 0.985)
- Add training data and training code
- Better integration with spacy.io: redundant spaces between tokens are removed after tokenization, e.g. Việt Nam , 12 / 22 / 2020 => Việt Nam, 12/22/2020 (a minimal sketch of this cleanup is shown below)
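
The following is only an illustrative sketch of the space-cleanup behavior described above, not the library's actual implementation or API; the `detokenize` helper and its regular expressions are hypothetical and simply reproduce the example from this changelog.

```python
import re

def detokenize(tokens):
    """Join tokens and strip redundant spaces (illustrative sketch only).

    Mimics the changelog example: the space before punctuation and the
    spaces around slashes in date-like strings are removed.
    """
    text = " ".join(tokens)
    # Drop the space before common punctuation, e.g. "Việt Nam ," -> "Việt Nam,"
    text = re.sub(r"\s+([,.;:!?])", r"\1", text)
    # Collapse spaces around slashes between digits, e.g. "12 / 22 / 2020" -> "12/22/2020"
    text = re.sub(r"(?<=\d)\s*/\s*(?=\d)", "/", text)
    return text

if __name__ == "__main__":
    print(detokenize(["Việt", "Nam", ",", "12", "/", "22", "/", "2020"]))
    # -> "Việt Nam, 12/22/2020"
```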