stammer

A real English segmentation tool | pure Python

The goal is to make life easier: forget about POS tags, parse trees, and so on.

### TO DO

As this is a toy tool for me, it still needs dictionaries and state probability data.

Also, maybe I will add the training code (for the state probability part) to it.

For the dictionary part, simply adding some reasonable phrases with IDF scores should be good enough.

POS tagging is also under consideration.
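To make the dictionary idea above a bit more concrete, here is a minimal sketch of phrase segmentation driven by a scored dictionary. It is not stammer's actual code or API: the function name `segment`, the `phrase_scores` mapping, the `max_len` and `unknown_score` parameters, and all scores are assumptions made up for illustration (the scores could be IDF values, for example). It also does not cover the state probability part.

```python
def segment(text, phrase_scores, max_len=3, unknown_score=0.0):
    """Group whitespace tokens into phrases that maximize the total score.

    phrase_scores is a hypothetical dict: lowercase phrase -> score.
    Tokens not found in the dict are kept as single-token chunks.
    """
    tokens = text.lower().split()
    n = len(tokens)
    # best[i] = (best total score for tokens[i:], end index of the first chunk)
    best = [(0.0, n)] * (n + 1)
    for i in range(n - 1, -1, -1):
        candidates = []
        for j in range(i + 1, min(n, i + max_len) + 1):
            chunk = " ".join(tokens[i:j])
            if chunk in phrase_scores:
                score = phrase_scores[chunk]
            elif j == i + 1:
                # Unknown single token: always allowed, with a default score.
                score = unknown_score
            else:
                # Unknown multi-token chunk: not a valid phrase.
                continue
            candidates.append((score + best[j][0], j))
        best[i] = max(candidates)
    # Walk the table from the start to recover the chosen chunks.
    result = []
    i = 0
    while i < n:
        j = best[i][1]
        result.append(" ".join(tokens[i:j]))
        i = j
    return result


# Made-up scores, for illustration only.
toy_scores = {"new york": 2.5, "machine learning": 3.0, "new": 0.5, "york": 0.5}
print(segment("I love New York and machine learning", toy_scores))
```

With these toy scores, "new york" and "machine learning" come out as single chunks because each phrase scores higher than the sum of its parts.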

### Thanks to the jieba project

This repo builds on it (https://github.com/fxsjy/jieba).

And yes, it is definitely the best segmentation module for Chinese!