Release Hazm 0.9 · roshan-research/hazm

Windows compatibility by using Python-crfsuite instead of Wapiti. @E-Ghafour.
Pretrained Chunker and POSTagger models with Python-crfsuite. @E-Ghafour.
New parameters in Normalizer for better text processing. @sir-kokabi.
Three regex patterns in Normalizer to fix ZWNJs and spacing issues. @sir-kokabi.
400 non-standard Unicode characters to be replaced in Normalizer. @sir-kokabi.
40,000+ new words to improve Lemmatizer and Tokenizer. @sir-kokabi.
train function for Word2vec and Sent2vec modules in Embedding. @E-Ghafour.
Implement keywordExtraction with the embedRank approach as a sample of Hazm usage. @E-Ghafour.
Support Universal tags in POSTagger. @E-Ghafour.
Support universal POS mapper in PeykareReader & DadeganReader (#239). @phsfr.
PersianPlainTextReader to process raw text datasets (#120). @mhbashari.
Support EZ tag in PeykareReader. @E-Ghafour.
Slash & backslash (/ \) support in Tokenizer (#102). @elahimanesh.
Conjugation class to handle verb conjugation. @sir-kokabi.

Drop Python 2 support and migrate all code to Python 3. @sir-kokabi.
Use data_maker function instead of patterns in SequenceTagger. @E-Ghafour.
Refactor IOBTagger and POSTagger to be compatible with data_maker. @E-Ghafour.
Change می روم to می‌روم in example (#203). @SMSadegh19.
Overhaul the project structure and GitHub repo. @sir-kokabi.
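
The ZWNJ-related entries above (the spacing fixes in Normalizer and the می روم to می‌روم example change) can be illustrated with a minimal standalone sketch. It is not Hazm's actual implementation: the pattern `PREFIX_SPACE` and the function `join_verb_prefix` are hypothetical names introduced here, and the sketch only handles the verbal prefixes "می" and "نمی" followed by a plain space.

```python
import re

# Zero-width non-joiner (U+200C), the character Persian orthography uses
# to attach prefixes such as "می" to the following verb stem.
ZWNJ = "\u200c"

# Hypothetical pattern (an assumption, not one of Hazm's own regexes):
# match "می" or "نمی" standing as a separate word, followed by one space
# and then a non-space character.
PREFIX_SPACE = re.compile(r"(^|\s)(ن?می) (?=\S)")


def join_verb_prefix(text: str) -> str:
    """Replace the space after a standalone می/نمی prefix with a ZWNJ."""
    return PREFIX_SPACE.sub(lambda m: f"{m.group(1)}{m.group(2)}{ZWNJ}", text)


if __name__ == "__main__":
    fixed = join_verb_prefix("می روم")
    # Show the code points so the invisible ZWNJ is visible in the output.
    print([hex(ord(ch)) for ch in fixed])
```

Because the prefix must be preceded by whitespace or the start of the string, words that merely end in "می" (for example "اسلامی") are left untouched. A real normalizer needs many more cases than this sketch covers.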

Full Changelog: v0.8.2...v0.9
