An implementation of a multinomial Naive Bayes classifier used to recognize emoticons in Polish sentences.
Each sentence is preprocessed (lowercasing, tokenization, stop-word removal, and stemming) with the Natural Language Toolkit (NLTK).
The training dataset contains over 800,000 sentences, each labeled sad or happy.
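
The pipeline described above can be illustrated with a minimal sketch. It is not this project's actual code: it uses only the Python standard library in place of NLTK so that it runs without extra downloads, the `STOP_WORDS` set is a tiny hypothetical list, the default `stem` function is an identity placeholder where a real stemmer would go, and the four training sentences are invented toy data. The classifier uses Laplace (add-one) smoothing, which is an assumption about the smoothing method.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, tiny stop-word list for illustration only.
STOP_WORDS = {"i", "w", "na", "to", "się", "z", "że"}


def preprocess(sentence, stem=lambda token: token):
    """Lowercase, tokenize, drop stop words, then apply `stem` to each token.

    `stem` defaults to the identity function (a placeholder for a real stemmer).
    """
    tokens = re.findall(r"\w+", sentence.lower())
    return [stem(t) for t in tokens if t not in STOP_WORDS]


class MultinomialNB:
    """Multinomial Naive Bayes over token counts with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength (1.0 = Laplace smoothing)

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # sentences per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def log_scores(self, tokens):
        """Return log P(class) + sum of log P(token | class) for each class."""
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            total = sum(counts.values())
            score = math.log(n_docs / total_docs)
            for t in tokens:
                if t in self.vocab:  # tokens never seen in training are skipped
                    score += math.log(
                        (counts[t] + self.alpha) / (total + self.alpha * vocab_size)
                    )
            scores[label] = score
        return scores

    def predict(self, tokens):
        scores = self.log_scores(tokens)
        return max(scores, key=scores.get)


# Invented toy data; the real dataset is far larger.
train = [
    ("Kocham ten film", "happy"),
    ("To był wspaniały dzień", "happy"),
    ("Nienawidzę poniedziałków", "sad"),
    ("To był okropny dzień", "sad"),
]
model = MultinomialNB().fit(
    [preprocess(text) for text, _ in train],
    [label for _, label in train],
)
prediction = model.predict(preprocess("Wspaniały film"))
```

Scores are summed in log space to avoid numeric underflow when many small probabilities are multiplied, which matters for longer sentences and a large vocabulary.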