LANGUAGE DETECTION (Bigram model)

DESCRIPTION:

The program takes as input a corpus of text documents written in different languages. It then automatically detects the language of a new text, which can be a paragraph, a sentence, a single word, or just a few letters. Detection uses a letter bigram model together with basic probability.
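As an illustration of the idea, here is a minimal sketch of letter-bigram language detection. It is not the code in language_detection.py: the function names, the two-sentence toy corpus, and the use of add-one smoothing are all assumptions made for this example. Each language gets a table of letter-pair counts, and a new text is assigned to the language under which its bigrams have the highest total log-probability.

```python
import math
from collections import Counter


def letter_bigrams(text):
    """Return adjacent letter pairs found inside each word of the text."""
    pairs = []
    for word in text.lower().split():
        letters = [ch for ch in word if ch.isalpha()]
        pairs.extend(a + b for a, b in zip(letters, letters[1:]))
    return pairs


def train(corpus):
    """Map each language name to a Counter of its letter-bigram counts."""
    return {lang: Counter(letter_bigrams(text)) for lang, text in corpus.items()}


def log_score(text, counts, vocab_size):
    """Sum of add-one smoothed log-probabilities of the text's bigrams."""
    total = sum(counts.values())
    return sum(
        math.log((counts[bg] + 1) / (total + vocab_size))
        for bg in letter_bigrams(text)
    )


def detect(text, models):
    """Return the language whose bigram model scores the text highest."""
    # Vocabulary = every distinct bigram seen in any training language.
    vocab_size = len(set().union(*models.values()))
    return max(models, key=lambda lang: log_score(text, models[lang], vocab_size))


# Toy corpus, invented for this sketch (not taken from the real dataset).
corpus = {
    "english": "the quick brown fox jumps over the lazy dog and then the other thing",
    "german": "der schnelle braune fuchs springt über den faulen hund und noch etwas",
}
models = train(corpus)
print(detect("the", models))
```

Log-probabilities are summed rather than multiplying raw probabilities so that long texts do not underflow to zero, and the smoothing keeps a bigram that never appeared in training from ruling a language out entirely.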

TEST:

To test the examples, run language_detection.py and enter the number of the example you want to test.

DATASET:

The dataset can be found in the publicDataSet/public/set folder.