Hidden Markov Model for Part-of-Speech tags

AdiThakur/pos-tagger

Part-of-Speech Tagger

This Python script uses a Hidden Markov Model and the Viterbi algorithm to perform part-of-speech (POS) tagging on a given file. Invoke it with the following command:

python3 tagger.py -d <training files> -t <test file> -o <output file>
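As a rough illustration of the decoding step, the following is a minimal Viterbi sketch over an HMM. It is not taken from tagger.py: the function name `viterbi`, the dictionary-based probability tables, the toy tags `N` and `V`, and the `floor` used for unseen events are all assumptions made for this example.

```python
import math
from typing import Dict, List


def viterbi(
    words: List[str],
    tags: List[str],
    start_p: Dict[str, float],
    trans_p: Dict[str, Dict[str, float]],
    emit_p: Dict[str, Dict[str, float]],
    floor: float = 1e-12,  # assumed stand-in probability for unseen events
) -> List[str]:
    """Return the most likely tag sequence for `words` (illustrative sketch)."""
    if not words:
        return []

    def lp(p: float) -> float:
        # Work in log space to avoid underflow on long inputs.
        return math.log(max(p, floor))

    # best[i][t]: best log-probability of any tag sequence ending in t at position i.
    best = [{t: lp(start_p.get(t, 0.0)) + lp(emit_p.get(t, {}).get(words[0], 0.0))
             for t in tags}]
    back: List[Dict[str, str]] = [{}]

    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            # Pick the predecessor tag that maximizes the path score into t.
            prev = max(tags, key=lambda s: best[i - 1][s] + lp(trans_p.get(s, {}).get(t, 0.0)))
            best[i][t] = (best[i - 1][prev]
                          + lp(trans_p.get(prev, {}).get(t, 0.0))
                          + lp(emit_p.get(t, {}).get(words[i], 0.0)))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path


# Toy model with made-up probabilities, for illustration only.
toy_tags = ["N", "V"]
toy_start = {"N": 0.8, "V": 0.2}
toy_trans = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.6, "V": 0.4}}
toy_emit = {"N": {"dogs": 0.6, "bark": 0.1}, "V": {"dogs": 0.05, "bark": 0.7}}
toy_path = viterbi(["dogs", "bark"], toy_tags, toy_start, toy_trans, toy_emit)
```

In a real tagger the start, transition, and emission tables would be estimated from counts over the training files. The fixed floor here is only a placeholder for a proper smoothing scheme.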

Input Format

<training files> are plain text files that have already been tagged. Such files consist of many lines, each of which is formatted as word : tag.
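The training format described above could be read with a small parser like the one below. This is a sketch, not the script's actual loader: the helper name `parse_tagged_lines` and the tags in the sample are hypothetical, and it assumes the separator is exactly a space, a colon, and a space.

```python
from typing import Iterable, List, Tuple


def parse_tagged_lines(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse 'word : tag' lines into (word, tag) pairs (hypothetical helper)."""
    pairs = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the LAST " : " so that a token which is itself ":" still parses.
        word, sep, tag = line.rpartition(" : ")
        if not sep:
            raise ValueError(f"expected 'word : tag', got {line!r}")
        pairs.append((word, tag))
    return pairs


# Sample input with made-up tags.
sample_pairs = parse_tagged_lines(["Hello : UH\n", ": : PUN\n", "\n"])
```

Splitting from the right is a deliberate choice: punctuation tokens can contain a colon, whereas tags are assumed not to.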

<test file> is a plain text file that you wish to tag. Each line of this file must contain exactly one word or one punctuation character, with no whitespace. For example, the text "Hello, world?" would be formatted as

"
Hello
,
world
?
"
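If your text is not yet in this one-token-per-line form, a short script can produce it. The sketch below is an assumption about how you might prepare input and is not part of tagger.py. The helper name `to_test_lines` and the regular expression are illustrative, and the pattern splits contractions such as "don't" into three tokens, which may not match how the training data was tokenized.

```python
import re
from typing import List


def to_test_lines(text: str) -> List[str]:
    """Split text into words and single punctuation characters (illustrative)."""
    # \w+ matches a run of word characters; [^\w\s] matches one punctuation character.
    return re.findall(r"\w+|[^\w\s]", text)


tokens = to_test_lines("Hello, world?")
test_file_body = "\n".join(tokens) + "\n"
```

Writing `test_file_body` to a file gives one token per line, as the test file format requires.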

Output Format

The output file uses exactly the same format as the <training files>: one word : tag pair per line.
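For illustration, tagged output in this format could be produced as follows. The helper name `format_tagged` and the sample tags are hypothetical and do not come from the script.

```python
from typing import List


def format_tagged(words: List[str], tags: List[str]) -> str:
    """Render parallel word and tag lists as 'word : tag' lines (illustrative)."""
    if len(words) != len(tags):
        raise ValueError("words and tags must have the same length")
    return "".join(f"{word} : {tag}\n" for word, tag in zip(words, tags))


# Sample with made-up tags.
rendered = format_tagged(["Hello", ","], ["UH", "PUN"])
```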
