Type-Supervised Tagging: EMNLP 2012

This repository contains the code, scripts, and instructions needed to reproduce the results in the paper

Type-Supervised Hidden Markov Models for Part-of-Speech Tagging with Incomplete Tag Dictionaries
Dan Garrette and Jason Baldridge
In Proceedings of EMNLP 2012

This code is frozen as of the version used to obtain the results in the paper. It will not be maintained.

To see the most up-to-date version of the code, visit this repository.

Running the experiments

Set up English data

The English experiments rely on Penn Treebank data. The en-data command below prepares that data for use by the experiments. The treebank directory passed to the script should contain a folder named "combined" that holds the files wsj_0000.mrg through wsj_2454.mrg.

sh run.sh "en-data /path/to/treebank"
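Before running the data setup, it may help to confirm that the treebank directory has the expected layout. The following is a minimal sketch of such a check. The function name check_treebank is hypothetical and not part of this repository, and it only assumes the layout described above (a "combined" folder holding wsj_*.mrg files).

```shell
# Hypothetical pre-flight check; not part of the repository.
# Verifies that <treebank>/combined exists and holds wsj_*.mrg files.
check_treebank() {
    treebank_dir="$1"
    combined_dir="$treebank_dir/combined"

    if [ ! -d "$combined_dir" ]; then
        echo "missing folder: $combined_dir" >&2
        return 1
    fi

    # Count the wsj_*.mrg files directly inside combined/.
    count=0
    for f in "$combined_dir"/wsj_*.mrg; do
        # If nothing matches, the glob stays literal and -e is false.
        if [ -e "$f" ]; then
            count=$((count + 1))
        fi
    done

    if [ "$count" -eq 0 ]; then
        echo "no wsj_*.mrg files in $combined_dir" >&2
        return 1
    fi

    echo "found $count .mrg files in $combined_dir"
}

# Usage (assumed workflow): only start the setup if the check passes.
# check_treebank /path/to/treebank && sh run.sh "en-data /path/to/treebank"
```

The check only looks at file names, so it cannot tell whether the files themselves are complete or well-formed.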

Run English experiments on sections 00-15

sh run.sh en-run16

Run English experiments on sections 00-07

sh run.sh en-run8

Run Italian experiments

The Italian data is already included in the data directory, so this experiment can be launched immediately with no data setup.

sh run.sh it-run
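The experiment commands above can also be run back to back. The following is a sketch of one way to do that while keeping a separate log per experiment. The function run_targets, the RUN_SCRIPT and LOG_DIR variables, and the logs directory are all hypothetical conveniences, not part of this repository. The sketch assumes that run.sh writes its results to standard output or standard error.

```shell
# Hypothetical helper; not part of the repository.
# Runs each named target through run.sh and saves its output to a log file.
# RUN_SCRIPT and LOG_DIR are assumed names that default to run.sh and logs.
run_targets() {
    run_script="${RUN_SCRIPT:-run.sh}"
    log_dir="${LOG_DIR:-logs}"
    mkdir -p "$log_dir" || return 1

    for target in "$@"; do
        echo "running $target"
        # Capture both stdout and stderr; stop at the first failure.
        sh "$run_script" "$target" > "$log_dir/$target.log" 2>&1 || {
            echo "target $target failed; see $log_dir/$target.log" >&2
            return 1
        }
    done
}

# Usage, from the repository root (English runs need the data setup first):
# run_targets en-run16 en-run8 it-run
```

Stopping at the first failure is a design choice here, so that a missing data setup is noticed before later experiments start.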

Questions

If you have any questions, please contact Dan Garrette (dhg@cs.utexas.edu).
