This repository is an unofficial collection of steps and benchmarks for running Apache OpenNLP on data sets for common NLP tasks.
It provides examples of how to train and evaluate Apache OpenNLP models, along with performance data to help you decide whether Apache OpenNLP is a good choice for your use case.
For more information on the commands contained in these files, refer to the documentation for the current version of Apache OpenNLP.

| Dataset | Model Type |
|---|---|
| CoNLL03 | TokenNameFinder (named entity recognition for person entities) |
| Large Movie Review Dataset | Doccat (document classification for sentiment analysis) |
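
As a rough illustration of the data preparation behind the Doccat row, the sketch below flattens the Large Movie Review Dataset into the one-document-per-line layout that the OpenNLP document categorizer reads (category first, then whitespace, then the text). It assumes the standard `aclImdb/<split>/pos` and `aclImdb/<split>/neg` folder layout of the dataset. The function names and the output file name are hypothetical and are not taken from this repository.

```python
from pathlib import Path


def to_doccat_line(label: str, text: str) -> str:
    """Return one training line: the category, a space, then the review text."""
    # The reviews contain literal "<br />" tags and may span several lines;
    # the training format needs exactly one document per line.
    cleaned = " ".join(text.replace("<br />", " ").split())
    return f"{label} {cleaned}"


def write_doccat_file(split_dir: Path, out_path: Path, labels=("pos", "neg")) -> int:
    """Write every review under split_dir/<label>/ to out_path; return the count."""
    count = 0
    with out_path.open("w", encoding="utf-8") as out:
        for label in labels:
            # Sorting keeps the output order reproducible between runs.
            for review in sorted((split_dir / label).glob("*.txt")):
                text = review.read_text(encoding="utf-8")
                out.write(to_doccat_line(label, text) + "\n")
                count += 1
    return count


if __name__ == "__main__":
    train_dir = Path("aclImdb/train")  # assumed extraction location
    if train_dir.is_dir():
        written = write_doccat_file(train_dir, Path("imdb-doccat.train"))
        print(f"wrote {written} training documents")
```

The resulting file could then be passed to the `DoccatTrainer` command-line tool through its `-data` option. Check the Apache OpenNLP manual for the exact flags in the version you are using.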