Text classification is the task of assigning a sentence or document an appropriate category. The categories depend on the chosen dataset and can be, for example, news topics, ontology classes, or question types.
The AG News corpus consists of news articles from AG's corpus of news articles on the web, restricted to the 4 largest classes. The dataset contains 30,000 training examples and 1,900 test examples for each class. Models are evaluated based on error rate (lower is better).
Model | Error | Paper / Source |
---|---|---|
ULMFiT (Howard and Ruder, 2018) | 5.01 | Universal Language Model Fine-tuning for Text Classification |
CNN (Johnson and Zhang, 2016) | 6.57 | Supervised and Semi-Supervised Text Categorization using LSTM for Region Embeddings |
DPCNN (Johnson and Zhang, 2017) | 6.87 | Deep Pyramid Convolutional Neural Networks for Text Categorization |
VDCNN (Conneau et al., 2016) | 8.67 | Very Deep Convolutional Networks for Text Classification |
Char-level CNN (Zhang et al., 2015) | 9.51 | Character-level Convolutional Networks for Text Classification |
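
The error rates in the table are percentages of misclassified test examples. A minimal sketch of that computation is shown here; the function name `error_rate` and the toy integer labels are assumptions made for illustration and do not come from any of the cited papers.

```python
def error_rate(gold, predicted):
    """Return the percentage of examples whose prediction differs from the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute an error rate on an empty test set")
    wrong = sum(1 for g, p in zip(gold, predicted) if g != p)
    return 100.0 * wrong / len(gold)


# Toy example with four classes encoded as integers 0..3 (made-up labels).
gold = [0, 1, 2, 3, 1]
predicted = [0, 1, 1, 3, 1]
print(error_rate(gold, predicted))  # 20.0
```

One of the five toy predictions is wrong, so the error rate is 20%.
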
The DBpedia ontology dataset contains 40,000 training samples and 5,000 testing samples for each of 14 nonoverlapping classes from DBpedia. Models are evaluated based on error rate (lower is better).
Model | Error | Paper / Source |
---|---|---|
ULMFiT (Howard and Ruder, 2018) | 0.80 | Universal Language Model Fine-tuning for Text Classification |
CNN (Johnson and Zhang, 2016) | 0.84 | Supervised and Semi-Supervised Text Categorization using LSTM for Region Embeddings |
DPCNN (Johnson and Zhang, 2017) | 0.88 | Deep Pyramid Convolutional Neural Networks for Text Categorization |
VDCNN (Conneau et al., 2016) | 1.29 | Very Deep Convolutional Networks for Text Classification |
Char-level CNN (Zhang et al., 2015) | 1.55 | Character-level Convolutional Networks for Text Classification |
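
Because AG News and DBpedia are both described with per-class counts, the total split sizes follow by multiplication. The sketch below works through that arithmetic using only the per-class figures stated above; the `BalancedDataset` class and the `implied_errors` helper are hypothetical names introduced for illustration, and the implied error counts are rounded approximations, not figures reported by the papers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BalancedDataset:
    """A dataset with the same number of examples in every class."""
    name: str
    num_classes: int
    train_per_class: int
    test_per_class: int

    @property
    def train_total(self) -> int:
        return self.num_classes * self.train_per_class

    @property
    def test_total(self) -> int:
        return self.num_classes * self.test_per_class

    def implied_errors(self, error_pct: float) -> int:
        """Approximate number of misclassified test examples for a given error rate."""
        return round(error_pct / 100.0 * self.test_total)


ag_news = BalancedDataset("AG News", 4, 30_000, 1_900)
dbpedia = BalancedDataset("DBpedia", 14, 40_000, 5_000)

for ds in (ag_news, dbpedia):
    print(ds.name, ds.train_total, ds.test_total)
# AG News 120000 7600
# DBpedia 560000 70000
```

Under these counts, a difference of 0.04 points in error rate on DBpedia corresponds to roughly 28 test examples, which is worth keeping in mind when comparing closely ranked models.
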
The TREC dataset is a dataset for question classification consisting of open-domain, fact-based questions divided into broad semantic categories. It has both a six-class (TREC-6) and a fifty-class (TREC-50) version. Both have 4,300 training examples, but TREC-50 has finer-grained labels. Models are evaluated based on accuracy (higher is better).
TREC-6:
Model | Accuracy | Paper / Source |
---|---|---|
ULMFiT (Howard and Ruder, 2018) | 96.4 | Universal Language Model Fine-tuning for Text Classification |
LSTM-CNN (Zhou et al., 2016) | 96.1 | Text Classification Improved by Integrating Bidirectional LSTM with Two-dimensional Max Pooling |
TBCNN (Mou et al., 2015) | 96.0 | Discriminative Neural Sentence Modeling by Tree-Based Convolution |
CoVe (McCann et al., 2017) | 95.8 | Learned in Translation: Contextualized Word Vectors |
TREC-50:
Model | Accuracy | Paper / Source |
---|---|---|
Rules (Madabushi and Lee, 2016) | 97.2 | High Accuracy Rule-based Question Classification using Question Syntax and Semantics |
SVM (Van-Tu and Anh-Cuong, 2016) | 91.6 | Improving Question Classification by Feature Extraction and Selection |
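
The TREC results are reported as accuracy, whereas AG News and DBpedia use error rate. To read both on the same scale, accuracy can be converted to an error rate as 100 minus the accuracy. The following is a small sketch of that relationship; the function names and the toy question-type labels are assumptions for illustration only.

```python
def accuracy(gold, predicted):
    """Return the percentage of examples whose prediction matches the gold label."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and of equal length")
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    return 100.0 * correct / len(gold)


def accuracy_to_error(accuracy_pct):
    """Convert an accuracy percentage into the equivalent error-rate percentage."""
    return round(100.0 - accuracy_pct, 2)


# Toy labels, made up for illustration.
gold = ["person", "location", "number", "person"]
predicted = ["person", "location", "person", "person"]
print(accuracy(gold, predicted))   # 75.0
print(accuracy_to_error(96.4))     # 3.6
```
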