Named Entity Recognition

Named entity recognition (NER) is the task of tagging entities in text with their corresponding type, such as person (PER) or location (LOC). Approaches typically use BIO notation: B marks the first token of an entity, I marks each following token inside the same entity, and O marks tokens that are not part of any entity.

Example:

Mark Watney visited Mars
B-PER I-PER O B-LOC
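
As a minimal sketch of how BIO tags map back to entities, the Python snippet below groups tagged tokens into (text, type) pairs. The function name `bio_to_spans` is hypothetical and is not taken from any of the systems listed on this page. The handling of a stray `I-` tag (one that does not continue an open entity of the same type) is an assumption: this sketch simply drops it, while real evaluation scripts may treat it differently.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs.

    Hypothetical helper for illustration only.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def close_entity() -> None:
        # Emit the entity collected so far, if any, and reset the state.
        nonlocal current_tokens, current_type
        if current_tokens and current_type is not None:
            spans.append((" ".join(current_tokens), current_type))
        current_tokens = []
        current_type = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # B- always opens a new entity, closing any open one first.
            close_entity()
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # I- continues the open entity of the same type.
            current_tokens.append(token)
        else:
            # "O", or a stray I- tag (assumption: ignored in this sketch).
            close_entity()

    close_entity()
    return spans


tokens = "Mark Watney visited Mars".split()
tags = "B-PER I-PER O B-LOC".split()
print(bio_to_spans(tokens, tags))
# [('Mark Watney', 'PER'), ('Mars', 'LOC')]
```

Two adjacent entities of the same type stay separate because each one starts with its own `B-` tag, which is the reason BIO distinguishes the beginning of an entity from its inside.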

VLSP 2018 Shared Task: Named Entity Recognition

Size of the VLSP 2018 dataset

| Type | Train | Dev | Test |
| --- | --- | --- | --- |
| LOC | 8,831 | 3,043 | 2,525 |
| ORG | 3,471 | 1,203 | 1,616 |
| PER | 6,427 | 2,168 | 3,518 |
| MISC | 805 | 179 | 296 |

Leaderboard

| Model | F1 | Paper/Source | Code |
| --- | --- | --- | --- |
| VNER: Attentive Neural Network | 77.52 | Dong et al. '18 | |
| vietner: CRF (ngrams + word shapes + cluster + w2v) | 76.63 | Pham et al. VLSP'18 | Official |
| ZA-NER: BiLSTM | 74.70 | Luong et al. VLSP'18 | |
| Dong et al. 2018 | 66.07 | Dong et al. VLSP'18 | |

VLSP 2016 Shared Task: Named Entity Recognition

The dataset contains 19,692 sentences:

  • 14,861 sentences are used for training.
  • 2,000 sentences are used for development.
  • 2,831 sentences are used for testing.
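
The split sizes above can be sanity-checked with a few lines of Python. This is only an illustrative sketch: the dictionary name `splits` is an assumption made for the example, and the numbers are copied from the list above.

```python
# Sentence counts per split, copied from the list above.
splits = {"train": 14_861, "dev": 2_000, "test": 2_831}

# The three splits should add up to the stated corpus size.
total = sum(splits.values())
assert total == 19_692

# Share of each split as a percentage of the whole corpus.
shares = {name: round(100 * count / total, 1) for name, count in splits.items()}
for name, share in shares.items():
    print(f"{name}: {splits[name]:,} sentences ({share}%)")
```

The split is roughly three quarters for training, with the rest divided between development and testing.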

Leaderboard

Without gold POS and chunking tags

| Model | F1 | Paper/Source | Code |
| --- | --- | --- | --- |
| PhoBERT-large | 94.7 | Nguyen et al. '20 | Official |
| PhoBERT-base | 93.6 | Nguyen et al. '20 | Official |
| VnCoreNLP: used ETNLP embeddings | 91.30 | Nguyen et al. NAACL'18 | Official |
| VNER: Attentive Neural Network | 90.37 | Dong et al. '18 | |
| vietner: CRF (ngrams + word shapes + cluster + w2v) | 90.03 | Pham CICLing'18 | Official |
| VnCoreNLP: dynamic feature induction model | 88.55 | Nguyen et al. NAACL'18 | Official |

With gold POS and chunking tags

| Model | F1 | Paper/Source | Code |
| --- | --- | --- | --- |
| VNER: Attentive Neural Network | 95.33 | Dong et al. '18 | |
| BiLSTM-CRF + POS + Chunk | 94.88 | Nguyen et al. 2018 | Official |
| CRF (PoS, Chunk, word + word shapes + cluster + w2v) | 93.93 | Pham CICLing'18 | |
| NNVLP (BiLSTM-CNN-CRF) | 92.91 | Pham et al. IJCNLP'17 | Official |
| vie-ner-lstm | 92.05 | Pham et al. PACLIC'17 | Official |
| Token regular expression + ME (Bidirectional Inference) | 88.78 | Le et al. VLSP'16 | |
| BiLSTM-CNN-CRF | 88.59 | Pham et al. PACLIC'17 | |
| ME + Beam Search | 84.08 | Nguyen et al. VLSP'16 | |
| Stack LSTM | 83.80 | Nguyen et al. VLSP'16 | |
| BiLSTM-CRF | 83.25 | Nguyen et al. VLSP'16 | |
| CRF | 78.38 | Le et al. VLSP'16 | |

Miscellaneous

📜 Papers

📁 Open sources