
Pyramid is a novel layered model for Nested Named Entity Recognition (nested NER). This code is based on the paper *Pyramid: A Layered Model for Nested Named Entity Recognition* by Jue Wang et al.






Note that this code is based on my own understanding of the paper. Nevertheless, the authors released the code of the paper at

This repository also contains a step-by-step walkthrough in the notebooks folder.

Set up

Clone this repository, create default folders and install dependencies:

git clone
cd pyramid
mkdir data
mkdir artifacts
pip install -r requirements.txt

Download GloVe embeddings:

cd data
wget --no-check-certificate
cd ..
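For reference, GloVe text files store one token per line, followed by its space-separated vector components. Below is a minimal sketch of reading such a file. The helper name `load_glove` is hypothetical and not part of this repository, which may load the vectors differently.

```python
from typing import Dict, List


def load_glove(path: str, dim: int) -> Dict[str, List[float]]:
    """Read a GloVe-style text file into a {token: vector} dict.

    Hypothetical helper for illustration; not part of this repository.
    """
    vectors: Dict[str, List[float]] = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip("\n").split(" ")
            # The last `dim` fields are the vector; everything before is the token.
            if len(parts) < dim + 1:
                continue  # skip empty or malformed lines
            token = " ".join(parts[:-dim])
            vectors[token] = [float(value) for value in parts[-dim:]]
    return vectors
```

With the file used later in this README, a call would look like `load_glove("./data/glove.6B.200d.txt", dim=200)`.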

You must also download the tokenizer and the pretrained LM* beforehand:

python --lm_name dmis-lab/biobert-v1.1

*Feel free to use any pretrained model from HuggingFace:


I have tested this repository on the GENIA dataset. You can download and prepare it with these commands:

cd data
wget --no-check-certificate
mkdir GENIA
tar -xvf GENIAcorpus3.02.tgz -C GENIA
cd ..
python \
    --dataset genia \
    --raw_filepath "./data/GENIA/GENIA_term_3.02/GENIAcorpus3.02.xml" \
    --lm_name dmis-lab/biobert-v1.1 \
    --cased 0

If you want to use a different dataset, it must be a JSON file as follows:

{
  "tokens": ["token0", "token1", "token2"],
  "entities": [
    {
      "entity_type": "PER",
      "span": [0, 1]
    },
    {
      "entity_type": "ORG",
      "span": [2, 3]
    }
  ]
}
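Before training on a custom dataset, it may help to check that each record matches the layout above. The following is a minimal sketch under stated assumptions: `validate_record` is a hypothetical helper (not part of this repository), and since it is not stated here whether the end index of `span` is inclusive or exclusive, only a loose range check is applied.

```python
import json
from typing import Any, Dict


def validate_record(record: Dict[str, Any]) -> None:
    """Raise ValueError if `record` does not match the expected layout.

    Hypothetical helper for illustration; not part of this repository.
    """
    tokens = record.get("tokens")
    if not isinstance(tokens, list) or not all(isinstance(t, str) for t in tokens):
        raise ValueError("'tokens' must be a list of strings")
    entities = record.get("entities")
    if not isinstance(entities, list):
        raise ValueError("'entities' must be a list")
    for entity in entities:
        if not isinstance(entity, dict):
            raise ValueError("each entity must be an object")
        if not isinstance(entity.get("entity_type"), str):
            raise ValueError("'entity_type' must be a string")
        span = entity.get("span")
        if not (isinstance(span, list) and len(span) == 2
                and all(isinstance(i, int) for i in span)):
            raise ValueError("'span' must be a list of two integers")
        start, end = span
        # Assumption: indices refer to positions in `tokens`. The end index
        # convention is unknown, so only a loose bound is checked.
        if not 0 <= start <= end <= len(tokens):
            raise ValueError(f"span {span} is out of range")


record = json.loads("""
{
  "tokens": ["token0", "token1", "token2"],
  "entities": [
    {"entity_type": "PER", "span": [0, 1]},
    {"entity_type": "ORG", "span": [2, 3]}
  ]
}
""")
validate_record(record)  # passes silently for a well-formed record
```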


Fine-tune model:

python \
    --model_ckpt ./artifacts/genia/ \
    --wv_file ./data/glove.6B.200d.txt \
    --use_label_embeddings 0 \
    --use_char_encoder 1 \
    --dataset genia \
    --max_epoches 500 \
    --max_steps 1e9 \
    --total_layers 16 \
    --batch_size 64 \
    --token_emb_dim 200 \
    --char_emb_dim 100 \
    --cased_lm 0 \
    --cased_word 0 \
    --cased_char 0 \
    --hidden_dim 100 \
    --dropout 0.45 \
    --lm_name dmis-lab/biobert-large-cased-v1.1 \
    --lm_emb_dim 1024 \
    --device cuda \
    --continue_training 0 \
    --log_to_file logger_genia.txt

Once the model is fine-tuned, run the evaluation script:

python \
    --model_ckpt ./artifacts/genia/ \
    --dataset genia \
    --device cuda


The available parameters are:

  • device: Device to use: cpu or cuda.
  • model_ckpt: Path to store the model.
  • wv_file: (Optional, default=None) Path to the word embeddings file. If not provided, the Word Encoder described in the paper is not used.
  • use_label_embeddings: (Optional, default=0) Uses a label embedding layer on top of the model.
  • use_char_encoder: (Optional, default=1) Uses the Char Encoder described in the paper.
  • dataset: Name of the dataset to use. The dataset files must be located in the folder ./data with the names train.<dataset>.json, valid.<dataset>.json and test.<dataset>.json for the train, validation and test datasets respectively.
  • max_epoches: (Optional, default=500) Maximum number of epochs for training.
  • max_steps: (Optional, default=1e9) Maximum number of steps for training.
  • total_layers: (Optional, default=16) Number of layers in the pyramid.
  • batch_size: (Optional, default=64) Batch size for training.
  • token_emb_dim: (Optional, default=100) Dimension of token embeddings.
  • char_emb_dim: (Optional, default=100) Dimension of char embeddings.
  • cased_lm: (Optional, default=1) Use cased LM Encoder.
  • cased_word: (Optional, default=1) Use cased Word Encoder.
  • cased_char: (Optional, default=1) Use cased Char Encoder.
  • hidden_dim: (Optional, default=100) Hidden dimension of LSTM layers in the pyramid. Since the LSTM layers are bidirectional, the actual hidden dimension will be twice the value.
  • dropout: (Optional, default=0.45) Dropout rate.
  • lm_name: (Optional, default=dmis-lab/biobert-large-cased-v1.1) Pretrained language model from Hugging Face. The model must be already downloaded in the folder ./artifacts (use the script to download and store it).
  • lm_emb_dim: (Optional, default=1024) Hidden dimension of the language model.
  • continue_training: (Optional, default=0) To avoid overwriting a trained model, this flag must be set to 1 to continue training a model from a checkpoint. If the model already exists and the flag is 0, an error is raised.
  • log_to_file: (Optional, default=None) File to store the standard output.
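To illustrate how some of these parameters fit together, here is a minimal sketch using `argparse`. It is not the parser used by this repository: the function names `build_parser`, `dataset_paths` and `check_checkpoint` are hypothetical, only a subset of the parameters is shown, treating the parameters not marked as optional as required is an assumption, and so is the interpretation that "the model already exists" means the checkpoint path exists on disk.

```python
import argparse
import os
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring a subset of the documented parameters."""
    parser = argparse.ArgumentParser()
    # Assumption: parameters not marked "Optional" are required.
    parser.add_argument("--device", choices=["cpu", "cuda"], required=True)
    parser.add_argument("--model_ckpt", required=True)
    parser.add_argument("--dataset", required=True)
    parser.add_argument("--wv_file", default=None)
    parser.add_argument("--max_epoches", type=int, default=500)
    # "1e9" is not a valid int literal, so go through float first.
    parser.add_argument("--max_steps", type=lambda s: int(float(s)), default=int(1e9))
    parser.add_argument("--total_layers", type=int, default=16)
    parser.add_argument("--batch_size", type=int, default=64)
    parser.add_argument("--hidden_dim", type=int, default=100)
    parser.add_argument("--dropout", type=float, default=0.45)
    parser.add_argument("--continue_training", type=int, choices=[0, 1], default=0)
    return parser


def dataset_paths(dataset: str, data_dir: str = "./data") -> List[str]:
    """Build the train/valid/test file paths from the dataset name."""
    return [os.path.join(data_dir, f"{split}.{dataset}.json")
            for split in ("train", "valid", "test")]


def check_checkpoint(model_ckpt: str, continue_training: int) -> None:
    """Refuse to reuse an existing checkpoint path unless explicitly allowed."""
    # Assumption: "exists" is interpreted as the path existing on disk.
    if os.path.exists(model_ckpt) and not continue_training:
        raise FileExistsError(
            f"{model_ckpt} already exists; pass --continue_training 1 to resume"
        )


args = build_parser().parse_args(
    ["--device", "cpu", "--model_ckpt", "./artifacts/genia/",
     "--dataset", "genia", "--max_steps", "1e9"]
)
paths = dataset_paths(args.dataset)
# With bidirectional LSTMs, the effective hidden size is twice `hidden_dim`.
effective_hidden_dim = 2 * args.hidden_dim
```

Parsing `--max_steps` through `float` is a design choice made here so that the value `1e9` from the training command is accepted.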

Additional comments

This repository includes sbatch files to run the scripts with Slurm. See:

To do

  • Add option to load pretrained label embeddings for training.

Have fun! ᕙ (° ~ ° ~)


