This is a re-implementation of this model on the dataset of this paper. The model is a Bi-Encoder that scores a context containing a named entity (the entity string is marked in the text) against 64 candidate entities, assumed to have already been produced by some candidate generation process (BM25 in this case).
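For illustration, a minimal sketch of the bi-encoder scoring (the checkpoint name, marker tokens, and mean pooling below are assumptions, not necessarily what this repo does):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Hypothetical checkpoint; the repo uses DeBERTa xsmall (see notes below).
MODEL_NAME = "microsoft/deberta-v3-xsmall"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
context_encoder = AutoModel.from_pretrained(MODEL_NAME)
candidate_encoder = AutoModel.from_pretrained(MODEL_NAME)

def encode(encoder, texts):
    # Mean-pool the last hidden state into one vector per text
    # (pooling strategy is an assumption, not necessarily the repo's).
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state      # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)     # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)      # (B, H)

# The entity mention is marked inside the context string.
context = "He played for [E] Barcelona [/E] before moving to Serie A."
candidates = ["FC Barcelona", "Barcelona (city)", "Barcelona S.C."]  # e.g. from BM25

ctx_emb = encode(context_encoder, [context])         # (1, H)
cand_embs = encode(candidate_encoder, candidates)    # (64, H) in practice
scores = ctx_emb @ cand_embs.T                       # dot-product scores
print(scores.argmax(dim=1))                          # highest-scoring candidate
```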
Install dependencies from `requirements.txt`, then run `train.py` for training and `test.py` for testing. File paths and other model details can be found in `configs/config.ini`. See `nlp_entity_linking.ipynb` for an example run on Google Colab; replace the data paths with data from here or from here, which also includes the checkpoint.
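A sketch of reading that config with Python's standard `configparser`; the section and key names below are hypothetical, check the actual file:

```python
import configparser

# Section/key names are illustrative; see configs/config.ini for the real ones.
config = configparser.ConfigParser()
config.read("configs/config.ini")

train_path = config["DATA"]["train_path"]
batch_size = config["TRAIN"].getint("batch_size")
lr = config["TRAIN"].getfloat("learning_rate")
```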
Training note: On Colab's T4, 1 epoch takes ~2h50m
Accuracy on the test set: 0.677 (a prediction is counted as correct when the model scores the gold entity candidate higher than all other candidates)
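Concretely, the metric can be computed like this (a sketch, not the repo's exact code):

```python
import torch

def candidate_accuracy(scores: torch.Tensor, gold_idx: torch.Tensor) -> float:
    """scores: (N, 64) candidate scores; gold_idx: (N,) index of the gold candidate.

    A prediction counts as correct only when the gold candidate outscores
    every other candidate for that mention.
    """
    return (scores.argmax(dim=1) == gold_idx).float().mean().item()

# Tiny usage example with random scores.
acc = candidate_accuracy(torch.randn(100, 64), torch.randint(0, 64, (100,)))
```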
- The encoder is DeBERTa xsmall, chosen because the bi-encoder needs two encoders plus a large-ish batch size for in-batch negatives, all under restrictive compute resources.
- A gold candidate is assumed to always exist; otherwise, NIL mentions could be handled with a shared 'unknown' gold candidate. (It is also assumed that every mention has at least one candidate.)
- Trained with gradual unfreezing, gradient accumulation, and pseudo early stopping (saving a checkpoint whenever validation loss hits a new low); see the training-loop sketch after this list.
- Trained with in-batch negatives. In addition to hard negatives (the other top-64 candidates for that mention), the model is also trained to score each context against the gold candidates of the other mentions in the batch (assuming no overlap or duplicates, so the model never has to score two gold candidates against each other). Better explanation here; see the loss sketch at the end.
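A sketch of the training-loop mechanics described above, using a toy model so it runs standalone; the unfreezing schedule and accumulation factor are illustrative, not the repo's values:

```python
import torch
from torch import nn

# Toy stand-ins so the loop actually runs; the real model is the bi-encoder.
model = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 1))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
train_loader = [(torch.randn(4, 16), torch.randn(4, 1)) for _ in range(32)]
val_loader = [(torch.randn(4, 16), torch.randn(4, 1)) for _ in range(8)]
loss_fn = nn.MSELoss()

ACCUM_STEPS = 8                 # gradient accumulation factor (illustrative)
best_val_loss = float("inf")

for epoch in range(3):
    # Gradual unfreezing: start with only the top module trainable,
    # unfreeze one more module per epoch (schedule is illustrative).
    for i, module in enumerate(reversed(list(model))):
        for p in module.parameters():
            p.requires_grad = i <= epoch

    model.train()
    optimizer.zero_grad()
    for step, (x, y) in enumerate(train_loader):
        loss = loss_fn(model(x), y) / ACCUM_STEPS   # scale for accumulation
        loss.backward()
        if (step + 1) % ACCUM_STEPS == 0:
            optimizer.step()
            optimizer.zero_grad()

    # Pseudo early stopping: only keep the checkpoint with the lowest val loss.
    model.eval()
    with torch.no_grad():
        val_loss = sum(loss_fn(model(x), y).item() for x, y in val_loader) / len(val_loader)
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        torch.save(model.state_dict(), "best_checkpoint.pt")
```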
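And a sketch of the in-batch-negatives objective: gold candidates of the other mentions in the batch serve as extra negatives, and cross-entropy pushes each mention's own gold score above both them and its hard negatives (tensor shapes and layout here are assumptions):

```python
import torch
import torch.nn.functional as F

def in_batch_negatives_loss(ctx_embs, cand_embs, gold_embs):
    """ctx_embs: (B, H) context embeddings
    cand_embs: (B, K, H) hard-negative candidate embeddings per mention
               (the gold candidate is excluded here)
    gold_embs: (B, H) gold candidate embeddings
    """
    B = ctx_embs.size(0)
    # Scores against every gold candidate in the batch: (B, B).
    # Diagonal = own gold; off-diagonal = in-batch negatives.
    gold_scores = ctx_embs @ gold_embs.T
    # Scores against this mention's own hard negatives: (B, K).
    hard_scores = torch.einsum("bh,bkh->bk", ctx_embs, cand_embs)
    logits = torch.cat([gold_scores, hard_scores], dim=1)   # (B, B + K)
    labels = torch.arange(B, device=ctx_embs.device)        # gold sits in column b
    return F.cross_entropy(logits, labels)

# Tiny usage example with random embeddings (K = 63: the 64 candidates minus gold).
B, K, H = 4, 63, 32
loss = in_batch_negatives_loss(torch.randn(B, H), torch.randn(B, K, H), torch.randn(B, H))
```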