Group Assignment for the Course NLP Technology at VU University Amsterdam
A.M. Dobrzeniecka (Alicja), E.H.W. Galjaard (Ellemijn), F. den Heijer (Felix), S. Shen (Shuyi)
This project tackles the NLP task of Semantic Role Labeling (SRL) with rule-based, traditional machine learning, and neural approaches. Semantic roles involve identification of
What you will find in this project:
- A rule-based system for predicate and argument identification
- Traditional ML systems for predicate and argument identification, as well as argument classification
- A neural SRL system that can tackle SRL on its own
- tokens
- indices
- lemmas
- universal Part-of-Speech tags
- language-specific Part-of-Speech tags
- morphological features
- the ID of the head word
- the universal dependency relation to the head
- head words
- dependency relation:head ID pairs
- named entities
- word embeddings
- PoS-tag of the previous token
- PoS-tag of the next token
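
The last two items in the list above (PoS tag of the previous and next token) are context features derived from neighbouring tokens. A minimal sketch of how such features could be computed is shown here. The dictionary keys `token`, `upos`, `prev_pos` and `next_pos`, the function name `add_context_pos` and the padding symbol `_` are assumptions made for illustration and are not taken from the project code.

```python
from typing import Dict, List


def add_context_pos(tokens: List[Dict[str, str]], pad: str = "_") -> List[Dict[str, str]]:
    """Return copies of the token dicts with prev_pos / next_pos added.

    Hypothetical helper: sentence boundaries get the padding symbol.
    """
    enriched = []
    for i, tok in enumerate(tokens):
        feats = dict(tok)  # copy so the input is not mutated
        feats["prev_pos"] = tokens[i - 1]["upos"] if i > 0 else pad
        feats["next_pos"] = tokens[i + 1]["upos"] if i < len(tokens) - 1 else pad
        enriched.append(feats)
    return enriched


# Toy sentence with assumed universal PoS tags
sentence = [
    {"token": "She", "upos": "PRON"},
    {"token": "reads", "upos": "VERB"},
    {"token": "books", "upos": "NOUN"},
]
features = add_context_pos(sentence)
```

Feature dictionaries of this shape can typically be turned into sparse vectors before being passed to a classifier; whether the project does so is not stated here.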
- STEP 1. Install libraries specified in requirements.txt by running
pip install -r requirements.txt
- STEP 2. Run main.py with the proper arguments:
  - 'yes' to run rule-based identification or 'no' to run SVM-based identification
  - 'yes' to run on the mini version of the data or 'no' to run on the full data
  - 'yes' to use embeddings as a feature in the SVM model or 'no' to run without embeddings
  - the path to the embedding model, your_model_path (you can download the text model from https://wikipedia2vec.github.io/wikipedia2vec/pretrained/)
An example command would be:
python3 main.py no yes no
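
For readers who want to see how positional 'yes'/'no' arguments of this kind can be handled, the following is a minimal sketch using `argparse`. It is not the actual content of main.py: the argument names `rule_based`, `mini`, `use_embeddings` and `model_path` are hypothetical, and treating the model path as optional is an assumption based on the example command, which omits it.

```python
import argparse
from typing import List, Optional


def yes_no(value: str) -> bool:
    """Convert a 'yes'/'no' string to a boolean, rejecting anything else."""
    lowered = value.lower()
    if lowered not in ("yes", "no"):
        raise argparse.ArgumentTypeError(f"expected 'yes' or 'no', got {value!r}")
    return lowered == "yes"


def build_parser() -> argparse.ArgumentParser:
    # All argument names below are hypothetical, chosen for illustration only.
    parser = argparse.ArgumentParser(description="Sketch of an SRL command-line interface")
    parser.add_argument("rule_based", type=yes_no, help="'yes' = rule-based, 'no' = SVM-based")
    parser.add_argument("mini", type=yes_no, help="'yes' = mini data, 'no' = full data")
    parser.add_argument("use_embeddings", type=yes_no, help="'yes' = use embeddings as a feature")
    # Assumption: the path may be left out when embeddings are not used.
    parser.add_argument("model_path", nargs="?", default=None, help="path to the embedding model")
    return parser


def parse(argv: Optional[List[str]] = None) -> argparse.Namespace:
    return build_parser().parse_args(argv)


# Equivalent to the example command: SVM-based, mini data, no embeddings
args = parse(["no", "yes", "no"])
```

Converting the strings to booleans once, at parse time, keeps the rest of such a script free of repeated string comparisons.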
The code for the SRL system is stored in the same repository.
- STEP 1. Install libraries specified in requirements.txt by running
pip install -r requirements.txt
- STEP 2. Go to the srl_assignment_code/ folder and run srl_main.py