This repository is the Python implementation of our paper:
Annotating FrameNet via Structure-Conditioned Language Generation
Xinyue Cui, Swabha Swayamdipta
The 62nd Annual Meeting of the Association for Computational Linguistics, 2024
conda create -n framenet python=3.10.11
conda activate framenet
pip install -r requirements.txt
Request and download the FrameNet 1.7 dataset from the FrameNet website. Name the dataset folder fndata-1.7
and place it at the same directory level as the Python scripts.
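A quick way to confirm the folder is where the scripts expect it is sketched below. This is a minimal check, not part of the repository: `find_dataset` and `DATASET_DIRNAME` are hypothetical names, and the only thing verified is that a folder called fndata-1.7 exists in the given directory.

```python
from pathlib import Path

DATASET_DIRNAME = "fndata-1.7"  # folder name given in the instructions above


def find_dataset(script_dir: Path) -> Path:
    """Return the dataset folder inside script_dir, or raise if it is missing."""
    dataset_dir = script_dir / DATASET_DIRNAME
    if not dataset_dir.is_dir():
        raise FileNotFoundError(
            f"Expected FrameNet 1.7 data at {dataset_dir}; "
            "download it and place it next to the Python scripts."
        )
    return dataset_dir
```

Calling `find_dataset(Path.cwd())` from the directory that contains preprocess.py should return the dataset path, or raise a readable error if the folder is missing or misnamed.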
python preprocess.py
python train_test_split.py
Train a T5 model for conditional generation, then save the data generated by the T5 and GPT-4 models conditioned on different levels of semantic information:
python generation.py
Train a SpanBERT model for frame element (FE) type classification and use it to filter out generated FE spans whose FE types are inconsistent with those of the original spans:
python filter.py
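The filtering idea can be illustrated with a small sketch. It is not the code in filter.py: the record layout, `filter_consistent`, and `toy_classifier` are hypothetical, and the keyword rule merely stands in for a trained SpanBERT classifier, which would likely also use the surrounding sentence.

```python
from typing import Callable, Dict, List

# Hypothetical record layout: each generated example stores the FE span text
# and the FE type of the original span it replaced.
Example = Dict[str, str]


def filter_consistent(
    examples: List[Example],
    predict_fe_type: Callable[[str], str],
) -> List[Example]:
    """Keep examples whose predicted FE type matches the original FE type."""
    return [ex for ex in examples if predict_fe_type(ex["span"]) == ex["fe_type"]]


def toy_classifier(span: str) -> str:
    """Stand-in for a trained classifier; a keyword rule for illustration only."""
    return "Time" if span.startswith("on ") else "Agent"


generated = [
    {"span": "the committee", "fe_type": "Agent"},
    {"span": "on Tuesday", "fe_type": "Agent"},  # predicted type differs, so it is dropped
]
kept = filter_consistent(generated, toy_classifier)
```

Here `kept` holds only the first example, because the second span is predicted as a different FE type than the one it was generated to fill.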
Train a SpanBERT model for SRL parsing and evaluate its performance when trained on unaugmented and augmented data:
python srl_parser.py
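For convenience, the steps above could be chained from a single script. The sketch below assumes that each script runs without arguments, as shown in this README, and that the stages are meant to run in the order they are listed; `PIPELINE` and `run_pipeline` are hypothetical names and not part of the repository.

```python
import subprocess
import sys
from typing import List

# Script names as listed in this README, in the order they are introduced.
PIPELINE = [
    "preprocess.py",
    "train_test_split.py",
    "generation.py",
    "filter.py",
    "srl_parser.py",
]


def run_pipeline(dry_run: bool = True) -> List[List[str]]:
    """Build one command per stage and run them unless dry_run is set."""
    commands = [[sys.executable, script] for script in PIPELINE]
    if not dry_run:
        for cmd in commands:
            # check=True stops at the first failing stage.
            subprocess.run(cmd, check=True)
    return commands
```

With the default `dry_run=True` the function only returns the commands, which makes it easy to inspect the order before starting the training stages with `run_pipeline(dry_run=False)`.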