FairytaleQA Translated: Enabling Educational Question and Answer Generation in Less-Resourced Languages
===============
Sample source code, data and models for our ECTEL 2024 accepted paper (preprint): FairytaleQA Translated: Enabling Educational Question and Answer Generation in Less-Resourced Languages
Abstract: Question Answering (QA) datasets are crucial in assessing reading comprehension skills for both machines and humans. While numerous datasets have been developed in English for this purpose, a noticeable void exists in less-resourced languages. To alleviate this gap, our paper introduces machine-translated versions of FairytaleQA, a renowned QA dataset designed to assess and enhance narrative comprehension skills in young children. By employing fine-tuned, modest-scale models, we establish benchmarks for both Question Generation (QG) and QA tasks within the translated datasets. In addition, we present a case study proposing a model for generating question-answer pairs, with an evaluation incorporating quality metrics such as question well-formedness, answerability, relevance, and children suitability. Our evaluation prioritizes quantifying and describing error cases, along with providing directions for future work. This paper contributes to the advancement of QA and QG research in less-resourced languages, promoting accessibility and inclusivity in the development of these models for reading comprehension.
Authors: Bernardo Leite, Tomás Freitas Osório, Henrique Lopes Cardoso
This repository includes:
- Machine-translated data
- Training, inference and evaluation scripts for Question Answering (QA) and Question Generation (QG)
- Fine-tuned models for QA and QG
Here you can find the machine-translated versions of FairytaleQA:
We have also included machine-translated datasets for Italian and Romanian, although they were not studied in this research:
Here you can find the fine-tuned models for Question Answering (QA):
Here you can find the fine-tuned models for Question Generation (QG):
Python 3 (tested with version 3.8.5 on Ubuntu 20.04.1 LTS)
- Clone this project:
git clone https://github.com/bernardoleite/fairytaleqa-translated
- Install the Python packages from requirements.txt. If you are using a virtual environment for Python package management, you can install all required packages with the following bash commands:

```bash
cd fairytaleqa-translated/
pip install -r requirements.txt
```
You can use this code for data preparation, training, inference/predicting and evaluation.
You can download the datasets from the links above (see Machine-Translated Data) and place them in the `data` folder.
- Go to `src/model`. The file `train.py` is responsible for the training routine. Type the following command to read the description of the parameters:

```bash
python train.py -h
```

- You can also run the example training script `train_script.sh` (Linux and macOS):

```bash
bash train_script.sh
```
The previous script will start the training routine with predefined parameters:
```bash
#!/usr/bin/env bash
python train.py \
 --language "ptpt" \
 --dir_model_name "qg_ptpt_ptt5_base_answer-text_question_seed_45_exp" \
 --model_name "unicamp-dl/ptt5-base-portuguese-vocab" \
 --tokenizer_name "unicamp-dl/ptt5-base-portuguese-vocab" \
 --train_path "../../data/FairytaleQA_Dataset/processed_gen_v2_ptpt/train.json" \
 --val_path "../../data/FairytaleQA_Dataset/processed_gen_v2_ptpt/val.json" \
 --test_path "../../data/FairytaleQA_Dataset/processed_gen_v2_ptpt/test.json" \
 --max_len_input 512 \
 --max_len_output 128 \
 --encoder_info "answer_text" \
 --decoder_info "question" \
 --max_epochs 1 \
 --batch_size 16 \
 --patience 2 \
 --optimizer "AdamW" \
 --learning_rate 0.0001 \
 --epsilon 0.000001 \
 --num_gpus 1 \
 --seed_value 45
```
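The flags used above are defined inside `train.py`. As a rough illustration (not the repository's actual code), a few of them might be declared with `argparse` as follows; the names mirror the example script above, and the defaults are assumptions:

```python
import argparse

# Hypothetical sketch of a subset of train.py's command-line interface.
# Flag names come from the example script; defaults are assumptions.
parser = argparse.ArgumentParser(description="Fine-tune a T5-style model for QA/QG.")
parser.add_argument("--language", type=str, default="ptpt", help="Dataset language code.")
parser.add_argument("--model_name", type=str, default="unicamp-dl/ptt5-base-portuguese-vocab")
parser.add_argument("--encoder_info", type=str, choices=["answer_text", "question_text"])
parser.add_argument("--decoder_info", type=str, choices=["question", "answer"])
parser.add_argument("--max_epochs", type=int, default=1)
parser.add_argument("--learning_rate", type=float, default=0.0001)
parser.add_argument("--seed_value", type=int, default=45)

args = parser.parse_args(["--encoder_info", "answer_text", "--decoder_info", "question"])
```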
- In the end, model checkpoints will be available at `checkpoints/checkpoint-name`.
Note: You can change the `encoder_info` parameter as follows:
- `answer_text`: encodes answer + text
- `question_text`: encodes question + text

You can change the `decoder_info` parameter as follows:
- `question`: decodes question
- `answer`: decodes answer
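How these settings translate into model inputs is internal to the training code. As a hedged sketch, an `answer_text` encoder input paired with a `question` decoder target could be assembled as follows; the `answer:`/`text:` prefixes are illustrative assumptions, not the repository's actual format:

```python
def build_seq2seq_pair(answer: str, text: str, question: str):
    # Assumed format: the encoder sees the answer followed by the story text
    # (encoder_info="answer_text"); the decoder target is the question
    # (decoder_info="question"). The prefix tokens are illustrative only.
    encoder_input = f"answer: {answer} text: {text}"
    decoder_target = question
    return encoder_input, decoder_target

src, tgt = build_seq2seq_pair(
    "a glass slipper",
    "Cinderella lost a glass slipper at the ball.",
    "What did Cinderella lose at the ball?",
)
```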
- Go to `src/model`. The script file `inference_script.sh` is an example of the inference routine.
- Important note: in `inference_script.sh` (the `checkpoint_model_path` parameter), replace XX and YY according to the epoch number and loss. After inference, predictions will be saved under the `predictions` folder.
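Instead of editing XX and YY by hand, a small helper can pick the newest checkpoint automatically. This is a sketch that assumes checkpoint files carry a `.ckpt` suffix (as in PyTorch Lightning); adjust it to the actual file naming under `checkpoints/`:

```python
from pathlib import Path

def find_latest_checkpoint(checkpoint_dir: str) -> str:
    # Assumption: checkpoint files use a ".ckpt" suffix. Pick the most
    # recently modified one rather than hard-coding epoch/loss values.
    ckpts = sorted(Path(checkpoint_dir).glob("*.ckpt"), key=lambda p: p.stat().st_mtime)
    if not ckpts:
        raise FileNotFoundError(f"no .ckpt files found in {checkpoint_dir}")
    return str(ckpts[-1])
```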
See this example for QG in Portuguese (under Load Model and Tokenizer). You can use any of the fine-tuned models listed above.
- For QG evaluation, you first need to install/configure Rouge.
- Go to the `src/eval-qg.py` file.
- See the `preds_path` list and choose (remove or add) additional predictions.
- Run `src/eval-qg.py` to compute the evaluation scores.
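The evaluation scripts rely on an external Rouge package. Purely to illustrate what ROUGE-L measures (an F1 score over the longest common subsequence of tokens), here is a minimal self-contained sketch, not the implementation the scripts use:

```python
def rouge_l_f1(reference: str, candidate: str) -> float:
    # ROUGE-L: F1 over the longest common subsequence (LCS) of the two
    # whitespace-tokenized streams. Real packages add stemming, multi-
    # reference handling, etc.; this is the bare metric for illustration.
    ref, cand = reference.split(), candidate.split()
    # Dynamic-programming table for LCS length.
    dp = [[0] * (len(cand) + 1) for _ in range(len(ref) + 1)]
    for i, r in enumerate(ref):
        for j, c in enumerate(cand):
            dp[i + 1][j + 1] = dp[i][j] + 1 if r == c else max(dp[i][j + 1], dp[i + 1][j])
    lcs = dp[-1][-1]
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```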
- For QA evaluation, you first need to install/configure Rouge.
- Go to the `src/eval-qa.py` file.
- See the `preds_path` list and choose (remove or add) additional predictions.
- Run `src/eval-qa.py` to compute the evaluation scores.
To ask questions, report issues or request features, please use the GitHub Issue Tracker.
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks in advance!
- Fork the Project
- Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
- Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the Branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This code is released under the MIT license. For details, please see the file LICENSE in the root directory. Please refer to machine-translated data and fine-tuned models links for their licenses.
This codebase builds on a previous implementation.
If you use this software in your work, please kindly cite our research.
Our paper (preprint - accepted for publication at ECTEL 2024):
```bibtex
@article{leite_fairytaleqa_translated_2024,
  title={FairytaleQA Translated: Enabling Educational Question and Answer Generation in Less-Resourced Languages},
  author={Bernardo Leite and Tomás Freitas Osório and Henrique Lopes Cardoso},
  year={2024},
  eprint={2406.04233},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```
Original FairytaleQA paper:
```bibtex
@inproceedings{xu-etal-2022-fantastic,
    title = "Fantastic Questions and Where to Find Them: {F}airytale{QA} {--} An Authentic Dataset for Narrative Comprehension",
    author = "Xu, Ying and
      Wang, Dakuo and
      Yu, Mo and
      Ritchie, Daniel and
      Yao, Bingsheng and
      Wu, Tongshuang and
      Zhang, Zheng and
      Li, Toby and
      Bradford, Nora and
      Sun, Branda and
      Hoang, Tran and
      Sang, Yisi and
      Hou, Yufang and
      Ma, Xiaojuan and
      Yang, Diyi and
      Peng, Nanyun and
      Yu, Zhou and
      Warschauer, Mark",
    editor = "Muresan, Smaranda and
      Nakov, Preslav and
      Villavicencio, Aline",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-long.34",
    doi = "10.18653/v1/2022.acl-long.34",
    pages = "447--460"
}
```
T5 model:
```bibtex
@article{raffel_2020_t5,
  author  = {Colin Raffel and Noam Shazeer and Adam Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu},
  title   = {Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer},
  journal = {Journal of Machine Learning Research},
  year    = {2020},
  volume  = {21},
  number  = {140},
  pages   = {1-67},
  url     = {http://jmlr.org/papers/v21/20-074.html},
  note    = {Model URL: \url{huggingface.co/google-t5/t5-base}}
}
```
- Bernardo Leite, bernardo.leite@fe.up.pt
- Tomás Freitas Osório, tomas.s.osorio@gmail.com
- Henrique Lopes Cardoso, hlc@fe.up.pt