We use the chaii dataset together with the externally available MLQA and XQUAD datasets. The MLQA and XQUAD datasets need to be preprocessed before use: run 'preprocessing/mlqa-xquad-preprocessing-and-eda.ipynb' to generate two CSV files, 'mlqa_hindi.csv' and 'xquad.csv'. Download the chaii dataset from 'https://www.kaggle.com/c/chaii-hindi-and-tamil-question-answering/data' by pressing the Download All button, then unzip the downloaded file; the extracted folder is the chaii dataset.
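A minimal sketch for sanity-checking the generated files, assuming the preprocessing notebook wrote the CSVs to the working directory and that the chaii folder keeps the competition's name; the column names printed are whatever each file actually contains.

```python
import pandas as pd

# Paths are assumptions based on the setup steps above.
mlqa = pd.read_csv("mlqa_hindi.csv")
xquad = pd.read_csv("xquad.csv")
chaii = pd.read_csv("../input/chaii-hindi-and-tamil-question-answering/train.csv")

# Print shape and columns of each frame to confirm the files loaded correctly.
for name, df in [("mlqa", mlqa), ("xquad", xquad), ("chaii", chaii)]:
    print(name, df.shape, list(df.columns))
```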
Place the chaii-dataset folder in '../input/'. Download the pre-trained XLM-RoBERTa model from 'https://www.kaggle.com/nbroad/xlm-roberta-squad2' and store it in '../input/'. Then simply run the notebook.
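A minimal sketch of loading the pre-trained checkpoint from the local input directory; the folder name 'xlm-roberta-squad2' is an assumption matching the downloaded Kaggle dataset.

```python
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

# Load tokenizer and QA head from the locally stored checkpoint.
model_path = "../input/xlm-roberta-squad2"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForQuestionAnswering.from_pretrained(model_path)
```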
Place the chaii-dataset folder in '../input/'. Place 'mlqa_hindi.csv' and 'xquad.csv' in '../input/mlqa-hindi-processed/'. Then simply run the notebook.
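A minimal sketch of assembling the combined training set from the three sources, assuming all three files share the chaii column layout (context / question / answer_text / answer_start / language):

```python
import pandas as pd

chaii = pd.read_csv("../input/chaii-hindi-and-tamil-question-answering/train.csv")
mlqa = pd.read_csv("../input/mlqa-hindi-processed/mlqa_hindi.csv")
xquad = pd.read_csv("../input/mlqa-hindi-processed/xquad.csv")

# Stack the external data under the competition data for fine-tuning.
train_df = pd.concat([chaii, mlqa, xquad], ignore_index=True)
print(train_df.shape)
```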
Complete the fine-tuning step above to produce the trained model. Download the saved model from the fine-tuning run, place it in '../input/', and run the notebook.
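A minimal sketch of pointing the inference notebook at the saved weights, assuming the fine-tuning notebook saved both the tokenizer and the model; the directory name 'chaii-xlmr-finetuned' is a hypothetical placeholder for wherever you unpacked the saved model. The same pattern applies to the MuRIL and RemBERT inference steps below.

```python
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

ckpt = "../input/chaii-xlmr-finetuned"  # hypothetical folder name
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForQuestionAnswering.from_pretrained(ckpt).eval()
```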
Place the chaii-dataset folder in '../input/'. Place 'mlqa_hindi.csv' and 'xquad.csv' in '../input/mlqa-hindi-processed/'. Download the pre-trained MuRIL Large model from 'https://www.kaggle.com/nbroad/muril-large-pt' and place it in '../input/'. Then simply run the notebook.
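A minimal sketch of loading MuRIL Large from the local path and sanity-checking that its vocabulary covers the target languages; the folder name 'muril-large-pt' is assumed from the download step.

```python
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

model_path = "../input/muril-large-pt"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForQuestionAnswering.from_pretrained(model_path)

# A Hindi sample should tokenize into subwords, not [UNK] tokens.
print(tokenizer.tokenize("भारत की राजधानी क्या है?"))
```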
Complete the fine-tuning step above to produce the trained model. Download the saved MuRIL model from the fine-tuning run, place it in '../input/', and run the notebook.
Place the chaii-dataset folder in '../input/'. Place 'mlqa_hindi.csv' and 'xquad.csv' in '../input/mlqa-hindi-processed/'. Download the pre-trained RemBERT model from 'https://www.kaggle.com/nbroad/rembert-pt' and place it in '../input/'. Then simply run the notebook.
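A minimal sketch of loading RemBERT from the local path; the folder name 'rembert-pt' is assumed from the download step. The parameter count is printed only as a quick check that the large checkpoint loaded in full.

```python
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

model_path = "../input/rembert-pt"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForQuestionAnswering.from_pretrained(model_path)
print(sum(p.numel() for p in model.parameters()))
```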
Complete the fine-tuning step above to produce the trained model. Download the saved RemBERT model from the fine-tuning run, place it in '../input/', and run the notebook.