This repository contains the code to reproduce the results of the paper "Applying Model-agnostic Methods to Handle Inherent Noise in Large Scale Text Classification", accepted at COLING 2020.
- Python 3.7
- pandas
- NumPy
- Keras
- scikit-learn
- pickle (part of the Python standard library)
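The requirements listed above can be checked before running the experiments. The snippet below is a minimal sketch, not part of the repository: the helper `missing_modules` and the list `REQUIRED_MODULES` are hypothetical names, and the import names (`sklearn` for scikit-learn, `keras` for Keras) are assumptions about how the packages are installed in your environment.

```python
import importlib.util
import sys
from typing import List

# Assumed import names for the packages listed in the requirements.
REQUIRED_MODULES = ["pandas", "numpy", "keras", "sklearn", "pickle"]


def missing_modules(names: List[str]) -> List[str]:
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The requirements name Python 3.7; other versions are untested here.
    if sys.version_info[:2] != (3, 7):
        print("Note: Python 3.7 is listed; running", sys.version.split()[0])
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Any module reported as missing would need to be installed before training. Keras may additionally need a backend such as TensorFlow, which the requirements do not mention.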
git clone https://github.com/tayalkshitij/model-agnostic-methods.git
cd model-agnostic-methods/
Train main model:
python code/main_experiment/main.py <path_to_dataset>
Train noise model:
python code/noise_experiment/noise_main.py <path_to_dataset>
Google Drive links for the datasets are as follows:
Automotive Dataset Link. Beauty Dataset Link. Electronics Dataset Link.
GloVe Link.
If you have any questions, please contact the author: Kshitij Tayal (tayal007@umn.edu)
See the LICENSE file for more details.
When using the dataset or code, please cite our paper:
@article{tayalmodel,
title={Model-agnostic Methods for Text Classification with Inherent Noise},
author={Tayal, Kshitij and Ghosh, Rahul and Kumar, Vipin},
journal={The 28th International Conference on Computational Linguistics},
year={2020}
}
The codebase is based on D2L and edufonseca. Both are great repositories - have a look!