python version = 3.6.12
pip install -r requirements.txt
The training corpora come from Task1-Marking_Task in the NUS-MOOC-Transacts-Corpus.
Preprocessing and feature engineering over the corpora take a long time (nearly 2 hours), so we save the resulting feature vectors to files. Later experiments, such as hyper-parameter tuning, ablation testing, and model performance verification, then only need to load these files.
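The save-then-load idea described above can be sketched as follows. This is only an illustration, not the code used in this repository: the helper name `load_or_build`, the use of `pickle`, and the cache path are all assumptions.

```python
import pickle
from pathlib import Path


def load_or_build(cache_path, build_fn):
    """Return cached features if the file exists; otherwise build and save them.

    `cache_path` and `build_fn` are hypothetical names used for illustration.
    """
    path = Path(cache_path)
    if path.exists():
        # Fast path: reuse the vectors computed in an earlier run.
        with path.open("rb") as f:
            return pickle.load(f)
    # Slow path: run the expensive preprocessing / feature engineering once.
    features = build_fn()
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(features, f)
    return features
```

With a helper like this, only the first call pays the feature engineering cost; later calls with the same path simply read the file.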
- python main.py --feature tf_idf --dataset unprocessed
- python main.py --feature tf_idf --dataset processed
- python main.py --feature word2vec --dataset unprocessed
- python main.py --feature word2vec --dataset processed
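For readers who want to see how the two flags used in the commands above could be handled, here is a minimal sketch using `argparse`. It assumes the only accepted values are the ones shown in those commands; the actual `main.py` may parse its arguments differently, and `parse_args` is a hypothetical helper name.

```python
import argparse


def parse_args(argv=None):
    """Parse the --feature and --dataset flags (illustrative only)."""
    parser = argparse.ArgumentParser(description="Train and evaluate a model.")
    # Choices are assumed from the four commands listed in this README.
    parser.add_argument("--feature", choices=["tf_idf", "word2vec"], required=True)
    parser.add_argument("--dataset", choices=["unprocessed", "processed"], required=True)
    return parser.parse_args(argv)


# Example: equivalent to `python main.py --feature tf_idf --dataset processed`
args = parse_args(["--feature", "tf_idf", "--dataset", "processed"])
print(args.feature, args.dataset)
```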
- f1 score on test data = 0.88
- accuracy score on test data = 0.81
- precision score on test data = 0.85
- recall score on test data = 0.92
- f2 score on test data = 0.91
- f1 score on test data = 0.88
- accuracy score on test data = 0.82
- precision score on test data = 0.85
- recall score on test data = 0.92
- f2 score on test data = 0.90
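As a reminder of how the f1 and f2 scores relate to precision and recall, the sketch below applies the standard F-beta formula, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R). The function name `f_beta` is an illustrative choice, and the inputs are the rounded precision and recall reported above, so the recomputed values can differ from the reported scores in the last digit.

```python
def f_beta(precision, recall, beta=1.0):
    """Standard F-beta score: beta > 1 weights recall more than precision."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Using the rounded values reported above (precision 0.85, recall 0.92):
print(round(f_beta(0.85, 0.92, beta=1), 2))  # 0.88
print(round(f_beta(0.85, 0.92, beta=2), 2))  # 0.91
```

Because f2 weights recall more heavily than precision, it sits closer to the recall value than f1 does.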