This was my fifth project for our Information Retrieval Course.
It is a comprehensive comparison of several algorithms used in information retrieval systems: the Naïve Bayes classifier, Support Vector Machine (SVM), FastText, and Latent Semantic Analysis (LSA).
The results are as follows:

Metric | Naïve Bayes | SVM | SVM + LSA |
---|---|---|---|
Precision | 0.56761 | 0.85074 | 0.84905 |
Recall | 0.76176 | 0.85544 | 0.85496 |
F1-Score | 0.65051 | 0.85308 | 0.85199 |
Accuracy | 0.61912 | 0.85268 | 0.85148 |

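
As a quick sanity check on the numbers above, the F1-Score should be the harmonic mean of precision and recall. The short sketch below recomputes it from the reported precision and recall values; the helper name `f1_score` is my own for illustration and is not taken from the project code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the results above.
reported = {
    "Naïve Bayes": (0.56761, 0.76176, 0.65051),
    "SVM": (0.85074, 0.85544, 0.85308),
    "SVM + LSA": (0.84905, 0.85496, 0.85199),
}

for model, (p, r, f1) in reported.items():
    print(f"{model}: computed {f1_score(p, r):.5f}, reported {f1:.5f}")
```

For all three models the recomputed value agrees with the reported F1-Score to five decimal places.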
In conclusion, our dataset is well written but its classes are not easy to tell apart. Applying Latent Semantic Analysis at the dimensionality we chose did not improve the SVM results (every metric was marginally lower), and the Naïve Bayes classifier was confused by the data regardless.
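
For readers who want to see what the three setups look like in code, here is a minimal sketch using scikit-learn. It is not the project's actual code: the library choice, the TF-IDF features, the toy corpus, and `n_components=2` are all assumptions made for illustration, since the real dataset and LSA dimension are not given here.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy corpus; the project's real dataset is not shown here.
texts = [
    "the striker scored a late goal",
    "the goalkeeper saved the penalty",
    "the team won the football match",
    "the coach praised the defence",
    "the new phone has a fast processor",
    "the laptop ships with more memory",
    "the software update fixes a bug",
    "the chip maker announced a new processor",
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = sport, 1 = technology

models = {
    "Naive Bayes": make_pipeline(TfidfVectorizer(), MultinomialNB()),
    "SVM": make_pipeline(TfidfVectorizer(), LinearSVC()),
    # LSA here is a truncated SVD of the TF-IDF matrix. n_components=2 is an
    # arbitrary choice for this tiny corpus, not the project's setting.
    "SVM + LSA": make_pipeline(
        TfidfVectorizer(),
        TruncatedSVD(n_components=2, random_state=0),
        LinearSVC(),
    ),
}

# Fit each pipeline on the toy corpus and classify one unseen sentence.
predictions = {}
for name, model in models.items():
    model.fit(texts, labels)
    predictions[name] = model.predict(["the striker missed the goal"])
    print(name, predictions[name])
```

In this sketch LSA is only paired with the SVM because `MultinomialNB` expects non-negative features, while the components produced by `TruncatedSVD` can be negative.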