A demonstration of the work done for my MSc Computer Science Dissertation. The paper was published in the conference proceedings of the Irish Conference on Artificial Intelligence and Cognitive Science (AICS).
The work looked at predicting the outcome of judicial decisions made by the European Court of Human Rights (ECHR). This was done using Natural Language Processing (NLP) techniques and machine learning.
Publication: here
Full dissertation: dissertation.pdf
Simplified code demonstration: 0 ECHR prediction demo
The full project can be found in the src folder; its files are described below:
File | Purpose |
---|---|
functions | Functions used throughout the project to help with data cleaning and model fitting |
obtain_data | Obtain raw ECHR data using the API |
preclean_data | Extract HTML case text and case metadata from raw ECHR data |
clean_dataset | Clean HTML case text so it is in a format ready to create model features |
clean_attributes | Clean case metadata so it is in a format ready to create model features |
create_embeddings | Create echr2vec word embeddings from ECHR documents. These are novel word embeddings trained with the word2vec algorithm. |
create_features | Create N-gram, average word embedding and paragraph embedding feature matrices |
fit_autoML | Train models using the AutoML framework. These are the models presented in the final framework. |
fit_SVM | Train SVM (initial model prototype) |
fit_CNN | Train CNN (not in final paper) |
data_visualisations | Create various visualisations of the dataset |
results_visualisations | Visualise the model results |
archive (folder) | Contains all old/test code |
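
To give a feel for what the feature step (create_features) produces, here is a minimal, standard-library-only sketch of two of the feature types listed above: N-gram counts and an average word embedding per document. It is not the code from src; the function names `ngram_counts` and `average_embedding` are hypothetical, and the 2-dimensional `toy_vectors` are invented for illustration and stand in for real echr2vec vectors.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count contiguous n-grams in a list of tokens."""
    # zip over n shifted copies of the token list yields each n-gram once.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


def average_embedding(tokens, vectors, dim):
    """Average the vectors of in-vocabulary tokens; return zeros if none match."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    # zip(*known) groups the vectors component by component.
    return [sum(component) / len(known) for component in zip(*known)]


# Toy vectors, made up for this example (assumption, not real embeddings).
toy_vectors = {
    "court": [1.0, 0.0],
    "violation": [0.0, 1.0],
}

tokens = "the court found a violation".split()
bigrams = ngram_counts(tokens, 2)
doc_vector = average_embedding(tokens, toy_vectors, dim=2)
```

In this sketch, words missing from the vocabulary are simply skipped when averaging, which is one common choice; the project itself may handle them differently.
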
The dataset was accessed using an API provided by vizlegal. If you are interested in using the ECHR docset for academic purposes, feel free to contact them.