conorosully/legal-case-prediction

Using Machine Learning to Predict Judicial Decisions

This repository demonstrates the work done for my MSc Computer Science dissertation. The resulting paper was published in the proceedings of the Irish Conference on Artificial Intelligence and Cognitive Science (AICS).

The work investigated predicting the outcome of judicial decisions made by the European Court of Human Rights (ECHR), using Natural Language Processing (NLP) techniques and machine learning.

Publication: here
Full dissertation: dissertation.pdf

Simplified code demonstration: 0 ECHR prediction demo
The full project can be found in the src folder. Its files are described below:

| File | Purpose |
| --- | --- |
| functions | Functions used throughout the project to help with data cleaning and model fitting |
| obtain_data | Obtain raw ECHR data using the API |
| preclean_data | Extract HTML case text and case metadata from raw ECHR data |
| clean_dataset | Clean HTML case text so it is in a format ready to create model features |
| clean_attributes | Clean case metadata so it is in a format ready to create model features |
| create_embeddings | Create echr2vec word embeddings from ECHR documents. This is a novel word embedding created using the word2vec algorithm. |
| create_features | Create N-gram, average word embedding and paragraph embedding feature matrices |
| fit_autoML | Train models using the AutoML framework. These are the models presented in the final framework. |
| fit_SVM | Train SVM (initial model prototype) |
| fit_CNN | Train CNN (not in final paper) |
| data_visualisations | Create various visualisations of the dataset |
| results_visualisations | Visualise the model results |
| archive (folder) | Contains all old/test code |
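
To give a rough idea of two of the feature types that create_features builds, here is a minimal sketch using only the Python standard library. It is not the code from the src folder: the helper names (`tokenize`, `ngram_counts`, `average_embedding`) and the toy 2-dimensional vectors are assumptions made up for illustration, and the real project uses echr2vec embeddings trained on ECHR documents rather than hand-written vectors.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for the real cleaning steps)."""
    return text.lower().split()


def ngram_counts(tokens, n_max=2):
    """Count every contiguous n-gram for n = 1..n_max."""
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def average_embedding(tokens, vectors, dim):
    """Average the vectors of in-vocabulary tokens; return zeros if none match."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[k] for v in known) / len(known) for k in range(dim)]


# Hypothetical toy vectors, not real echr2vec values.
toy_vectors = {"court": [1.0, 0.0], "violation": [0.0, 1.0]}

tokens = tokenize("The court found a violation")
features_ngram = ngram_counts(tokens, n_max=2)                 # unigram and bigram counts
features_avg = average_embedding(tokens, toy_vectors, dim=2)   # one vector per document
```

In this sketch each document becomes either a sparse bag of N-gram counts or a single dense vector, and either representation can then be passed to a classifier such as the SVM prototype.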

The dataset was accessed using an API provided by vizlegal. If you are interested in using the ECHR docset for academic purposes, feel free to contact them.
