A natural language processing pipeline to evaluate antithrombotic use in multi-hospital atrial fibrillation cohort

This repository provides a natural language processing (NLP)-based analysis pipeline to evaluate antithrombotic use in individuals with atrial fibrillation, using discharge summaries from hospital electronic health records.

The pipeline builds on existing open-source NLP software, specifically CogStack for document storage and MedCAT for document annotation.

The code is designed to be run within a Docker container that can be built from the Dockerfile provided. The infrastructure prerequisite for running this code is a server with Docker and CogStack installed. Target discharge summaries must then be ingested into the CogStack instance, and the config.py file updated with the relevant Elasticsearch index.
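
As a rough illustration of the configuration step above, the sketch below shows one way the Elasticsearch settings could be kept out of the repository and read from environment variables. The actual contents of config.py are not published, so every name here (`CogStackConfig`, `load_config`, the `COGSTACK_ES_*` variables and their default values) is an assumption made for illustration only.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class CogStackConfig:
    """Hypothetical settings; the real config.py may use different names."""
    es_host: str
    es_index: str
    es_user: str
    es_password: str


def load_config(env=os.environ) -> CogStackConfig:
    # Read connection details from the environment so that credentials
    # never need to be committed. Variable names and defaults are assumed.
    return CogStackConfig(
        es_host=env.get("COGSTACK_ES_HOST", "https://localhost:9200"),
        es_index=env.get("COGSTACK_ES_INDEX", "discharge_summaries"),
        es_user=env.get("COGSTACK_ES_USER", ""),
        es_password=env.get("COGSTACK_ES_PASSWORD", ""),
    )
```

Passing a plain dictionary instead of `os.environ` (for example `load_config({"COGSTACK_ES_INDEX": "af_discharge"})`) makes it easy to check the settings without touching the real environment.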

Metadata for the discharge summaries should also be available and mapped to the following fields in CogStack:

"clinicalnotekey": unique document ID
"patientprimarymrn": unique patient ID for document
"encounterdate": date document was created
"gender": recorded gender of the patient
"date_of_birth": recorded date of birth of the patient
"notetext": free text from the document. This may need to be pre-processed using a service such as Apache Tika if documents are stored as PDFs.
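
The field mapping above can be checked before running the pipeline. The following is a minimal sketch, not the repository's own code: it lists the six field names given above, reports which ones are missing from a document, and builds an Elasticsearch query body restricted to those fields. The helper names (`missing_fields`, `build_search_body`) and the date-range filter on "encounterdate" are assumptions for illustration.

```python
# Field names taken from the mapping listed above.
REQUIRED_FIELDS = (
    "clinicalnotekey",
    "patientprimarymrn",
    "encounterdate",
    "gender",
    "date_of_birth",
    "notetext",
)


def missing_fields(document: dict) -> list:
    """Return the required fields that are absent or empty in one document."""
    return [f for f in REQUIRED_FIELDS if document.get(f) in (None, "")]


def build_search_body(start_date: str, end_date: str, size: int = 100) -> dict:
    """Build a query body that returns only the required fields
    for documents created within a date range (hypothetical filter)."""
    return {
        "size": size,
        "_source": list(REQUIRED_FIELDS),
        "query": {
            "range": {"encounterdate": {"gte": start_date, "lte": end_date}}
        },
    }
```

The dictionary returned by `build_search_body` is intended to be passed to an Elasticsearch search request against the index named in config.py; a document for which `missing_fields` returns a non-empty list would need its metadata mapping corrected first.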

All other data (e.g. risk scores, medication summaries) will be created and summarised automatically from document annotations.
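
To make the idea of deriving a risk score from annotations concrete, here is a hedged sketch that computes a CHA2DS2-VASc stroke risk score from a set of condition labels plus the "date_of_birth", "encounterdate" and "gender" metadata. This README does not say which risk scores the pipeline calculates or how, so the choice of CHA2DS2-VASc, the condition label names, and both function names are assumptions; in practice the labels would come from mapping MedCAT annotations to conditions.

```python
from datetime import date

# Points per condition in the standard CHA2DS2-VASc score.
# The label names are hypothetical stand-ins for annotated concepts.
CONDITION_POINTS = {
    "congestive_heart_failure": 1,
    "hypertension": 1,
    "diabetes": 1,
    "stroke_tia_thromboembolism": 2,
    "vascular_disease": 1,
}


def age_on(date_of_birth: date, encounter_date: date) -> int:
    """Age in whole years on the encounter date."""
    had_birthday = (encounter_date.month, encounter_date.day) >= (
        date_of_birth.month,
        date_of_birth.day,
    )
    return encounter_date.year - date_of_birth.year - (0 if had_birthday else 1)


def cha2ds2_vasc(conditions: set, age: int, gender: str) -> int:
    """Sum condition, age and sex-category points."""
    score = sum(pts for name, pts in CONDITION_POINTS.items() if name in conditions)
    if age >= 75:
        score += 2  # age 75 or over
    elif age >= 65:
        score += 1  # age 65 to 74
    if gender.strip().lower() in ("female", "f"):
        score += 1  # sex category
    return score
```

For example, a female patient aged 79 at the encounter with annotated hypertension and diabetes would score 1 + 1 + 2 + 1 = 5 under this sketch.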

Sensitive, NHS Trust-specific implementation details (e.g. passwords, server configurations, pre-trained annotation models) are not included in this repository, so the code will not work out of the box. Please get in touch with a.handy@ucl.ac.uk if you would like to install this pipeline at your hospital.

Repository contents:

pipeline
.DS_Store
Dockerfile
README.md
create_validation_sample.py
requirements.txt
run_pipeline.py
test_validation_sample.py
