HeartDetect 💘

An Analytical Model For Early Intervention Of Heart Disease, implemented in 2 stages

Docs

Jupyter notebooks

Executive Summary

This report applies data analytics to a business problem facing the National Heart Centre Singapore (NHCS). With the reported incidence of cardiovascular disease (CVD) in Singapore rising, NHCS handles more than 120,000 outpatient consultations each year. Sudden-onset heart disease is severe and expensive to treat. NHCS can therefore shift its focus from post-diagnosis treatment to early prevention.

To increase the involvement of individuals and the primary care sector in preventing heart disease, our team proposes a two-stage solution, HeartDetect.

The first stage raises individuals' awareness and helps them manage their heart health regularly.
The second stage enables heart disease risk prediction in the primary care sector so that prevention is timely.

Getting Started

1. Clone a copy of this repository

Open your terminal and run:

git clone https://github.com/xJQx/bc2406-project.git

2. Understanding the Jupyter notebook flow

Data Cleaning and Pre-processing
a) data-cleaning-preprocessing.ipynb

Stage 1:
b) exploratory-data-analysis_1.ipynb
c) stage1-modelling.ipynb

Stage 2:
d) exploratory-data-analysis_2.ipynb
e) stage2-modelling.ipynb

3. Understanding the various CSV files (datasets)

View the Data Dictionary here.
Datasets created by the data-cleaning-preprocessing.ipynb notebook:

.
├── heart_pki_2020_original.csv           # original dataset
│   ├── heart_pki_2020_cleaned.csv        # for EDA and visualization
│   ├── heart_pki_2020_correlation.csv    # for EDA correlation (IntegerEncoding done)
│   └── heart_pki_2020_encoded.csv        # for analytical models (OneHotEncoding done)
│
├── o2Saturation_original.csv             # original dataset
└── heart_attack_original.csv             # original dataset
    ├── heart_attack_cleaned.csv          # for EDA and analytical model (default integer encoding)
    └── heart_attack_cleaned_text.csv     # for EDA and visualization (meaningful values)
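
The dataset comments above distinguish integer encoding from one-hot encoding. As a rough illustration of the difference, here is a minimal pure-Python sketch on a toy column. The column name, its values, and both helper functions are hypothetical and are not taken from the project's notebooks, which may well use pandas or scikit-learn encoders instead.

```python
# Toy categorical column (hypothetical values, not from the project's datasets).
general_health = ["Good", "Poor", "Fair", "Good"]


def integer_encode(values):
    """Map each distinct category to one integer code (sorted order)."""
    mapping = {category: code for code, category in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    """Expand each value into one 0/1 indicator per distinct category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories


int_codes, mapping = integer_encode(general_health)
one_hot_rows, categories = one_hot_encode(general_health)

print(mapping)       # category -> integer code
print(int_codes)     # one integer per row
print(categories)    # column order of the indicator matrix
print(one_hot_rows)  # one indicator vector per row
```

Integer encoding keeps a single column, which is convenient for a correlation matrix but implies an ordering between categories. One-hot encoding avoids that implied ordering at the cost of one extra column per category.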

4. Understanding the models directory

The models directory contains all the trained models from stages 1 and 2. They can be imported and used on any dataset that fits their data dimensions.
An example of importing and using an analytical model is shown below:

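# Libraries
import joblib
from sklearn.model_selection import cross_val_score

# Load the model from disk
loaded_random_forest_m3 = joblib.load('models/stage2_random_forest_m3.sav')

# Using the analytical model
# X_test and y_test are the features and labels of a dataset that fits the model's dimensions.
# Note: cross_val_score refits a clone of the loaded estimator on each of the 5 folds.
result = cross_val_score(loaded_random_forest_m3, X_test, y_test, cv=5, scoring="roc_auc").mean()
print(result)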
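
Because scikit-learn's cross_val_score refits a clone of the estimator on each fold, it may also be useful to score a loaded model directly on held-out data without refitting. The sketch below illustrates that pattern end to end. It assumes the saved models are scikit-learn classifiers, and it uses synthetic data, a stand-in random forest, and a hypothetical temporary file name rather than the project's own datasets or model files.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption: NOT the project's datasets).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Train a small stand-in model and save it to a temporary file.
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)
model_path = os.path.join(tempfile.mkdtemp(), "demo_random_forest.sav")  # hypothetical name
joblib.dump(model, model_path)

# Load the model back and score it on held-out data without refitting.
loaded_model = joblib.load(model_path)
test_probabilities = loaded_model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, test_probabilities)
print(f"Held-out ROC AUC: {auc:.3f}")
```

With a model from the models directory, only the load and scoring steps would apply, and X_test would need the same columns, in the same order, as the data the model was trained on.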
