Cardiovascular Risk Prediction Classification Project

An analysis of cardiovascular risk prediction using machine learning techniques.

Project Overview

This project predicts the 10-year risk of cardiovascular disease from demographic, clinical, and laboratory data. Six classifiers (Logistic Regression, Random Forest, XGBoost, KNN, SVC, and Naive Bayes) are trained and compared on test accuracy, precision, recall, and ROC AUC.
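The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data here is synthetic (generated with `make_classification`) and stands in for the project's dataset, and the hyperparameters and the 80/20 split are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset (assumption: numeric features,
# binary target, positives in the minority).
X, y = make_classification(
    n_samples=1000,
    n_features=10,
    n_informative=5,
    weights=[0.85, 0.15],
    random_state=42,
)

# A stratified split keeps the positive rate similar in train and test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Hypothetical hyperparameters; the notebook may use different ones.
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# Predicted probability of the positive class for each test row.
risk_scores = model.predict_proba(X_test)[:, 1]
test_accuracy = model.score(X_test, y_test)
print(f"Test accuracy on synthetic data: {test_accuracy:.3f}")
```

The printed accuracy describes the synthetic data only and says nothing about the results reported in this README.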

Key Findings

Age and Gender: Age and gender are significant risk factors for coronary heart disease (CHD); men are more likely to develop CHD than women.
Smoking: Smoking is a risk factor for CHD, and smoking intensity also affects the level of risk.
Clinical Variables: High blood pressure, a history of stroke, and diabetes are associated with a higher risk of CHD.
Laboratory Variables: Patients with high cholesterol levels may be at slightly higher risk of CHD.
Model Performance: Random Forest Classifier and XGBoost performed best, each with test accuracy, precision, and recall between about 0.88 and 0.93.
Accuracy Rate: The Random Forest Classifier achieved a test accuracy of 90.36%, the highest of the six models evaluated.
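Findings like the gender and smoking effects above typically come from comparing CHD rates across groups. The sketch below shows one way to do that with pandas. The rows are made up purely for illustration, and the column names (`sex`, `is_smoking`, `TenYearCHD`) are assumptions about the dataset's schema, not confirmed by this README.

```python
import pandas as pd

# Toy rows for illustration only; these are NOT values from the real dataset.
toy = pd.DataFrame(
    {
        "sex": ["M", "M", "M", "M", "F", "F", "F", "F"],
        "is_smoking": ["YES", "YES", "NO", "NO", "YES", "NO", "NO", "NO"],
        "TenYearCHD": [1, 0, 1, 0, 1, 0, 0, 0],
    }
)

# The mean of a 0/1 target within a group is that group's CHD rate.
chd_rate_by_sex = toy.groupby("sex")["TenYearCHD"].mean()
chd_rate_by_smoking = toy.groupby("is_smoking")["TenYearCHD"].mean()

print(chd_rate_by_sex)
print(chd_rate_by_smoking)
```

On the real data, the same two `groupby` calls would give the observed CHD rate per group, which can then be plotted or tested for significance.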

Tools and Skills

Python: Used for data analysis, manipulation, and visualization.
Pandas: Employed for data manipulation and analysis.
Matplotlib and Seaborn: Utilized for data visualization to create insightful plots and graphs.
Scikit-learn: Used to implement and evaluate machine learning algorithms for predictive modeling.

Model Performance Metrics

Model	Test Accuracy	Test Precision	Test Recall	Test ROC AUC
Logistic Regression	0.6571	0.6273	0.6945	0.6587
Random Forest Classifier	0.9036	0.8791	0.9255	0.9046
XGBoost	0.9019	0.8951	0.9000	0.9018
KNN	0.8194	0.7317	0.9818	0.8265
SVC	0.7899	0.7369	0.8709	0.7934
Naive Bayes Classifier	0.5694	0.6985	0.1727	0.5523
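For readers unfamiliar with the four columns in the table, the sketch below shows how such metrics can be computed with scikit-learn. The labels and scores are a small hand-made example, not project output, and the helper name `evaluate` and the 0.5 decision threshold are assumptions.

```python
from sklearn.metrics import (
    accuracy_score,
    precision_score,
    recall_score,
    roc_auc_score,
)


def evaluate(y_true, y_score, threshold=0.5):
    """Return the four table metrics for binary labels and predicted scores."""
    # Turn scores into hard 0/1 predictions at the chosen threshold.
    y_pred = [1 if score >= threshold else 0 for score in y_score]
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        # ROC AUC is computed from the scores, not the thresholded predictions.
        "roc_auc": roc_auc_score(y_true, y_score),
    }


# Hand-made example: 4 positives and 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.35, 0.2, 0.1, 0.05]

metrics = evaluate(y_true, y_score)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Precision is the share of predicted positives that are truly positive, and recall is the share of true positives that the model finds. A row such as the last one in the table, with precision near 0.70 but recall near 0.17, describes a model that is usually right when it flags a patient but misses most at-risk patients.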

Takeaways

Improved Risk Assessment: Machine learning models can potentially predict cardiovascular risk more accurately than traditional risk assessment methods.
Early Intervention: Early identification of individuals at high risk of cardiovascular disease allows for timely intervention and preventive measures.
Personalized Medicine: Machine learning models can help tailor interventions and treatments based on individual risk profiles.
Healthcare Resource Allocation: Predictive models can assist healthcare providers in allocating resources more efficiently by targeting high-risk individuals.
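One way to act on these takeaways is to bucket each patient's predicted probability into risk tiers, so that follow-up can be prioritized. The sketch below is illustrative only: the function name `risk_tier`, the cutoffs of 0.10 and 0.20, and the example scores are all assumptions, and the cutoffs are not clinical guidance.

```python
from collections import Counter


def risk_tier(probability, medium_cutoff=0.10, high_cutoff=0.20):
    """Map a predicted probability to a tier (cutoffs are hypothetical)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= high_cutoff:
        return "high"
    if probability >= medium_cutoff:
        return "medium"
    return "low"


# Made-up predicted probabilities, e.g. from a classifier's predict_proba.
scores = [0.03, 0.12, 0.25, 0.08, 0.41, 0.19]
tiers = [risk_tier(p) for p in scores]
tier_counts = Counter(tiers)

print(tiers)
print(tier_counts)
```

In practice the cutoffs would be chosen with clinicians and checked against how well the model's probabilities are calibrated.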

Acknowledgments

Special thanks to the Framingham Heart Study for providing the dataset used in this project.

This project was completed as part of the Data Science Trainee program at AlmaBetter.

Navjotkhatri/CARDIOVASCULAR-RISK-PREDICTION
