
Supervised ML classification using Python. This project demonstrates the effectiveness of machine learning techniques in predicting cardiovascular risk using the Framingham Heart Study dataset. The resulting model can be used by healthcare professionals to identify individuals at high risk of cardiovascular disease.


Navjotkhatri/CARDIOVASCULAR-RISK-PREDICTION


Cardiovascular Risk Prediction Classification Project

An analysis of cardiovascular risk prediction using machine learning techniques.

Project Overview

This project focuses on predicting the 10-year risk of cardiovascular disease using demographic, clinical, and laboratory data. Various machine learning algorithms are applied and evaluated for their performance in predicting cardiovascular risk.

Python, Pandas, Matplotlib, Seaborn, Scikit-learn

Jupyter Notebook, Google Colab, GitHub

Logistic Regression, Random Forest Classifier, XGBoost, KNN, SVC, NBClassifier
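As a rough illustration of how several of the listed classifiers can be trained and compared with scikit-learn, here is a minimal sketch. It is not the project's notebook code: the data is a synthetic stand-in generated with `make_classification` (an assumption, not the Framingham dataset), the hyperparameters are arbitrary, and XGBoost is left out because it lives in a separate package.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): NOT the Framingham dataset.
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5,
                           weights=[0.85, 0.15], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Distance- and margin-based models get a scaler in front of them.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)),
    "Random Forest Classifier": RandomForestClassifier(
        n_estimators=100, random_state=42),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "SVC": make_pipeline(
        StandardScaler(), SVC(probability=True, random_state=42)),
    "NBClassifier": GaussianNB(),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)               # hard 0/1 predictions
    y_score = model.predict_proba(X_test)[:, 1]  # probability of class 1
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, y_score),
    }

for name, scores in results.items():
    print(name, {metric: round(value, 4) for metric, value in scores.items()})
```

The scores printed here describe the synthetic data only and are not comparable to the figures reported in this README.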

Key Findings

  • Age and Gender: Age and gender are significant risk factors for cardiovascular disease; men are more likely than women to develop coronary heart disease (CHD).
  • Smoking: Smoking is a risk factor for CHD, and the level of risk depends on smoking intensity.
  • Clinical Variables: High blood pressure, stroke, and diabetes are associated with a higher risk of CHD.
  • Laboratory Variables: Patients with high cholesterol levels may be at a slightly higher risk for CHD.
  • Model Performance: Random Forest Classifier and XGBoost performed best, each with a test accuracy of about 0.90 and test precision and recall between 0.88 and 0.93.
  • Accuracy Rate: The Random Forest Classifier achieved a test accuracy of 90.36%, the highest of the six models evaluated.
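Findings such as the difference in CHD rates between groups are typically obtained with a grouped aggregation in Pandas. The sketch below shows the pattern only. The rows are toy values made up for illustration, not taken from the Framingham data, and the column names `sex`, `is_smoking`, and `TenYearCHD` are assumptions about the dataset's schema.

```python
import pandas as pd

# Toy rows, made up for illustration only; NOT Framingham data.
# Column names (sex, is_smoking, TenYearCHD) are assumed, not confirmed.
df = pd.DataFrame({
    "sex": ["M", "M", "M", "M", "F", "F", "F", "F"],
    "is_smoking": ["YES", "YES", "NO", "NO", "YES", "NO", "NO", "NO"],
    "TenYearCHD": [1, 0, 1, 0, 1, 0, 0, 0],
})

# The mean of a 0/1 target within each group is that group's CHD rate.
chd_rate_by_sex = df.groupby("sex")["TenYearCHD"].mean()
chd_rate_by_smoking = df.groupby("is_smoking")["TenYearCHD"].mean()

print(chd_rate_by_sex)
print(chd_rate_by_smoking)
```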

Tools and Skills

  • Python: Used for data analysis, manipulation, and visualization.
  • Pandas: Employed for data manipulation and analysis.
  • Matplotlib and Seaborn: Utilized for data visualization to create insightful plots and graphs.
  • Scikit-learn: Implemented various machine learning algorithms for predictive modeling.

Model Performance Metrics

| Model | Test Accuracy | Test Precision | Test Recall | Test ROC AUC |
|---|---|---|---|---|
| Logistic Regression | 0.6571 | 0.6273 | 0.6945 | 0.6587 |
| Random Forest Classifier | 0.9036 | 0.8791 | 0.9255 | 0.9046 |
| XGBoost | 0.9019 | 0.8951 | 0.9000 | 0.9018 |
| KNN | 0.8194 | 0.7317 | 0.9818 | 0.8265 |
| SVC | 0.7899 | 0.7369 | 0.8709 | 0.7934 |
| NBClassifier | 0.5694 | 0.6985 | 0.1727 | 0.5523 |
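For readers unfamiliar with the four columns, the following sketch spells out their standard definitions in plain Python. The README does not say how the project computed them, so this is an assumption about the usual formulas, and the confusion-matrix counts and scores in the example are hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        # Of everyone predicted positive, how many truly are positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Of everyone truly positive, how many the model caught.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def roc_auc(y_true, y_score):
    """ROC AUC as the chance a random positive outranks a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for a demonstration but slow for large datasets.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts and scores, for illustration only.
print(classification_metrics(tp=8, fp=2, tn=6, fn=4))
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Read this way, a model with low recall, such as the NBClassifier row, misses most true positive cases even when its precision looks acceptable.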

Takeaways

  • Improved Risk Assessment: Machine learning models may provide more accurate predictions of cardiovascular risk than traditional risk assessment methods.
  • Early Intervention: Early identification of individuals at high risk of cardiovascular disease allows for timely intervention and preventive measures.
  • Personalized Medicine: Machine learning models can help tailor interventions and treatments based on individual risk profiles.
  • Healthcare Resource Allocation: Predictive models can assist healthcare providers in allocating resources more efficiently by targeting high-risk individuals.

Acknowledgments

Special thanks to the Framingham Heart Study for providing the dataset used in this project.

This project was completed as part of the Data Science Trainee program at AlmaBetter.

