This project predicts the likelihood of heart disease in individuals using machine learning. Working from a subset of the Behavioral Risk Factor Surveillance System (BRFSS) dataset, it applies a range of data science methods to build a robust classification model.
The data is a cleaned subset of the BRFSS 2013 dataset: 384,695 individual survey responses, each with 20 features related to cardiovascular health.
The project uses a range of data science techniques, including:
- Exploratory Data Analysis (EDA)
- Data Cleaning
- Principal Component Analysis (PCA)
- Regularization
- Scaling
- Learning Curves
- Decision Trees
- Logistic Regression
- Random Forests
- Hyperparameter Tuning
- Model Validation
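
As a rough illustration of how several of these steps can fit together, the sketch below chains scaling, PCA, and L2-regularized logistic regression in a scikit-learn `Pipeline`, tunes two hyperparameters with cross-validated grid search, and validates on a held-out split. It is not the notebook's actual code: the data is synthetic (generated with `make_classification` as a stand-in for the 20 survey features), and the grid values, split size, and ROC AUC metric are assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the survey data (assumption): 20 numeric features,
# binary target. The real BRFSS features are not reproduced here.
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=8, random_state=0
)

# Hold out a stratified test set for final model validation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling -> PCA -> logistic regression (L2 penalty by default).
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hyperparameter grid (illustrative values): number of principal components
# and inverse regularization strength C.
param_grid = {
    "pca__n_components": [5, 10, 20],
    "clf__C": [0.01, 0.1, 1.0, 10.0],
}

# 5-fold cross-validated grid search, scored by ROC AUC.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="roc_auc")
search.fit(X_train, y_train)

# Score the refit best pipeline on the held-out split.
test_auc = search.score(X_test, y_test)
print("best parameters:", search.best_params_)
print(f"held-out ROC AUC: {test_auc:.3f}")
```

Keeping the scaler and PCA inside the pipeline means they are refit on each training fold during cross-validation, so no information from the validation folds leaks into preprocessing. A decision tree or random forest could be swapped in as the final step with its own parameter grid.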
The ultimate goal is to develop a classification model that accurately predicts heart disease risk. By identifying risk indicators, we aim to support early detection and intervention, potentially saving lives and reducing healthcare costs.
Heart disease remains the leading cause of death globally, underscoring the significance of proactive measures for cardiovascular health. This project seeks to empower individuals with predictive insights, enabling them to make informed decisions regarding their well-being.
- ML-Heartbeat-Prognostic.ipynb: Jupyter Notebook containing the project introduction and code implementation.
- Additional datasets from 2011 and 2015 are used for comparison against the 2013-based model.
By applying these data science techniques, the project aims to contribute to ongoing efforts to combat heart disease and to support a healthier future for individuals worldwide.