This is an end-to-end machine learning project in which I implement random forest and decision tree classifiers to predict heart disease. I used cross-validation and oversampling to deal with an imbalanced dataset.

micahwiesner67/Decision_Tree_Classifier_Heart_Disease

Decision_Tree_Classifier_Heart_Disease

I implemented an end-to-end machine learning model that uses decision trees and random forests to predict heart disease from a variety of environmental and biological factors. In this project I looked closely at how hyperparameter tuning affects each model. One major difficulty in building this model was that the dataset was extremely imbalanced.
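As a rough illustration of the kind of hyperparameter tuning described above, here is a minimal sketch using scikit-learn's `GridSearchCV` on a decision tree. The data is synthetic (generated with `make_classification`), and the parameter grid, the F1 scoring choice, and the fold count are assumptions made for illustration, not the settings used in this project.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data (assumption: roughly 15% positive class).
X, y = make_classification(
    n_samples=600, n_features=8, weights=[0.85, 0.15], random_state=0
)

# Hypothetical grid: limit tree depth and leaf size to control overfitting.
param_grid = {
    "max_depth": [3, 5, None],
    "min_samples_leaf": [1, 5, 20],
}

# Stratified folds keep the class ratio similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    scoring="f1",  # accuracy alone can be misleading on imbalanced data
    cv=cv,
)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)
```

The same pattern should apply to `RandomForestClassifier` by swapping the estimator and adding parameters such as `n_estimators` to the grid.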

Questions:

  1. Which factors contribute most to an individual being at risk for coronary heart disease (CHD)?
  2. How can the effects of an imbalanced dataset be mitigated?
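One common way to approach the first question is to inspect the impurity-based feature importances of a fitted random forest. The sketch below assumes synthetic data and placeholder column names (`feature_0`, `feature_1`, ...); it does not use the project's dataset or reproduce its results.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data; the feature names are hypothetical placeholders.
X, y = make_classification(
    n_samples=1000,
    n_features=6,
    n_informative=3,
    n_redundant=0,
    weights=[0.85, 0.15],
    random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X, y)

# Impurity-based importances sum to 1; larger values mean the feature
# contributed more to the splits across the trees.
importances = pd.Series(
    forest.feature_importances_, index=feature_names
).sort_values(ascending=False)

print(importances)
```

Impurity-based importances can favor high-cardinality features, so `sklearn.inspection.permutation_importance` may be a useful cross-check.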

Dataset:

Data Analysis:

Decision Tree Classifier, Random Forest Classifier, Imbalanced Data

- All analysis and visualization were done in Python using pandas, numpy, sklearn, seaborn, and matplotlib.
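The combination of oversampling and cross-validation can be sketched as follows. This is a minimal example under stated assumptions: it uses random oversampling via `sklearn.utils.resample` (the specific oversampling method used in this project is not stated here), synthetic data, and a hypothetical helper named `oversample_minority`. The key point is that only the training portion of each fold is oversampled, so duplicated rows never leak into the held-out fold.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import StratifiedKFold
from sklearn.utils import resample


def oversample_minority(X, y, random_state=0):
    """Randomly duplicate minority-class rows until both classes match in size.

    Hypothetical helper for binary labels; not part of scikit-learn.
    """
    classes, counts = np.unique(y, return_counts=True)
    n_extra = counts.max() - counts.min()
    if n_extra == 0:
        return X, y  # already balanced
    minority = classes[np.argmin(counts)]
    X_extra = resample(
        X[y == minority], replace=True, n_samples=n_extra,
        random_state=random_state,
    )
    X_bal = np.vstack([X, X_extra])
    y_bal = np.concatenate([y, np.full(n_extra, minority)])
    return X_bal, y_bal


# Synthetic, imbalanced stand-in data (assumption: about 15% positives).
X, y = make_classification(
    n_samples=1000, n_features=8, weights=[0.85, 0.15], random_state=0
)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
recalls = []
for train_idx, test_idx in cv.split(X, y):
    # Oversample the training fold only; the test fold stays untouched.
    X_train, y_train = oversample_minority(X[train_idx], y[train_idx])
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X_train, y_train)
    recalls.append(recall_score(y[test_idx], model.predict(X[test_idx])))

print(np.mean(recalls))
```

Recall on the positive class is reported here because missed cases are usually the main concern with imbalanced medical data; other metrics such as F1 or ROC AUC could be substituted.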
