This project used machine learning models to detect cancer in biopsies with an accuracy of 98%

jarred13/Breast_Cancer

Breast Cancer

Overview

In this notebook, we take a dataset of breast cancer biopsies and apply six machine learning models to predict whether a patient's tumor is malignant or benign. The best-performing model achieved an F1 score of 0.977. The notebook is hosted on Kaggle: https://www.kaggle.com/code/jarredpriester/machine-learning-ensemble-breast-cancer-prediction
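For readers unfamiliar with the metric, here is a minimal base R sketch of how an F1 score is computed from predicted and actual labels. The helper name `f1_score` and the choice of "M" (malignant) as the positive class are assumptions made for illustration, and the labels are made up; they are not results from the notebook.

```r
# Hypothetical helper: F1 for one class treated as positive.
# Assumes labels are coded "M" (malignant) and "B" (benign).
f1_score <- function(actual, predicted, positive = "M") {
  tp <- sum(predicted == positive & actual == positive)  # true positives
  fp <- sum(predicted == positive & actual != positive)  # false positives
  fn <- sum(predicted != positive & actual == positive)  # false negatives
  precision <- tp / (tp + fp)
  recall <- tp / (tp + fn)
  # F1 is the harmonic mean of precision and recall
  2 * precision * recall / (precision + recall)
}

# Made-up labels purely for illustration
actual    <- c("M", "M", "M", "B", "B", "B", "B", "M")
predicted <- c("M", "M", "B", "B", "B", "M", "B", "M")
f1_score(actual, predicted)  # 0.75 (precision 0.75, recall 0.75)
```

Because F1 combines precision and recall, it penalizes both missed malignant tumors and false alarms, which plain accuracy can hide when classes are imbalanced.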

Purpose of this Project

First, I wanted to practice working with machine learning models. Second, I was curious to see how data science can be used in healthcare.

What I Learned

I learned that machine learning can be very effective in healthcare. I also gained more experience with the caret library, especially in fine-tuning the random forest and k-nearest neighbors models.
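As a rough idea of what this kind of tuning looks like in caret, here is a minimal sketch. It is not the notebook's actual code: the helper name `tune_models`, the 5-fold cross-validation, the seed, and the candidate grids for `mtry` and `k` are all assumptions chosen for illustration. It expects a data frame `df` with a two-level factor column `diagnosis` and numeric predictors.

```r
library(caret)  # method = "rf" also needs the randomForest package installed

# Hypothetical helper: tunes a random forest and a KNN model with caret.
# The resampling scheme and grids below are illustrative assumptions.
tune_models <- function(df, seed = 42) {
  ctrl <- trainControl(method = "cv", number = 5)  # 5-fold cross-validation

  set.seed(seed)
  rf_fit <- train(
    diagnosis ~ ., data = df,
    method = "rf",
    trControl = ctrl,
    tuneGrid = expand.grid(mtry = c(2, 4, 8))  # predictors tried at each split
  )

  set.seed(seed)
  knn_fit <- train(
    diagnosis ~ ., data = df,
    method = "knn",
    preProcess = c("center", "scale"),  # KNN is distance based, so scale first
    trControl = ctrl,
    tuneGrid = expand.grid(k = seq(3, 15, by = 2))  # odd neighbor counts
  )

  list(rf = rf_fit, knn = knn_fit)
}
```

Calling `fits <- tune_models(df)` and then inspecting `fits$rf$bestTune` and `fits$knn$bestTune` shows the selected `mtry` and `k`. By default caret picks them by cross-validated accuracy, so a different `metric` would be needed to select on another score.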

Dataset Used

The Breast Cancer Wisconsin (Diagnostic) Dataset is a popular dataset from the University of California, Irvine (UCI) Machine Learning Repository. The dataset consists of 569 rows and 32 columns. Each row represents a tumor sample. The columns hold a sample ID, the diagnosis (malignant or benign), and 30 numeric features.

Files Used

Breast_Cancer_Kaggle.R - R script
Breast_Cancer_Kaggle.Rmd - R Markdown
Breast_Cancer_Kaggle.pdf - PDF