Predicting breast cancer using machine learning

This repository contains the source code for a data exploration project that compares machine learning techniques for predicting whether a breast tumour is malignant or benign.

Dataset

The data for this project can be found here: http://archive.ics.uci.edu/ml/datasets/breast+cancer+wisconsin+%28diagnostic%29

Skills

This analysis relies heavily on R packages for machine learning and compares decision trees built with the CART and C4.5 algorithms. The influence of a balanced class design was also explored for each algorithm.
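The two algorithms differ mainly in how they score candidate splits: CART uses Gini impurity, while C4.5 uses the gain ratio (information gain divided by split information). The sketch below illustrates both criteria, plus one possible reading of "balanced class design" as random undersampling of the majority class. It is written in plain Python for illustration only; the project itself used R packages, and the function names, the toy labels "B" and "M", and the choice of undersampling are all assumptions rather than the author's code.

```python
import math
import random
from collections import Counter


def gini(labels):
    """Gini impurity, the split criterion used by CART."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits, the basis of C4.5's gain ratio."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(parent, children):
    """Information gain of a split divided by its split information (C4.5)."""
    n = len(parent)
    weighted = sum(len(ch) / n * entropy(ch) for ch in children)
    split_info = -sum(
        len(ch) / n * math.log2(len(ch) / n) for ch in children if ch
    )
    if split_info == 0:
        return 0.0
    return (entropy(parent) - weighted) / split_info


def undersample(rows, label_of, seed=0):
    """Assumed balancing scheme: downsample every class to the smallest one."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    smallest = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, smallest))
    return balanced


# Toy data (hypothetical): 6 benign and 4 malignant cases.
toy_rows = [(i, "B") for i in range(6)] + [(i, "M") for i in range(4)]
toy_labels = [label for _, label in toy_rows]
balanced_rows = undersample(toy_rows, label_of=lambda row: row[1])

print(gini(toy_labels))
print(gain_ratio(["B", "B", "M", "M"], [["B", "B"], ["M", "M"]]))
print(Counter(label for _, label in balanced_rows))
```

Undersampling discards majority-class rows, so on a small dataset an oversampling or class-weighting scheme may be preferable; which approach the original analysis took is not stated here.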

Complete Analysis

The complete analysis for this project can be found here: http://cbobbie.wixsite.com/colleenbobbie/predictingbreastcancer

Summary

  • This project explored two decision tree algorithms, CART and C4.5, for predicting whether a breast mass is benign or malignant.
  • Ultimately, the C4.5 decision tree trained with a balanced class design produced the highest accuracy and the lowest false negative rate.
  • The balanced C4.5 tree identified the largest concave points measurement of the cell nuclei as the most influential predictor of tumour outcome in this dataset.
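For reference, the two metrics named in the summary can be computed from a confusion matrix as sketched below. This is a minimal illustration with made-up predictions, treating malignant ("M") as the positive class; the labels, function names, and data are assumptions and do not reproduce the project's results.

```python
def confusion_counts(actual, predicted, positive="M"):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1  # a malignant case predicted as benign
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Share of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def false_negative_rate(tp, fn):
    """Share of truly positive (malignant) cases that were missed."""
    return fn / (tp + fn) if (tp + fn) else 0.0


# Hypothetical labels and predictions for ten cases.
actual = ["M", "M", "M", "M", "B", "B", "B", "B", "B", "B"]
predicted = ["M", "M", "M", "B", "B", "B", "B", "B", "B", "M"]

tp, fp, tn, fn = confusion_counts(actual, predicted)
print(accuracy(tp, fp, tn, fn))
print(false_negative_rate(tp, fn))
```

In a screening context the false negative rate matters at least as much as accuracy, since a false negative here is a malignant tumour classified as benign.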