This project aims to predict cancer mortality rates in US counties using machine learning techniques. The dataset used for this project contains various demographic and medical features of US counties, consolidated from census data. The project is divided into four parts, each focusing on different aspects of regression modeling.
- `data/`: Contains the dataset files (`cancer_us_county-training.csv` and `cancer_us_county-testing.csv`).
- `notebooks/`: Contains Jupyter notebooks for each part of the assignment.
  - `Part_A_Univariate_Linear_Regression.ipynb`
  - `Part_B_Multivariate_Linear_Regression.ipynb`
  - `Part_C_Experiment_On_Multivariate_Linear_Regression_With_Feature_Engineering.ipynb`
- `EXPERIMENT REPORT`: Contains experiment reports in Word format.
  - EXPERIMENT REPORT - Part A
  - EXPERIMENT REPORT - Part B
  - EXPERIMENT REPORT - Part C
- `FINAL REPORT - Part D`: Contains the final report for the project in Word format.
- `README.md`: Overview of the project (this file).
- Ensure you have Python and Jupyter Notebook installed on your system.
- Clone this repository to your local machine.
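As a rough illustration of what Part A covers, the sketch below fits a univariate linear regression with closed-form ordinary least squares, using only the Python standard library. It is not the code from the notebooks. The helper names (`load_columns`, `fit_univariate`, `rmse`) are hypothetical, the column names must be supplied by you after inspecting the CSV header, and the UTF-8 encoding is an assumption about the data files.

```python
import csv
from statistics import mean


def load_columns(path, feature_col, target_col):
    """Read two numeric columns from a CSV file, skipping rows with blanks.

    The column names are not defined in this README; pass the names
    found in the header of the CSV file you are loading.
    """
    xs, ys = [], []
    # Assumption: the file is UTF-8 encoded; adjust if it is not.
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            x_raw, y_raw = row.get(feature_col), row.get(target_col)
            if not x_raw or not y_raw:
                continue  # skip rows with a missing value
            xs.append(float(x_raw))
            ys.append(float(y_raw))
    return xs, ys


def fit_univariate(xs, ys):
    """Closed-form ordinary least squares for y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("feature has zero variance")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def rmse(xs, ys, slope, intercept):
    """Root mean squared error of the fitted line on (xs, ys)."""
    squared = ((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))
    return (sum(squared) / len(xs)) ** 0.5
```

A typical use would be to call `load_columns` on `data/cancer_us_county-training.csv` with one feature column and the target column, fit the line with `fit_univariate`, and then report `rmse` on the columns loaded from `data/cancer_us_county-testing.csv`.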
For more information, please refer to the web app presentation of the project using the following link:
Regression-Model-on-Cancer-US-County
This project is based on a dataset consolidated from US census data, together with its documentation, provided for educational purposes. The dataset comes from the Master of Data Science and Innovation course at the University of Technology Sydney and is an asset of TD School.