Karsten-Yan/200303_House_Prices_Project

House price prediction for King County, Washington

In this project I analysed a sample of house sale prices in King County, Washington.

King_County_House_prices.csv contains the raw data that serves as the basis for the analysis.

column_names.md contains explanations and details for the columns.

The notebook 200303_Karsten_Yan_Project1.ipnyb contains the main analysis of the data. I focused mainly on newly renovated or newly built houses (5 years old or less) and built my predictions and visualisations for that subset.

You can find the non-technical presentation in king_county_presentation.pdf or directly on Google Presentations.

The notebook is split into 9 sections.

  1. Business understanding:
  • formulation of target
  2. Data mining:
  • import of necessary modules and loading of the raw data into a pandas data frame
  3. Data cleaning:
  • removal of unnecessary columns (view and id) and conversion of sqft_basement to numerical values
  4. Feature engineering:
  • years since last renovation or construction
  • bathroom/bedroom ratio
  • zip code price ranks
  • quality
  • dummy variables
  • cleanup
  • definition of parameters
  5. Data exploration:
  • correlation heatmap
  • pairplots
  6. Statistical modeling:
  • brute-force forward selection: iterate through each variable, keep the one that gives the highest adjusted R squared, then restart the loop with the previously chosen variables included
  • modeling for all homes according to search parameters
  • modeling for newly constructed homes (less than 5 years old)
  7. Visualisation:
  • visualisations for newly constructed homes
  • comparing quality and quantity and some basic features
  8. Summary
  9. Future work
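
The brute-force variable selection described under statistical modeling can be sketched roughly as follows. This is a minimal illustration using NumPy only, not the code from the notebook: the function names `adjusted_r2` and `forward_select`, the synthetic data, the column names `sqft_living`, `grade` and `noise`, and the stopping rule (stop once no candidate raises the adjusted R squared) are all assumptions made for the example.

```python
import numpy as np


def adjusted_r2(X, y):
    """Fit ordinary least squares with an intercept; return adjusted R squared."""
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def forward_select(features, y):
    """Greedy forward selection on adjusted R squared.

    features: dict mapping a (hypothetical) column name to a 1-D array.
    Returns the chosen names in the order they were added and the final score.
    """
    chosen, best_score = [], -np.inf
    remaining = list(features)
    while remaining:
        # Score each remaining candidate together with the variables chosen so far.
        scores = {
            name: adjusted_r2(
                np.column_stack([features[c] for c in chosen + [name]]), y
            )
            for name in remaining
        }
        best_name = max(scores, key=scores.get)
        if scores[best_name] <= best_score:
            break  # assumed stopping rule: no candidate improves the fit
        chosen.append(best_name)
        remaining.remove(best_name)
        best_score = scores[best_name]
    return chosen, best_score


# Synthetic stand-in data; these columns and coefficients are invented.
rng = np.random.default_rng(0)
n = 200
features = {
    "sqft_living": rng.normal(2000, 500, n),
    "grade": rng.integers(5, 11, n).astype(float),
    "noise": rng.normal(0, 1, n),
}
price = (
    300 * features["sqft_living"]
    + 20000 * features["grade"]
    + rng.normal(0, 10000, n)
)

chosen, score = forward_select(features, price)
print(chosen, round(score, 3))
```

Each pass refits one model per remaining candidate, so the cost grows roughly quadratically with the number of variables. Adjusted R squared penalises every extra predictor, which is why it is a more natural selection criterion here than plain R squared, which never decreases when a variable is added.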

About

NFDS Project 1
