jasonhckim/PCA_Project

Principal Component Analysis PROJECT

Context

Suppose you are working for Zillow as a data scientist.

The goal is to predict housing prices.

Each house is described by 80 features, which serve as the inputs for predicting its price.

Your goal is to isolate the important features from the dataset and build a model which can be used to predict the price of the houses.

Because 80 features is a lot, PCA can be applied to reduce the number of inputs to the prediction model while retaining most of the variance in the data. Some information is lost in the reduction; the variance threshold controls how much.

Assignments

Data Cleanup and Exploratory Data Analysis

  1. Explore Basic Statistics of each feature
  2. Outlier Detection
  3. Missing value imputation
  4. Correlation Analysis
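
The four steps above could be sketched with pandas roughly as follows. This is a minimal illustration, not the project's actual notebook: the tiny table is made up, the column names (`LotArea`, `OverallQual`, `SalePrice`) are hypothetical stand-ins for the real dataset, and the 1.5 * IQR rule and median imputation are just common default choices.

```python
import numpy as np
import pandas as pd

# Made-up rows standing in for the housing data; column names are hypothetical.
df = pd.DataFrame({
    "LotArea": [8450.0, 9600.0, 11250.0, 9550.0, 215000.0, np.nan],
    "OverallQual": [7, 6, 7, 7, 8, 5],
    "SalePrice": [208500.0, 181500.0, 223500.0, 140000.0, 250000.0, 143000.0],
})

# 1. Basic statistics (count, mean, std, quartiles) for each numeric feature.
summary = df.describe()

# 2. Outlier detection with the 1.5 * IQR rule (an assumed convention).
q1 = df["LotArea"].quantile(0.25)
q3 = df["LotArea"].quantile(0.75)
iqr = q3 - q1
is_outlier = (df["LotArea"] < q1 - 1.5 * iqr) | (df["LotArea"] > q3 + 1.5 * iqr)

# 3. Missing value imputation: fill numeric gaps with the column median.
df_imputed = df.fillna(df.median(numeric_only=True))

# 4. Correlation of every feature with the target, strongest first.
corr_with_price = df_imputed.corr()["SalePrice"].sort_values(ascending=False)

print(summary)
print("Outlier rows:", list(df.index[is_outlier]))
print(corr_with_price)
```

The median is used for imputation here because it is less sensitive to extreme values than the mean, which matters when outliers have not yet been handled.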

Feature Preparation and Transformation

  1. Drop unnecessary columns
  2. Scale the dataset so that all variables are on the same scale
  3. Select the final set of variables for PCA
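
One possible shape for these preparation steps, using scikit-learn's `StandardScaler`, is sketched below. The data is synthetic and the column names (`Id`, `GrLivArea`, `TotalSF`, `YearBuilt`) are hypothetical. The correlation filter with a 0.95 cutoff is an assumption chosen for illustration; the project may select features differently.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Synthetic data; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 200
area = rng.normal(1500, 400, n)
df = pd.DataFrame({
    "Id": np.arange(n),                            # identifier, carries no signal
    "GrLivArea": area,
    "TotalSF": area * 1.1 + rng.normal(0, 5, n),   # nearly a copy of GrLivArea
    "YearBuilt": rng.integers(1950, 2010, n).astype(float),
})

# 1. Drop columns that should not feed the model.
features = df.drop(columns=["Id"])

# 2. Standardize every column to zero mean and unit variance.
scaled = pd.DataFrame(
    StandardScaler().fit_transform(features), columns=features.columns
)

# 3. Feature selection (assumed rule): for each pair of columns with
#    absolute correlation above 0.95, drop the later one.
corr = scaled.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.95).any()]
selected = scaled.drop(columns=to_drop)

print("Dropped as redundant:", to_drop)
print("Kept for PCA:", list(selected.columns))
```

Scaling matters for PCA in particular: without it, features measured in large units (such as square feet) would dominate the principal components simply because of their magnitude.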

PCA

  1. Choose a threshold for explained variance
  2. Balance the number of components kept against the variance retained
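
The trade-off between the variance threshold and the number of components can be explored with scikit-learn's `PCA`, as in the sketch below. The data is synthetic (ten features driven by three hidden factors) and the 0.95 threshold is an assumed value, not one taken from the project.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic data: 10 observed features driven by 3 hidden factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(300, 10))

X_scaled = StandardScaler().fit_transform(X)

# Fit all components and accumulate the explained variance ratio.
full = PCA().fit(X_scaled)
cumulative = np.cumsum(full.explained_variance_ratio_)

# Number of components needed at several candidate thresholds.
for t in (0.80, 0.90, 0.95, 0.99):
    needed = int(np.searchsorted(cumulative, t) + 1)
    print(f"threshold {t:.2f}: {needed} component(s)")

# Smallest number of components whose cumulative variance reaches the threshold.
threshold = 0.95  # assumed value for illustration
k = int(np.searchsorted(cumulative, threshold) + 1)

# Passing a float in (0, 1) as n_components makes PCA pick that count itself.
pca = PCA(n_components=threshold).fit(X_scaled)
X_reduced = pca.transform(X_scaled)

print("Components kept:", pca.n_components_, "of", X.shape[1])
```

A higher threshold keeps more of the original variance but also more components, so the choice is a balance between compactness and fidelity.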

Linear Regression

  1. Fit the model to the cleaned-up dataset
  2. Compare model performance with and without PCA
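
A comparison along these lines could be set up with two scikit-learn pipelines, one with a PCA step and one without. The sketch below uses synthetic data, an assumed 0.95 variance threshold, and test-set R^2 as the metric; none of these choices come from the project itself, and results on the real housing data will differ.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data: 10 correlated features and a target driven by 3 hidden factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
X = latent @ rng.normal(size=(3, 10)) + 0.05 * rng.normal(size=(500, 10))
y = latent @ np.array([3.0, -2.0, 1.0]) + 0.1 * rng.normal(size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Same preprocessing and model; the only difference is the PCA step.
baseline = make_pipeline(StandardScaler(), LinearRegression())
with_pca = make_pipeline(StandardScaler(), PCA(n_components=0.95), LinearRegression())

scores = {}
for name, model in [("without PCA", baseline), ("with PCA", with_pca)]:
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))
    print(f"{name}: test R^2 = {scores[name]:.3f}")

# How many components the PCA pipeline actually used.
n_kept = with_pca.named_steps["pca"].n_components_
print(f"PCA kept {n_kept} of {X.shape[1]} features' worth of components")
```

Putting the scaler and PCA inside the pipeline means both are fitted on the training split only, which avoids leaking information from the test set into the comparison.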
