Winning solution
Predicting real estate sale prices using property data.
The code used for our final submission can be found in final_submission.ipynb.
Data can be downloaded from the Kaggle competition data page.
In the repo, data is in the /data directory.
There are 3 data files:
output/sample_submission.csv is an example of a file that is ready to submit to Kaggle. There are two columns: id and SALE PRICE.
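As a rough illustration of that two-column layout, the sketch below builds a submission frame with pandas. The helper name `make_submission`, the example ids and prices, and the output file name in the comment are all hypothetical; only the column names id and SALE PRICE come from the description above.

```python
import pandas as pd


def make_submission(ids, predictions):
    """Build a frame with the two submission columns: id and SALE PRICE."""
    ids = list(ids)
    predictions = list(predictions)
    if len(ids) != len(predictions):
        raise ValueError("ids and predictions must have the same length")
    return pd.DataFrame({"id": ids, "SALE PRICE": predictions})


# Made-up ids and prices, for illustration only.
submission = make_submission([0, 1, 2], [450000.0, 1200000.0, 675000.0])

# Writing without the index keeps exactly two columns in the file.
# The file name here is hypothetical:
# submission.to_csv("output/my_submission.csv", index=False)
```

Passing `index=False` matters here: otherwise pandas writes the row index as an extra unnamed column.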
Competition: https://www.kaggle.com/c/saas-2020-fall-cx-kaggle-compeition
Building codes: https://www1.nyc.gov/assets/finance/jump/hlpbldgcode.html
TODO
- Remove outliers
- Check negative price predictions
- Check if building or tax class changes
  - could mean redeveloped housing
  - add column "classChanged" - 1 if yes, 0 if no
- Check if apartment number is present
  - add column "hasApartmentNumber" - 1 or 0
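The TODO items above could be sketched in pandas roughly as follows. This is not the code from final_submission.ipynb. The column names "BUILDING CLASS AT PRESENT", "BUILDING CLASS AT TIME OF SALE", "TAX CLASS AT PRESENT", "TAX CLASS AT TIME OF SALE" and "APARTMENT NUMBER" are assumptions about the dataset, the function names are hypothetical, and quantile trimming is only one possible way to remove outliers.

```python
import numpy as np
import pandas as pd


def add_todo_features(df):
    """Return a copy of df with classChanged and hasApartmentNumber (1 or 0)."""
    out = df.copy()

    def norm(col):
        # Treat missing values as empty strings and ignore surrounding whitespace.
        return out[col].fillna("").astype(str).str.strip()

    # Column names below are assumed, not confirmed by the README.
    building_changed = norm("BUILDING CLASS AT PRESENT") != norm("BUILDING CLASS AT TIME OF SALE")
    tax_changed = norm("TAX CLASS AT PRESENT") != norm("TAX CLASS AT TIME OF SALE")
    out["classChanged"] = (building_changed | tax_changed).astype(int)

    out["hasApartmentNumber"] = (norm("APARTMENT NUMBER") != "").astype(int)
    return out


def remove_price_outliers(df, lower_q=0.01, upper_q=0.99):
    """Keep rows whose SALE PRICE lies within the given quantiles (assumed cutoffs)."""
    low, high = df["SALE PRICE"].quantile([lower_q, upper_q])
    return df[df["SALE PRICE"].between(low, high)]


def clip_negative(predictions):
    """Replace negative predicted prices with 0."""
    return np.clip(predictions, 0, None)


# Tiny made-up frame, for illustration only.
demo = pd.DataFrame({
    "BUILDING CLASS AT PRESENT": ["A1", "R4", "C0"],
    "BUILDING CLASS AT TIME OF SALE": ["A1", "R4", "D1"],
    "TAX CLASS AT PRESENT": ["1", "2", "2"],
    "TAX CLASS AT TIME OF SALE": ["1", "2", "2"],
    "APARTMENT NUMBER": [None, "4B", "  "],
})
features = add_todo_features(demo)
```

A plain string comparison flags any difference between the two class columns, so a tax class recorded as "2A" at present and "2" at time of sale would count as a change. Whether that is the desired behavior depends on how the classes are encoded in the data.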