Analysis of the real estate market of Rome (Italy) using data scraped from the Immobiliare.it website
- Data cleaning
- Exploratory Data Analysis
Python Version: 3.8.2
Packages: Pandas, Numpy, Matplotlib, Seaborn, Wordcloud, NLTK
- Pull out information from string features and m
- Fill missing numerical values with feature median
- Convert object-dtype (string) data to numerical values
- Create a binary column for missing data with Boolean values
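
The cleaning steps listed above could look roughly like the sketch below. It is only an illustration: the column names (`price`, `surface`) and the sample values are hypothetical and are not taken from the scraped dataset, and the actual notebook may implement these steps differently.

```python
import pandas as pd

# Hypothetical sample data: column names and values are assumptions for illustration.
df = pd.DataFrame({
    "price": ["€ 250.000", "€ 180.000", None, "€ 320.000"],
    "surface": ["80 mq", "65 mq", "120 mq", None],
})

for col in ["price", "surface"]:
    # Pull the number out of the string by dropping every non-digit character.
    digits = df[col].str.replace(r"[^0-9]", "", regex=True)
    # Convert object data into numerical; unparsable entries become NaN.
    df[col] = pd.to_numeric(digits, errors="coerce")
    # Binary (Boolean) column marking rows where the value was missing.
    df[col + "_missing"] = df[col].isna()
    # Fill missing numerical values with the feature median.
    df[col] = df[col].fillna(df[col].median())

print(df)
```

Creating the missing-value flag before filling keeps track of which rows were imputed, so that information is still available after the gaps are filled.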
The EDA shows how the data is distributed and how the different features relate to each other.
Both wordcloud and nltk were used to represent the description column and give deeper insight into it.
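
As a rough sketch of how the description text might be summarised with nltk, the example below counts word frequencies. The sample descriptions and the small stop-word set are hypothetical placeholders, not data from the project, and the original analysis may have used different tokenization or stop words.

```python
from nltk import FreqDist
from nltk.tokenize import RegexpTokenizer

# Hypothetical listing descriptions (assumed for illustration only).
descriptions = [
    "Luminoso appartamento con terrazzo e cantina",
    "Appartamento ristrutturato con terrazzo panoramico",
]

# Hand-written stop words (an assumption; a fuller list would be needed in practice).
stop_words = {"con", "e"}

# Split lower-cased text into alphabetic tokens; this tokenizer needs no extra NLTK downloads.
tokenizer = RegexpTokenizer(r"[a-z]+")
tokens = [
    token
    for text in descriptions
    for token in tokenizer.tokenize(text.lower())
    if token not in stop_words
]

# Count how often each remaining word appears across all descriptions.
freq = FreqDist(tokens)
print(freq.most_common(3))
```

A frequency mapping like `freq` can also be passed to `WordCloud.generate_from_frequencies` from the wordcloud package to draw the cloud.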
A few highlights from the graphs displayed: