I generated insights from the New York City Airbnb data using Jupyter Notebook with Python 3 and the following libraries:
- Pandas
- NumPy
- Scikit-learn (sklearn)
- Seaborn
- Matplotlib
These questions motivated me to find useful results from the data:
- How many hosts are in each neighbourhood_group? Which host ids have the highest number of rooms in each neighbourhood_group?
- Which features affect the price? Can we predict the price from these features?
- Which neighbourhood has the highest number of reviews and the highest number of rooms in each neighbourhood group? What can we learn from the results?
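The first question can be approached with a pair of Pandas group-by operations. The following is only a minimal sketch, not the notebook's actual code: it uses a small made-up table in place of the real file, and it assumes the host identifier column is named `host_id` alongside the `neighbourhood_group` column mentioned above.

```python
import pandas as pd

# Toy stand-in for the real data; the values are made up for illustration.
# The column name "host_id" is an assumption about the dataset.
listings = pd.DataFrame({
    "host_id": [1, 1, 2, 3, 3, 3, 4],
    "neighbourhood_group": [
        "Manhattan", "Manhattan", "Manhattan",
        "Brooklyn", "Brooklyn", "Brooklyn", "Brooklyn",
    ],
})

# Number of distinct hosts in each neighbourhood group.
hosts_per_group = listings.groupby("neighbourhood_group")["host_id"].nunique()

# Number of listings per (group, host) pair.
listings_per_host = (
    listings.groupby(["neighbourhood_group", "host_id"])
    .size()
    .rename("n_listings")
    .reset_index()
)

# Keep the host with the most listings in each group.
top_hosts = (
    listings_per_host.sort_values("n_listings", ascending=False)
    .groupby("neighbourhood_group")
    .head(1)
)

print(hosts_per_group)
print(top_hosts)
```

Changing `head(1)` to `head(5)` would list the five busiest hosts per group instead of only the top one.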
- AB_NYC_2019.csv: the data used in the analysis, downloaded from Kaggle.
- New York City Airbnb: the Jupyter Notebook containing all of the code for the data science process, along with some markdown cells.
This project extracts insights about the 2019 Airbnb listings in New York City, specifically about the prices, hosts, and room types in each neighbourhood group. I used one dataset in the analysis: New York City Airbnb Open Data.
I followed the CRISP-DM (Cross-Industry Standard Process for Data Mining) process in the analysis:
- Business Understanding: asking questions related to the business field
- Data Understanding: identifying the columns in the data used for each question
- Data Preparation: cleaning and wrangling the data so it can answer each question
- Modeling: building a model to predict one of the features
- Evaluation: evaluating the model
- Deployment: communicating the results and the process
- I found the data on the Kaggle website
- I used Overleaf (LaTeX) to generate my presentation paper
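
The Data Preparation, Modeling, and Evaluation steps of the CRISP-DM process listed above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the notebook's actual model: the data is synthetic, the feature names `room_type` and `minimum_nights` are assumed, and linear regression with an R² score is just one plausible choice of model and metric.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the listings data (assumed column names, made-up values).
rng = np.random.default_rng(0)
n = 200
toy = pd.DataFrame({
    "room_type": rng.choice(["Entire home/apt", "Private room"], size=n),
    "minimum_nights": rng.integers(1, 11, size=n),
})
# Made-up price rule plus noise, so the model has something to learn.
toy["price"] = (
    50
    + 80 * (toy["room_type"] == "Entire home/apt")
    + 2 * toy["minimum_nights"]
    + rng.normal(0, 5, size=n)
)

# Data Preparation: one-hot encode the categorical feature.
X = pd.get_dummies(
    toy[["room_type", "minimum_nights"]],
    columns=["room_type"],
    drop_first=True,
    dtype=float,
)
y = toy["price"]

# Modeling: fit on a training split only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
model = LinearRegression().fit(X_train, y_train)

# Evaluation: score on the held-out split.
r2 = r2_score(y_test, model.predict(X_test))
print(f"R^2 on held-out toy data: {r2:.3f}")
```

Scoring on a held-out split rather than on the training rows gives a more honest estimate of how well price can be predicted from the chosen features.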