The results of this project are intended to form part of a larger research effort. For an overview, please refer to the following PowerPoint presentation: https://github.com/germanortola/airbnb-bcn/blob/main/output/Airbnb_Barcelona_data_pipelining.pptx
Note for Week 3 deliverables: please refer to the Data Visualization.ipynb notebook for a working, "clean" version of all the code produced: https://github.com/germanortola/airbnb-bcn/blob/main/notebooks/Data%20Visualization.ipynb
The main goal of this project is to answer relevant questions about the state of Airbnb accommodation in Barcelona in 2022.
This project uses several Python libraries to clean, process and transform dataframes, and it also downloads data through an API. Once the data is ready, it is visualized to provide insights that answer the questions driving this analysis.
What was the situation of Airbnb in Barcelona as of 2022?
What is the price range?
How many locations are in the city?
How are these locations distributed?
What do guests say about Airbnb in Barcelona?
How much of the accommodation offer does Airbnb represent?
The main database for this project was downloaded from Inside Airbnb:
http://insideairbnb.com/data-requests
Inside Airbnb is a project centered on studying and providing data on topics such as:
regulations to protect housing
impact on housing
impact on residential communities
touristification and overtourism
gentrification
unethical tech companies
Several columns were cleaned, although the database was already solid and very well organized. The cleaning steps were:
Checking the ratio of missing data
Dropping null values
Checking and converting data types
Checking for duplicates
This step relied on built-in Python methods as well as the pandas library.
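The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the sample rows and the column names (`id`, `neighbourhood`, `price`) are assumptions made up for the example, and the real Inside Airbnb file may need different handling.

```python
import pandas as pd

# Hypothetical sample standing in for the Inside Airbnb listings file.
# Column names and values are assumptions, not the project's real data.
listings = pd.DataFrame({
    "id": [1, 2, 2, 3, 4],
    "neighbourhood": ["Eixample", "Gràcia", "Gràcia", None, "Sants"],
    "price": ["$120.00", "$85.50", "$85.50", "$60.00", "$1,200.00"],
})

# 1. Ratio of missing data per column (fraction of null cells)
missing_ratio = listings.isna().mean()

# 2. Drop rows that contain null values
cleaned = listings.dropna()

# 3. Check and convert data types: price strings such as "$1,200.00" to float
cleaned = cleaned.assign(
    price=cleaned["price"].str.replace(r"[$,]", "", regex=True).astype(float)
)

# 4. Count fully duplicated rows, then remove them
n_duplicates = int(cleaned.duplicated().sum())
cleaned = cleaned.drop_duplicates().reset_index(drop=True)

print(missing_ratio)
print(cleaned.dtypes)
print(f"duplicates removed: {n_duplicates}")
```

Dropping every row with a null is the bluntest option; on a real dataset it may be preferable to pass `subset=[...]` to `dropna` so that only nulls in the columns needed for the analysis cause a row to be discarded.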
The data was transformed by extracting new subsets. Pivot tables were then created, aggregating with basic statistical methods: count, sum, mean, median, max and min.
The pandas library was the main tool here.
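As an illustration of the transformation step, the sketch below extracts a subset and builds a pivot table with the six aggregations listed. The sample data and the column names (`neighbourhood`, `room_type`, `price`) are hypothetical and only stand in for the real listings table.

```python
import pandas as pd

# Hypothetical sample data; names and values are assumptions for illustration.
listings = pd.DataFrame({
    "neighbourhood": ["Eixample", "Eixample", "Gràcia", "Gràcia", "Gràcia"],
    "room_type": ["Entire home/apt", "Private room", "Entire home/apt",
                  "Private room", "Private room"],
    "price": [150.0, 60.0, 110.0, 45.0, 55.0],
})

# Extract a new subset: only private rooms
private_rooms = listings[listings["room_type"] == "Private room"]

# Pivot table: one row per neighbourhood, one column per aggregation of price
summary = pd.pivot_table(
    listings,
    index="neighbourhood",
    values="price",
    aggfunc=["count", "sum", "mean", "median", "max", "min"],
)

print(summary)
```

Passing a list to `aggfunc` yields two-level column labels such as `("mean", "price")`, so a single statistic can be read with `summary[("mean", "price")]`.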
The analysis was performed by comparing values from different attributes. Focusing on the original questions was key to staying on track.
Results were checked by running tests in Python code.
Visualization was the final step, needed to interpret the relationships in the data thoroughly.
Plotly, Seaborn and Matplotlib libraries were the tools for this task.
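A minimal Matplotlib sketch of this kind of chart is shown below. The neighbourhood names and prices are invented placeholder values, not results from this project, and the non-interactive `Agg` backend is chosen only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder values for illustration only; these are NOT project results.
median_price = pd.Series(
    {"Eixample": 105.0, "Gràcia": 55.0, "Sants": 70.0},
    name="median_price",
).sort_values()

# Horizontal bar chart, sorted so the cheapest neighbourhood is drawn first
fig, ax = plt.subplots(figsize=(6, 3))
ax.barh(median_price.index, median_price.values)
ax.set_xlabel("Median nightly price")
ax.set_title("Median price by neighbourhood (sample data)")
fig.tight_layout()
```

In a notebook the figure is displayed inline; in a script, `fig.savefig("some_file.png")` would write it to disk.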