Data Pipelining Project

Airbnb Barcelona

Data Analytics Bootcamp @ Ironhack

Disclaimer:

This project's results are intended to be part of a larger research project. Please refer to the following PowerPoint presentation: https://github.com/germanortola/airbnb-bcn/blob/main/output/Airbnb_Barcelona_data_pipelining.pptx

Note for Week 3 deliverables: Please refer to the Data Visualization.ipynb file for a working, "clean" version of all the code produced. https://github.com/germanortola/airbnb-bcn/blob/main/notebooks/Data%20Visualization.ipynb

First: Choosing a subject and asking questions

The main goal of this project is to answer relevant questions about the current status of Airbnb accommodation in Barcelona, in the year 2022.

This project uses several Python libraries to clean, process and transform dataframes. It also downloads data through an API. Once the data is ready, it is visualized to provide insights that answer the questions driving this analysis.

What was the situation of Airbnb in Barcelona as of 2022?

What is the price range?

How many locations are in the city?

How are these locations distributed?

What do guests say about Airbnb in Barcelona?

How much of the accommodation offer does Airbnb represent?

Second: Collecting the data

The main database for this project was downloaded from Inside Airbnb:

http://insideairbnb.com/data-requests

Inside Airbnb is a project centered on studying and providing data on topics such as:

regulations to protect housing

impact on housing

impact on residential communities

touristification and overtourism

gentrification

unethical tech companies
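As a minimal sketch of how a downloaded listings file could be loaded with pandas: the helper name `load_listings`, the column names and the sample rows below are assumptions made for illustration, not the project's actual code or data.

```python
import io

import pandas as pd


def load_listings(source):
    """Read a listings CSV (path, URL or file-like object) into a DataFrame."""
    # For file paths, pandas infers compression such as gzip from the extension.
    return pd.read_csv(source)


# Tiny inline sample standing in for the real download (made-up rows).
sample_csv = io.StringIO(
    "id,neighbourhood,room_type,price\n"
    "1,Eixample,Entire home/apt,$120.00\n"
    "2,Ciutat Vella,Private room,$45.00\n"
)

listings = load_listings(sample_csv)
print(listings.shape)
```

With a real download, the same helper would be called with the path of the saved file instead of the inline sample.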

Third: Data Cleaning

Several columns were cleaned, although the database was already solid and very well organized.

Checking the ratio of missing data

Dropping null values

Checking and converting data types

Checking for duplicates

This step used built-in Python methods as well as the pandas library.
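The cleaning steps listed above could look roughly like the following sketch. The column names and rows are made up for illustration, and the price format (a string with a currency sign and thousands separator) is an assumption about the raw data.

```python
import pandas as pd

# Made-up rows; the column names are assumptions, not the project's schema.
raw = pd.DataFrame({
    "id": [1, 2, 2, 3, 4],
    "price": ["$120.00", "$45.00", "$45.00", None, "$1,050.00"],
    "room_type": [
        "Entire home/apt", "Private room", "Private room",
        "Private room", "Entire home/apt",
    ],
})

# Ratio of missing values per column (0.0 means no missing data).
missing_ratio = raw.isna().mean()

# Drop rows where the price is missing.
clean = raw.dropna(subset=["price"]).copy()

# Convert price strings such as "$1,050.00" to floats.
clean["price"] = (
    clean["price"].str.replace(r"[$,]", "", regex=True).astype(float)
)

# Count fully duplicated rows, then remove them.
n_duplicates = int(clean.duplicated().sum())
clean = clean.drop_duplicates().reset_index(drop=True)

print(missing_ratio)
print(n_duplicates)
print(clean.dtypes)
```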

Fourth: Data Processing

The data was transformed by extracting new subsets. Pivot tables were created, aggregating with basic statistical methods: count, sum, mean, median, max and min.

Pandas library was the main tool here.
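A pivot table with these aggregations might be built as in the sketch below. The data is invented and the choice of `neighbourhood` as the grouping column is an assumption; the project's notebooks may group by other attributes.

```python
import pandas as pd

# Made-up listings; column names are assumptions for illustration.
listings = pd.DataFrame({
    "neighbourhood": [
        "Eixample", "Eixample",
        "Ciutat Vella", "Ciutat Vella", "Ciutat Vella",
    ],
    "price": [120.0, 60.0, 40.0, 50.0, 100.0],
})

# One row per neighbourhood, one column per aggregation of the price.
summary = pd.pivot_table(
    listings,
    index="neighbourhood",
    values="price",
    aggfunc=["count", "sum", "mean", "median", "max", "min"],
)

# Columns come back as (aggregation, "price") pairs; keep only the aggregation.
summary.columns = summary.columns.get_level_values(0)

print(summary)
```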

Fifth: Data Analysis

The analysis was performed by comparing values from different attributes. Focusing on the original questions was key to staying on track.

Results were checked by running tests in Python code.

Sixth: Data Visualization

Visualization was the final step, needed to reach a thorough interpretation of the relationships in the data.

Plotly, Seaborn and Matplotlib libraries were the tools for this task.
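As one illustration of this step, a price distribution could be drawn with Matplotlib as sketched here. The prices are made up and the choice of a histogram is an assumption; the actual charts are in the project's notebooks.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt

# Made-up nightly prices standing in for the cleaned price column.
prices = [40, 45, 50, 60, 75, 90, 100, 120, 150, 300]

fig, ax = plt.subplots(figsize=(6, 4))

# Histogram with 5 equal-width bins between the lowest and highest price.
counts, bin_edges, _ = ax.hist(prices, bins=5)

ax.set_xlabel("Price per night")
ax.set_ylabel("Number of listings")
ax.set_title("Distribution of nightly prices (sample data)")
fig.tight_layout()
```

Calling `fig.savefig(...)` with a file name of your choice would write the chart to disk.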

to be continued...
