This was my first personal final project for the IBM Data Science specialization on Coursera. I eventually did another project for that specialization, but decided to keep this one because it has been my longest and toughest project so far. Unsupervised learning always presents a challenge.
The project revolves around using unsupervised learning to cluster the neighborhoods of Munich in a way that makes sense.
My tasks for this project were:
- Scrape the Munich city portal for the district name - postcode pairs
- Make calls to the Google Maps API to create a DataFrame with district names, postcodes, and their latitudes and longitudes
- Create interactive Folium maps to visualize the venues in districts
- Clean up the DataFrame
- Make calls to the FourSquare API to retrieve the venues in each district
- Try out different distance metrics and algorithms (KDTree vs. BallTree) to determine cluster radii
- Explore the data through visualizations and rankings
- Cluster the data with KMeans to obtain a meaningful representation of the city
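
The radius and clustering steps in the list above can be sketched roughly as follows. This is a minimal illustration and not the notebook code: the function names, the toy coordinates, the toy venue frequencies, and the rule "radius = half the distance to the nearest neighboring district" are all assumptions made for the example. It uses scikit-learn's `BallTree` with the haversine metric (which expects `[lat, lon]` in radians) and `KMeans`.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import BallTree

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, converts radians to kilometres


def nearest_neighbor_radii_km(coords_deg):
    """Hypothetical helper: half the great-circle distance from each
    point to its nearest neighbor, in kilometres."""
    # The haversine metric expects [latitude, longitude] in radians.
    coords_rad = np.radians(np.asarray(coords_deg, dtype=float))
    tree = BallTree(coords_rad, metric="haversine")
    # k=2 because the closest hit for every point is the point itself.
    dist_rad, _ = tree.query(coords_rad, k=2)
    return dist_rad[:, 1] * EARTH_RADIUS_KM / 2.0


def cluster_districts(venue_freq, n_clusters, seed=0):
    """Hypothetical helper: KMeans labels for a (districts x categories)
    matrix of venue-category frequencies."""
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return model.fit_predict(np.asarray(venue_freq, dtype=float))


# Toy inputs, made up for illustration (not real district data).
toy_coords = [(48.10, 11.50), (48.10, 11.60), (48.20, 11.50)]
toy_venue_freq = [
    [0.9, 0.1],  # mostly category 0
    [0.8, 0.2],
    [0.1, 0.9],  # mostly category 1
    [0.2, 0.8],
]

print(nearest_neighbor_radii_km(toy_coords))
print(cluster_districts(toy_venue_freq, n_clusters=2))
```

Halving the nearest-neighbor distance is just one way to keep venue search circles from overlapping; swapping `BallTree` for `KDTree` would require a metric that `KDTree` supports, since it does not accept haversine.
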
Note: Please view the notebooks through this link to render the Folium maps: https://nbviewer.jupyter.org/