In this project, I clustered 5,000 Spotify songs using unsupervised machine learning.
- Data Cleaning: Removed duplicate rows and rows with missing values.
- Feature Selection: Chose features such as danceability, valence, loudness, energy, and acousticness for clustering.
- Clustering: Used KMeans to cluster the songs based on the selected features.
- Dimensionality Reduction: Applied PCA to enhance clustering efficiency and interpretability.
- Data Standardization: Standardized the data using MinMaxScaler to scale features between 0 and 1.
- Determining Clusters: Used the Elbow Method (based on inertia) together with the silhouette score to determine the optimal number of clusters.
- Playlist Creation: Generated a playlist for each cluster from the songs closest to its KMeans cluster center.
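
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it uses random synthetic data in place of the Spotify dataset, skips the cleaning step, and assumes scaling is applied before PCA, two principal components, a search range of k = 2 to 7, and a playlist size of 10. The helper `build_playlists` is a hypothetical name introduced only for this example.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the cleaned audio features (assumption: not the real dataset).
rng = np.random.default_rng(0)
feature_names = ["danceability", "valence", "loudness", "energy", "acousticness"]
X = rng.normal(size=(300, len(feature_names)))

# Scale each feature to [0, 1], then project onto two principal components
# (assumption: the number of components is not stated in the project summary).
X_scaled = MinMaxScaler().fit_transform(X)
X_pca = PCA(n_components=2, random_state=0).fit_transform(X_scaled)

# Record inertia (for an elbow plot) and the silhouette score for each candidate k.
inertias, silhouettes = {}, {}
for k in range(2, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_pca)
    inertias[k] = km.inertia_
    silhouettes[k] = silhouette_score(X_pca, km.labels_)

# Pick k by the highest silhouette score; the elbow is usually read off a plot of `inertias`.
best_k = max(silhouettes, key=silhouettes.get)
model = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(X_pca)


def build_playlists(points, fitted_model, size=10):
    """Return, per cluster, the indices of the `size` songs nearest its center."""
    distances = fitted_model.transform(points)  # shape: (n_songs, n_clusters)
    playlists = {}
    for c in range(fitted_model.n_clusters):
        members = np.where(fitted_model.labels_ == c)[0]
        order = np.argsort(distances[members, c])  # nearest to the center first
        playlists[c] = members[order[:size]]
    return playlists


playlists = build_playlists(X_pca, model)
```

Each entry in `playlists` holds row indices into the feature matrix, so with a real dataset they could be mapped back to song titles and artists.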