CPSC392 Data Science Spotify Final Project

Using a Spotify dataset to answer these questions:

What clusters can we find between loudness and energy?
Which variables are the least useful in gathering information about the popularity of a song?
Is there an association between the duration and popularity of songs?
Using danceability, energy, and loudness, can we accurately predict whether a song will be popular?
Can we accurately predict the genre of a song based on the set predictors?
Which genres and subgenres show up the most in the data? Which are the most popular?
Can we appropriately cluster songs by genre using album name, track popularity and artist name?
Is there a hierarchical relationship between the variables?
Would PCA help us reduce dimensionality and produce a model that can accurately predict a song's popularity?

Models used: Expectation-Maximization with Gaussian Mixture, Lasso Regression, Linear Regression, K-Means, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Decision Tree, and Principal Component Analysis
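As a rough illustration of how two of these models can be combined for the PCA question above, here is a minimal sketch that standardizes features, keeps enough principal components to explain 90% of the variance, and fits a linear regression. It assumes scikit-learn and NumPy are available, and it uses randomly generated stand-in data rather than the Spotify dataset; the feature count, the 90% threshold, and all variable names are assumptions for illustration, not taken from the notebook.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for numeric audio features; NOT the project's data.
rng = np.random.default_rng(0)
n_songs = 500
X = rng.normal(size=(n_songs, 6))
# Fake "popularity" target built from two of the features plus noise.
y = 50 + 5 * X[:, 0] - 3 * X[:, 1] + rng.normal(scale=2.0, size=n_songs)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Standardize, keep components covering 90% of the variance, then regress.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.90, svd_solver="full"),
    LinearRegression(),
)
model.fit(X_train, y_train)

n_kept = model.named_steps["pca"].n_components_
r2 = model.score(X_test, y_test)
print(f"components kept: {n_kept}, test R^2: {r2:.3f}")
```

Putting the scaler and PCA inside the pipeline means both are fit on the training split only, so the test score is not influenced by the held-out rows.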

Tools used for hyperparameter tuning: the Elbow Method for choosing eps (the DBSCAN neighborhood radius), and a Scree Plot to visualize how many principal components are needed to retain the desired share of the variance
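Both tools are usually read off a plot by eye, but the same decisions can be sketched in code. The helpers below are hypothetical (they are not from the notebook) and use only the standard library: one returns the number of leading components needed to reach a target cumulative explained-variance ratio, and the other picks the "elbow" of a curve as the point farthest from the line joining its endpoints, which is one common heuristic among several. The input lists are made-up numbers for illustration; in practice they might come from something like scikit-learn's `PCA.explained_variance_ratio_` or `KMeans.inertia_`, assuming the notebook uses scikit-learn.

```python
def components_for_variance(ratios, target=0.90):
    """Smallest number of leading components whose cumulative
    explained-variance ratio reaches `target`."""
    total = 0.0
    for count, ratio in enumerate(ratios, start=1):
        total += ratio
        if total >= target:
            return count
    return len(ratios)  # target not reachable; keep every component


def elbow_index(values):
    """Index of the point farthest from the straight line joining the
    first and last points of the curve (a simple elbow heuristic)."""
    n = len(values)
    if n < 3:
        return 0
    y0, y1 = values[0], values[-1]
    best_i, best_d = 0, -1.0
    for i, y in enumerate(values):
        # Unnormalized distance to the line; the constant denominator
        # is dropped because it does not change which point is farthest.
        d = abs((y1 - y0) * i - (n - 1) * (y - y0))
        if d > best_d:
            best_i, best_d = i, d
    return best_i


# Made-up illustrative numbers, NOT results from this project.
example_ratios = [0.45, 0.25, 0.15, 0.08, 0.04, 0.03]
example_curve = [1000.0, 400.0, 150.0, 120.0, 100.0, 90.0]  # e.g. k = 1..6

n_components = components_for_variance(example_ratios, target=0.90)
best_k = elbow_index(example_curve) + 1  # +1 because the curve starts at k = 1
print(n_components, best_k)
```

The elbow heuristic depends on the relative scales of the two axes, so it is best treated as a starting point to confirm against the plot.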


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CPSC392 Data Science Spotify Final Project

About

Releases

Packages

Languages

kashishpandey/CPSC392-Spotify

Folders and files

Latest commit

History

Repository files navigation

CPSC392 Data Science Spotify Final Project

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages