Pyspark-Spotify-Analysis

The project is built on a dataset of two CSV files available on the Kaggle platform. The data were obtained from Spotify. The main file, "track.csv", is the largest and most important: it contains information on music tracks spanning a period of 100 years. The other file, "artist.csv", contains one row per artist. Both files are compressed in data.rar. Following the suggestions of the dataset's author, we identified three main analyses to apply to the data:

Clustering: group the songs to identify a limited number of genres
Classification/Regression: to understand which features are most important in estimating the popularity of a song
Trend Analysis: to see how musical creation has changed over the years

Repository contents (111 commits):

Classification
Clustering
Data Understanding
Regression
Trend Analysis
data
plot
.gitignore
README.md
Spotify_Analysis.pdf
presentazione.pdf


carloalbe/Pyspark-Spotify-Analysis
