DM 2 Project

Project for Data Mining 2 A.A. 2020/2021

Dataset

The Dataset For Music Analysis reports data on 106,574 tracks (objects), each described by 53 attributes. These include useful information about the license type, the interest in the track, the album, and the creation of the album.

Files:

Below is the list of files along with their purpose.

Advanced clustering techniques: On a dataset already prepared for one of the previous tasks, run at least one clustering algorithm (e.g., X-Means, Bisecting K-Means, OPTICS). Discuss the results by analyzing the clusters and reporting internal validation measures (e.g., SSE, silhouette).
Transactional clustering: Using categorical features, or turning a dataset with continuous variables into one with categorical variables (e.g., by binning), run at least one clustering algorithm (e.g., K-Modes, ROCK). Discuss the results by analyzing the clusters and reporting internal validation measures (e.g., SSE, silhouette).
Sequential Pattern Mining: Convert the time series into a discrete format (e.g., using SAX) and extract the most frequent sequential patterns (of length at least 3 or 4) using different values of support, then discuss the most interesting sequences.
Time Series Analysis
Advanced Classification Methods (Naive Bayes Classifier, Logistic Regression, Rule-based Classifiers, Support Vector Machines, Neural Networks, Ensemble Methods): Evaluate each classifier with the following metrics: accuracy, precision, recall, F1-score, and ROC curve.
Imbalanced Learning and Anomaly Detection
Explainability.ipynb: Use one or more explanation methods (e.g., LIME, LORE, SHAP) to illustrate the reasons behind the classifications obtained in one of the previous tasks.
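
The SAX discretization mentioned in the Sequential Pattern Mining task can be illustrated with a minimal, standard-library-only sketch. It is not the code used in the project notebooks: the function names (`znormalize`, `paa`, `sax`), the default alphabet, and the requirement that the series length be divisible by the number of segments are assumptions made for this example. It covers only the discretization step, not the pattern mining itself.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A constant series carries no shape information; map it to zeros.
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """Piecewise Aggregate Approximation: the mean of each equal-width chunk."""
    if len(series) % n_segments != 0:
        raise ValueError("series length must be divisible by n_segments")
    width = len(series) // n_segments
    return [mean(series[i * width:(i + 1) * width]) for i in range(n_segments)]


def sax(series, n_segments, alphabet="abcd"):
    """Convert a numeric series into a SAX word of n_segments symbols."""
    size = len(alphabet)
    # Breakpoints split the standard normal distribution into equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(i / size) for i in range(1, size)]
    reduced = paa(znormalize(series), n_segments)
    # Each segment mean is mapped to the symbol of the region it falls in.
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in reduced)


if __name__ == "__main__":
    print(sax([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4))
```

The resulting words can then be passed to a sequential pattern mining algorithm; a larger alphabet or more segments preserves more detail at the cost of sparser patterns.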

Ensemble.ipynb
Explainability.ipynb
Naive_Bayes.ipynb
README.md
Regression.ipynb
Report.pdf
TS_module3_.ipynb
TS_module3_overall.ipynb
TS_shapelet.ipynb
classification_task_DT.ipynb
classification_task_KNN.ipynb
data_preparation_filled.ipynb
dimensionality_reduction.ipynb
imbalance_learning.ipynb
logistic_regression.ipynb
neural_networks.ipynb
outlier_detection.ipynb
reduced_dataset_def.ipynb
support_vector_machine.ipynb

micheleandreucci/Data-Mining-2
