This repository contains code and analysis for a homework assignment on recommendation systems and clustering algorithms in Python. It implements techniques such as MinHash, locality-sensitive hashing (LSH), feature engineering, dimensionality reduction, and K-means and DBSCAN clustering.

AmbarChatterjee/ADM_HW4_Group3

ADM_HW4_Group3 - Recommendation systems and clustering everywhere

This repository contains code and analysis for the fourth homework assignment of the Algorithmic Methods of Data Mining course.

Contents

The repository contains the following key files:

  • main.ipynb: Main Jupyter notebook containing implementation and analysis for the recommendation system and clustering tasks
  • CommandLine.sh: Bash script to execute command line tasks
  • SS.png: Screenshot of command line output
  • vodclickstream_uk_movies_03.csv: CSV file containing the dataset used

Usage

The main tasks are implemented in main.ipynb, which covers:

  • Recommendation system using MinHash and LSH
  • User clustering with feature engineering, dimensionality reduction, K-means and DBSCAN
  • The command line question and the algorithmic question
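To make the MinHash and LSH idea concrete, here is a minimal sketch in plain Python. It is not the code from main.ipynb: the function names, the toy `watched` data, and the parameter values (64 hash functions, 16 bands) are assumptions chosen for illustration. Each user's set of items is compressed into a signature of per-hash minima, and LSH banding then proposes as candidate pairs only those users whose signatures agree on at least one full band.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def _hash(seed: int, item: str) -> int:
    """Deterministic 64-bit hash of `item`, salted with `seed`."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=64):
    """Return one minimum hash value per seeded hash function.

    `items` must be a non-empty collection of strings.
    """
    return [min(_hash(seed, x) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions on which the two users agree."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def lsh_candidate_pairs(signatures, bands=16):
    """Return pairs of users whose signatures match on at least one band."""
    num_hashes = len(next(iter(signatures.values())))
    rows = num_hashes // bands
    candidates = set()
    for b in range(bands):
        buckets = defaultdict(list)
        for user, sig in signatures.items():
            # Users with identical values in this band share a bucket.
            buckets[tuple(sig[b * rows:(b + 1) * rows])].append(user)
        for users in buckets.values():
            candidates.update(combinations(sorted(users), 2))
    return candidates


# Hypothetical toy data (not taken from the dataset): user -> genres watched.
watched = {
    "u1": {"Drama", "Comedy", "Thriller"},
    "u2": {"Drama", "Comedy", "Thriller"},
    "u3": {"Documentary", "Animation"},
}
signatures = {user: minhash_signature(genres) for user, genres in watched.items()}
pairs = lsh_candidate_pairs(signatures)
```

With more bands (and so fewer rows per band), the sketch flags less similar users as candidates; with fewer bands it becomes stricter. The notebook may use different hash functions and parameters.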

The command line question is executed via CommandLine.sh and output is shown in SS.png.

Requirements

The code requires Python 3 and standard data science libraries such as pandas, NumPy, and scikit-learn.

The bash script assumes a Linux/Unix environment with common command-line utilities such as grep and wc.

Authors

  • Ambar Chatterjee
  • Elias Antoun
  • Sofia Noemi Crobeddu
  • Damian Zeller

Course project completed as part of the Algorithmic Methods of Data Mining course.

Acknowledgements

  • Course instructors
  • Dataset provided by: [source]
