husseinkorly/Myth-clusters

Cluster analysis is an important tool for unsupervised learning, but k-means requires the number of clusters K to be chosen in advance, and determining a good value is a difficult problem. Our goal is to explore the methods currently used to determine the optimal number of clusters in k-means.

### Team members:

  • Nikhila Balaji
  • Katherine Brey
  • Hussein Koprly
  • Alec McDivitt
  • Julie Persinger
  • Hunter Sipe
  • Yi Wei

#### Mentor: Bharathkumar “Tiny” Ramachandra

### Objective:

  • Learn how to approach a hard problem in unsupervised learning: devise a solution, implement it, and create and execute a structured work plan
  • Recognize challenges that arise in unsupervised learning and understand why determining the optimal number of clusters is a hard problem
  • Analyze the code structure of the k-means clustering implementation and identify its steps
  • Generate four clustering datasets with different properties
  • Model the mathematical representation of K in each of the four methods:
    1. Gap Statistic
    2. Elbow
    3. Information Theoretic
    4. Average Silhouette
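
To make the k-means steps and the dataset objective concrete, here is a minimal sketch in base R. It is not the project's implementation: `make_blobs` and `lloyd_kmeans` are hypothetical helper names, the blob centers and noise level are arbitrary choices, and the algorithm shown is the plain Lloyd iteration (assign each point to its nearest center, then recompute each center as the mean of its points).

```r
# Hypothetical helper: draw `n_per` points around each row of `centers`
# with isotropic Gaussian noise of standard deviation `sd`.
make_blobs <- function(centers, n_per = 50, sd = 0.5) {
  p <- ncol(centers)
  pts <- lapply(seq_len(nrow(centers)), function(i) {
    noise <- matrix(rnorm(n_per * p, sd = sd), ncol = p)
    sweep(noise, 2, centers[i, ], "+")
  })
  list(x = do.call(rbind, pts),
       label = rep(seq_len(nrow(centers)), each = n_per))
}

# Hypothetical helper: plain Lloyd iteration with random initial centers.
lloyd_kmeans <- function(x, k, max_iter = 100) {
  centers <- x[sample.int(nrow(x), k), , drop = FALSE]
  cluster <- rep(0L, nrow(x))
  wss_history <- numeric(0)
  for (iter in seq_len(max_iter)) {
    # Assignment step: squared distance from every point to every center.
    d2 <- sapply(seq_len(k), function(j) rowSums(sweep(x, 2, centers[j, ])^2))
    d2 <- matrix(d2, nrow = nrow(x))
    new_cluster <- max.col(-d2, ties.method = "first")
    # Within-cluster sum of squares for this assignment.
    wss_history <- c(wss_history,
                     sum(d2[cbind(seq_len(nrow(x)), new_cluster)]))
    if (all(new_cluster == cluster)) break   # assignments stable: converged
    cluster <- new_cluster
    # Update step: move each non-empty cluster's center to its mean.
    for (j in seq_len(k)) {
      if (any(cluster == j)) {
        centers[j, ] <- colMeans(x[cluster == j, , drop = FALSE])
      }
    }
  }
  list(cluster = cluster, centers = centers, wss = wss_history)
}

set.seed(42)
blobs <- make_blobs(rbind(c(0, 0), c(10, 0), c(5, 9)))
fit <- lloyd_kmeans(blobs$x, k = 3)
print(table(truth = blobs$label, found = fit$cluster))
print(round(fit$wss, 2))
```

The recorded `wss` values never increase from one iteration to the next, which is why the iteration terminates. A random start can still end in a poor local optimum, so in practice several restarts are compared.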

### Tasks - Implementations:

  • Implementing K-means
  • Gap Statistic
  • Elbow method
  • Information Theoretic
  • Average Silhouette
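
As an illustration of three of the methods listed above, the sketch below computes an elbow curve, the average silhouette width, and a gap statistic using only base R and the built-in `kmeans`. The helper names `avg_silhouette` and `gap_statistic` are hypothetical, the test data is an arbitrary three-blob example, and the gap statistic assumes a uniform reference distribution over the bounding box of the data. The repository's own implementations may differ, and the information theoretic method is not covered here.

```r
# Average silhouette width for a labelling `cl` of the rows of `x`.
# Requires at least two clusters.
avg_silhouette <- function(x, cl) {
  d <- as.matrix(dist(x))
  ids <- sort(unique(cl))
  s <- sapply(seq_len(nrow(x)), function(i) {
    own <- cl == cl[i]
    if (sum(own) == 1) return(0)              # convention for singletons
    a <- sum(d[i, own]) / (sum(own) - 1)       # mean distance within own cluster
    b <- min(sapply(ids[ids != cl[i]],         # mean distance to nearest other cluster
                    function(g) mean(d[i, cl == g])))
    (b - a) / max(a, b)
  })
  mean(s)
}

# Gap statistic: compare log(W_k) on the data with its average over
# B reference sets drawn uniformly from the data's bounding box.
gap_statistic <- function(x, k_max = 6, B = 20, nstart = 10) {
  log_w <- function(data, k) {
    log(kmeans(data, centers = k, nstart = nstart)$tot.withinss)
  }
  rng <- apply(x, 2, range)
  obs <- sapply(1:k_max, function(k) log_w(x, k))
  ref <- replicate(B, {
    u <- apply(rng, 2, function(r) runif(nrow(x), r[1], r[2]))
    sapply(1:k_max, function(k) log_w(u, k))
  })                                           # k_max x B matrix
  gap <- rowMeans(ref) - obs
  se <- apply(ref, 1, function(r) sqrt(mean((r - mean(r))^2))) * sqrt(1 + 1 / B)
  list(gap = gap, se = se)
}

set.seed(1)
centers <- rbind(c(0, 0), c(10, 0), c(5, 9))
x <- do.call(rbind, lapply(1:3, function(i) {
  cbind(rnorm(50, centers[i, 1], 0.5), rnorm(50, centers[i, 2], 0.5))
}))

# Elbow: total within-cluster sum of squares for K = 1..6.
wss <- sapply(1:6, function(k) kmeans(x, centers = k, nstart = 10)$tot.withinss)

# Average silhouette: pick the K with the largest value.
ks <- 2:6
sil <- sapply(ks, function(k) {
  avg_silhouette(x, kmeans(x, centers = k, nstart = 10)$cluster)
})
best_k_sil <- ks[which.max(sil)]

# Gap: smallest K with gap(K) >= gap(K + 1) - se(K + 1).
g <- gap_statistic(x)
ok <- which(g$gap[-length(g$gap)] >= g$gap[-1] - g$se[-1])
best_k_gap <- if (length(ok) > 0) ok[1] else NA

print(round(wss, 1))
print(round(sil, 3))
print(c(silhouette = best_k_sil, gap = best_k_gap))
```

The elbow curve is read by eye (look for the K after which `wss` stops dropping sharply), whereas the silhouette and gap criteria each give an explicit selection rule.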

#### Required Libraries:

install.packages('plot3D')
install.packages('scatterplot3d')
install.packages('car')
install.packages('pracma')
