Deep Embedded Clustering (DEC)

Introduction

This repository is a TensorFlow implementation of "Unsupervised Deep Embedding for Clustering Analysis" (Xie et al., 2016). Deep embedded clustering is a machine learning technique that combines the strengths of deep learning and clustering to automatically group data points based on their intrinsic similarities. Here's the gist:

Deep learning extracts compact, meaningful representations (embeddings) from your data, capturing underlying patterns and relationships. Imagine it as finding a more informative way to describe each data point than its raw features.

Clustering then groups similar data points together based on these embeddings. Think of it as organizing your data points into meaningful categories based on their extracted features.

(Figure: the deep embedded clustering pipeline, with panels A, B, and C described below.)

A. The original data points can be anything. In this example, they are images of handwritten digits.

B. The embeddings are lower-dimensional representations of the data points. They are created by a deep neural network.

C. The clusters are formed by grouping together similar embeddings. In this example, the clusters correspond to the different digits.

Deep embedded clustering performs these two steps jointly: the network that produces the embeddings and the cluster assignments are optimized together, so the embedding is shaped specifically for clustering. This makes it a valuable tool for exploring unlabeled data.
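The sketch below (TensorFlow 2 / Keras) illustrates the two ingredients described above: an encoder that maps raw inputs to low-dimensional embeddings, and the soft cluster assignment with its sharpened target distribution from the DEC paper. It is a minimal illustration under assumed layer sizes and made-up function names, not the code in this repository.

    import tensorflow as tf

    def build_encoder(input_dim=784, embedding_dim=10):
        """Fully connected encoder: raw features -> low-dimensional embedding z."""
        return tf.keras.Sequential([
            tf.keras.Input(shape=(input_dim,)),
            tf.keras.layers.Dense(500, activation="relu"),
            tf.keras.layers.Dense(500, activation="relu"),
            tf.keras.layers.Dense(2000, activation="relu"),
            tf.keras.layers.Dense(embedding_dim),  # the embedding z
        ])

    def soft_assignment(z, centroids, alpha=1.0):
        """Student's t kernel: soft assignment q_ij of embedding i to centroid j."""
        # Squared Euclidean distance between every embedding and every centroid.
        dist_sq = tf.reduce_sum(tf.square(tf.expand_dims(z, 1) - centroids), axis=2)
        q = tf.pow(1.0 + dist_sq / alpha, -(alpha + 1.0) / 2.0)
        return q / tf.reduce_sum(q, axis=1, keepdims=True)

    def target_distribution(q):
        """Sharpened target p_ij that emphasizes high-confidence assignments."""
        weight = tf.square(q) / tf.reduce_sum(q, axis=0)
        return weight / tf.reduce_sum(weight, axis=1, keepdims=True)

    # Training alternates between recomputing P and minimizing KL(P || Q)
    # with respect to the encoder weights and the cluster centroids:
    #   loss = tf.reduce_sum(p * tf.math.log(p / q))

As in the paper, the centroids are initialized with k-means on the pretrained embeddings and then updated by gradient descent together with the encoder.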

Training

usage: train.py [-h] [--batch-size BATCH_SIZE] [--gpu-index GPU_INDEX]

optional arguments:
  -h, --help            show this help message and exit
  --batch-size BATCH_SIZE
                        Train Batch Size
  --gpu-index GPU_INDEX
                        GPU Index Number
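For example, to train with a batch size of 256 on the first GPU (both values are illustrative, not defaults taken from the script):

    python train.py --batch-size 256 --gpu-index 0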

Visualization

inference.py computes the latent representation ($z$) and exports it as z.tsv, along with meta.tsv (label information).

usage: inference.py [-h] [--gpu-index GPU_INDEX]

optional arguments:
  -h, --help            show this help message and exit
  --gpu-index GPU_INDEX
                        GPU Index Number

For visualization, we use t-SNE by importing z.tsv and meta.tsv into TensorBoard's Embedding Projector; our example visualizations use MNIST.
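For example (the GPU index is again an illustrative value):

    python inference.py --gpu-index 0

The exported z.tsv (one embedding per row) and meta.tsv (one label per row) can then be loaded into TensorBoard's Embedding Projector, for instance via the standalone projector at https://projector.tensorflow.org, which accepts tab-separated vector and metadata files, and projected with t-SNE.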
