Improved Image Compression with KMeans

Image compression is a technique that reduces the size of an image file without significantly affecting its quality. One of the methods for image compression is k-means clustering, which partitions the image pixels into k clusters based on their colors. Each pixel is then assigned to the cluster with the closest mean color, and the image is reconstructed using only k colors. This reduces the amount of information needed to store and transmit the image, while preserving its main features. This method of compression has a downside as it compresses an image indiscriminately ie, without taking care of regions with high or low detail(spatial frequencies).

In this work, I seek to enhance the quality of image compression based on KMeans clustering. The main idea is to segment the image into regions that differ in their level of detail and apply different compression rates to each region. This way, the regions with high detail are preserved better, while the regions with low detail are compressed more efficiently.

1. Models

Compression based on Eigenvalues of Covariant matrices
Compression based on Total varaintion in Amplitude Space

1.1 Compression based on Eigenvalues of Covariant matrices

This method involves the following steps:

Split the image into many rectangular blocks or chunks
Compute the eigenvalues of the covariance matrices of each block
Use the eigenvalues to measure the variance in each block
Classify the blocks into high-variance and low-variance regions based on the mean eigenvalue of each block
Choose the median eigenvalue as the threshold for the classification
Compress the low variance regions with $k_1$ clusters and high variance regions with $k_2$ clusters, where $k_1 < k_2$

1.2 Compression based on Total varaintion in Amplitude Space

Transform the image to grayscale and obtain its amplitude space.
Divide the image into rectangular blocks of equal size.
Compute the variance within each block using one of two possible metrics
- The average statistical variance in the amplitude within each block
- The norm of the two dimensional image gradient within each image block, $g_i(x,y)$ $$TV=|\nabla{g_i} |$$
Choose the median metric as the threshold for the classification
Compress the low variance regions with $k_1$ clusters and high variance regions with $k_2$ clusters, where $k_1 < k_2$

2. Notable Observations and Results

Test Image Used:

2.1 Variance Classification Performance

Eigenvalue Classification	Amplitude Variation Classification

Here the black pixels represents low variance regions
Both the methods appears to work well with the test image
Eigenvalue method should be generally preferred since Amplitude variation cannot account for changes in colors with same amplitude intensities

2.2 Compression Performance

No. of clusters($k_1$)	Regular KMeans	Eigenvalue Enhanced	Amp Variation Enhanced	Compression Rate
32				70%
64				72%
128				73%

Here $k2 =256$ for all cases
We do see noticeable improvements in the picture quality with the enhanced methods at the same compression rate

For quantitative measurement of compression quality, I am electing to use PSNR(Peak Signal-to-Noise Ratio) and SSIM(Structural Similarity Index).

PSNR is calculated by using the mean squared error (MSE) between the original and distorted images. A higher PSNR value indicates a higher similarity between the images.
SSIM is designed to mimic the human visual system and capture the perceptual quality of images. SSIM is calculated by using local statistics of pixel intensities in a sliding window. A higher SSIM value indicates a higher structural similarity between the images.

For the given image, PSNR and SSIM values show significant improvement going from Regular KMeans to Enhanced Models

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
data		data
Compress_img_notebook.ipynb		Compress_img_notebook.ipynb
README.md		README.md
compress_images.py		compress_images.py
compress_utils.py		compress_utils.py
run.py		run.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Improved Image Compression with KMeans

1. Models

1.1 Compression based on Eigenvalues of Covariant matrices

1.2 Compression based on Total varaintion in Amplitude Space

2. Notable Observations and Results

2.1 Variance Classification Performance

2.2 Compression Performance

About

Releases 1

Packages

Languages

pmephin/Improved-Image-Compression-with-KMeans

Folders and files

Latest commit

History

Repository files navigation

Improved Image Compression with KMeans

1. Models

1.1 Compression based on Eigenvalues of Covariant matrices

1.2 Compression based on Total varaintion in Amplitude Space

2. Notable Observations and Results

2.1 Variance Classification Performance

2.2 Compression Performance

About

Topics

Resources

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages