Implementation of various concepts around Digital Media (Image/Video) Processing (DMP) topics.
- Introduction
  - How to read/plot an image using matplotlib package
  - Image properties: color, dtype, depth, resolution, ...
- Basic Modifications
  - Crop, Flip, Circular Shift, Rotation
- Interpolations
  - Nearest Neighbor, BiLinear, BiCubic, Lanczos interpolation
- Intensity Transformation
  - Negative, Logarithm, Power-Law (Gamma correction), Piecewise-Linear Transform
- Histogram
  - Histogram Stretching, Shrinking, Sliding
  - Global Histogram Equalization
  - Local Histogram Equalization (Adaptive Histogram Equalization)
  - Adaptive Contrast Enhancement (ACE)
  - Histogram Matching (Specification)
- Convolution
  - 1D Convolution
  - 2D Convolution (GrayScale/RGB image)
- Fourier Transform
  - Basis vectors(1D)/images(2D)
  - Forward/Backward Fourier Transform
  - Fast Fourier Transform (FFT)
  - Ideal Low-Pass filter
  - Cardinal Sine (sinc) filter
  - Ringing Effect
  - Shift, Rotation, Flip effect in frequency domain
  - Image sharpening using a gaussian high-pass filter
  - Periodic noise removal
- Cosine Transform
  - Basis vectors(1D)/images(2D)
  - Forward/Backward Cosine Transform
  - Compression Effect (DFT vs DCT)
  - Zonal Masking
- Quality Assessment
  - Mean Squared Error (MSE)
  - Signal-to-Noise Ratio (SNR)
  - Peak Signal-to-Noise Ratio (PSNR)
  - Structural Similarity Index (SSIM)
  - Root Mean Square Error (RMSE)
  - Mean Absolute Error (MAE)
  - Mean Structural Similarity Index (MSSIM)
  - Visual Information Fidelity (VIF)
  - Feature Similarity Index (FSIM)
  - Multi-Scale Structural Similarity Index (MS-SSIM)
- Steganography
  - Steganography using least significant bits
- JPEG codec
  - JPEG Encoder & Decoder
- MPEG codec
  - MPEG Encoder & Decoder
- Image Registration
  - Aligning multiple images into a common coordinate system
- Image Stitching
  - Combining multiple images to create a single larger image [Panorama]
- Optical Flow
  - Optical Flow using Lucas-Kanade & Farneback algorithms
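
To give a flavor of the Quality Assessment topics listed above, here is a minimal NumPy sketch of MSE and PSNR. It is not the code from the notebooks: the function names `mse` and `psnr`, the `max_value` parameter, and the synthetic test image are assumptions made for illustration only.

```python
import numpy as np


def mse(reference: np.ndarray, distorted: np.ndarray) -> float:
    """Mean squared error between two images of the same shape."""
    ref = reference.astype(np.float64)
    dist = distorted.astype(np.float64)
    return float(np.mean((ref - dist) ** 2))


def psnr(reference: np.ndarray, distorted: np.ndarray, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(reference, distorted)
    if error == 0.0:
        return float("inf")
    return float(10.0 * np.log10(max_value**2 / error))


# Synthetic 8-bit image and a noisy copy, so no image file is needed
rng = np.random.default_rng(seed=0)
image = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
noise = rng.normal(0.0, 5.0, size=image.shape)
noisy = np.clip(image.astype(np.float64) + noise, 0, 255).astype(np.uint8)

print(f"MSE : {mse(image, noisy):.2f}")
print(f"PSNR: {psnr(image, noisy):.2f} dB")
```

Casting to `float64` before subtracting avoids the wrap-around that would occur when subtracting `uint8` arrays directly. The default `max_value` of 255 assumes 8-bit images; for images scaled to [0, 1] it would be 1.0.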
Implementation of several concepts utilized in the main notebooks
- Padding and Convolution
  - Provides utility functions for 1D and 2D convolution, along with flexible padding options.
- DCT Implementation
  - Discrete Cosine Transform (DCT) implementation for 1D and 2D signals.
- DFT Implementation
  - Implementations of 1D and 2D Discrete Fourier Transform (DFT) and related functions.
- Filter Functions Implementation
  - Implementations of various 2D filter functions including ideal, Gaussian, sinc, Butterworth, Chebyshev, Bessel, and block masks.
- JPEG Codec Implementation
  - Class-based implementation of JPEG encoder and decoder using discrete cosine transform (DCT) and quantization.
- MPEG Codec Implementation
  - Class-based implementation of MPEG encoder and decoder using discrete cosine transform (DCT) and quantization.
- Quality Assessment Metrics
  - Implementation of famous metrics e.g. MSE, SNR, PSNR, SSIM, RMSE, MAE, ...
- Spatial Modifications
  - Implementation of concepts in spatial domain e.g. interpolations and histograms
- Steganography
  - Implementation of a simple steganography method using Least Significant Bits (LSB)
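
The LSB idea mentioned in the last item can be sketched in a few lines of NumPy. This is an illustrative example under stated assumptions, not the module shipped in this repository: the names `embed_lsb` and `extract_lsb` are hypothetical, the cover is assumed to be a `uint8` array, and the receiver is assumed to know the message length in bytes.

```python
import numpy as np


def embed_lsb(cover: np.ndarray, message: bytes) -> np.ndarray:
    """Hide `message` in the least significant bits of a uint8 image."""
    if cover.dtype != np.uint8:
        raise TypeError("cover image must have dtype uint8")
    bits = np.unpackbits(np.frombuffer(message, dtype=np.uint8))
    if bits.size > cover.size:
        raise ValueError("message is too long for this cover image")
    stego = cover.flatten()  # flatten() returns a copy, so `cover` is untouched
    # Clear the LSB of each target pixel, then write one message bit into it
    stego[: bits.size] = (stego[: bits.size] & 0xFE) | bits
    return stego.reshape(cover.shape)


def extract_lsb(stego: np.ndarray, num_bytes: int) -> bytes:
    """Read `num_bytes` bytes back from the least significant bits."""
    bits = stego.flatten()[: num_bytes * 8] & 1
    return np.packbits(bits).tobytes()


# Synthetic cover image, so no image file is needed
rng = np.random.default_rng(seed=0)
cover = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
secret = b"hello"
stego = embed_lsb(cover, secret)
print(extract_lsb(stego, len(secret)))
```

Because only the lowest bit of each pixel is modified, every pixel of the stego image differs from the cover by at most 1, which is why the change is usually invisible to the eye.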
- Programming Fundamentals
  - Proficiency in Python (data types, control structures, functions, etc.).
    - My Python Workshop: github.com/mr-pylin/python-workshop
  - Experience with libraries like NumPy, Matplotlib and OpenCV.
    - My NumPy Workshop: github.com/mr-pylin/numpy-workshop
    - My Matplotlib Workshop: Coming Soon
    - My OpenCV Workshop: Coming Soon
- Digital Signal Processing Knowledge
- Mathematics for Image Processing
  - Linear Algebra: Understanding of vectors, matrices, and matrix operations, crucial for transformations, convolutions, and Fourier analysis.
    - Linear Algebra Review and Reference written by Zico Kolter
    - Notes on Linear Algebra written by Peter J. Cameron
    - MATH 233 - Linear Algebra I Lecture Notes written by Cesar O. Aguilar
  - Probability & Statistics: Probability distributions, mean/variance, etc.
- 04: Adaptive Contrast Enhancement (ACE)
- 04: Histogram Matching (Specification)
- 14: Sparse Optical Flow using Lucas-Kanade
This project was developed using Python v3.12.3. If you encounter issues running the specified versions of the dependencies, consider using this specific Python version.
You can install all dependencies listed in requirements.txt using pip:
pip install -r requirements.txt
- Open the root folder with VS Code
  - Windows/Linux: Ctrl + K followed by Ctrl + O
  - macOS: Cmd + K followed by Cmd + O
- Open .ipynb files using the Jupyter extension integrated with VS Code
- Allow VS Code to install any recommended dependencies for working with Jupyter Notebooks.
- Note: Jupyter is integrated with both VS Code & Google Colab
- ffmpeg & ffprobe:
  - ffmpeg is a Swiss Army knife for media, converting and manipulating audio and video files in a wide range of formats.
  - Link: github.com/BtbN/FFmpeg-Builds
- YUV4MPEG Videos:
  - Derf's video collection provides uncompressed YUV4MPEG clips for testing video codecs.
  - Link: media.xiph.org/video/derf
- Video Quality Measurement Tool (VQMT):
  - A software program designed to analyze the quality of digital video and images.
  - Link: compression.ru/video/quality_measure
- yuv-player:
  - A lightweight YUV player that supports various YUV formats.
  - Link: github.com/Tee0125/yuvplayer
- H.264 (AVC) codec:
  - The most widely used video compression standard, offering high quality at low bitrates.
  - Link: vcgit.hhi.fraunhofer.de/jvet/JM
- H.265 (HEVC) codec:
  - Successor to H.264, offering better compression: higher quality at the same bitrate, or the same quality at a lower bitrate.
  - Link: vcgit.hhi.fraunhofer.de/jvet/HM
- H.266 (VVC) codec:
  - The latest video compression standard, offering significant efficiency improvements over H.265 for high-resolution streaming and future video applications.
  - Encoder link: github.com/fraunhoferhhi/vvenc
  - Decoder link: github.com/fraunhoferhhi/vvdec
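
When working with the YUV4MPEG clips mentioned above, it can help to inspect the stream header before decoding frames. The sketch below parses the first line of a YUV4MPEG2 stream and computes the size of one 8-bit 4:2:0 frame. The helper names `parse_y4m_header` and `frame_size_420_8bit` and the sample header are hypothetical, and falling back to 4:2:0 when no `C` tag is present is an assumption of this sketch.

```python
def parse_y4m_header(header_line: bytes) -> dict:
    """Parse the first line of a YUV4MPEG2 stream into a small dict."""
    tokens = header_line.rstrip(b"\n").decode("ascii").split(" ")
    if tokens[0] != "YUV4MPEG2":
        raise ValueError("missing YUV4MPEG2 signature")

    info = {"chroma": "420"}  # assumption: 4:2:0 when no C tag is given
    for token in tokens[1:]:
        if not token:
            continue
        tag, value = token[0], token[1:]
        if tag == "W":
            info["width"] = int(value)
        elif tag == "H":
            info["height"] = int(value)
        elif tag == "F":
            numerator, denominator = value.split(":")
            info["fps"] = (int(numerator), int(denominator))
        elif tag == "C":
            info["chroma"] = value
        # Other tags (I: interlacing, A: pixel aspect, X: comments) are ignored
    return info


def frame_size_420_8bit(info: dict) -> int:
    """Bytes per frame: one full Y plane plus two subsampled chroma planes."""
    if info["chroma"] not in {"420", "420jpeg", "420mpeg2", "420paldv"}:
        raise ValueError(f"unsupported chroma format: {info['chroma']}")
    width, height = info["width"], info["height"]
    chroma_plane = ((width + 1) // 2) * ((height + 1) // 2)
    return width * height + 2 * chroma_plane


# Hypothetical header line for a CIF-sized clip
header = b"YUV4MPEG2 W352 H288 F30000:1001 Ip A128:117 C420mpeg2\n"
info = parse_y4m_header(header)
print(info["width"], info["height"], info["fps"], info["chroma"])
print(frame_size_420_8bit(info))
```

For a real clip, the header line could be obtained with `readline()` on a file opened in binary mode; each frame that follows is preceded by its own line starting with `FRAME`.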
- NumPy:
  - A fundamental package for scientific computing in Python, providing support for arrays, matrices, and a large collection of mathematical functions.
  - Official site: numpy.org
- Matplotlib:
  - A comprehensive library for creating static, animated, and interactive visualizations in Python.
  - Official site: matplotlib.org
- OpenCV:
  - A powerful library for computer vision and image processing, supporting real-time operations on images and videos in Python and other languages.
  - Official site: opencv.org
Any mistakes, suggestions, or contributions? Feel free to reach out to me at:
I look forward to connecting with you! 🏃♂️
- Digital Image Processing by Gonzalez & Woods:
  - The images located in the ./assets/images/dip_3rd/ folder are licensed as listed in the table below.
  - Resources are available for personal educational or research purposes at imageprocessingplace.com.
Image | Copyright Owner | Address |
---|---|---|
CH02_Fig0222(b)(cameraman).tif | Massachusetts Institute of Technology | MIT.edu |
CH03_Fig0309(a)(washed_out_aerial_image).tif | NASA | nasa.gov |
CH03_Fig0326(a)(embedded_square_noisy_512).tif | - | imageprocessingplace.com |
CH03_Fig0354(a)(einstein_orig).tif | Public domain | - |
CH06_Fig0638(a)(lenna_RGB).tif | Public domain | - |
CH06_FigP0606(color_bars).tif | - | - |
- Third-Party Assets:
  - Additional images located in ./assets/images/third_party/ are used with permission or according to their original licenses.
  - Attributions and references to original sources are included in the code where these images are used.
Image | Copyright Owner | Address |
---|---|---|
nature_1.jpg | - | pexels.com |
nature_2.jpg | - | pexels.com |
- Miscellaneous assets:
Image | Copyright Owner | Address |
---|---|---|
keyboard_1.jpg | Amirhossein Heydari | github.com/mr-pylin |
keyboard_2.jpg | Amirhossein Heydari | github.com/mr-pylin |
test.tif | Amirhossein Heydari | github.com/mr-pylin |
This project is licensed under the Apache License 2.0.
You are free to use, modify, and distribute this code, but you must include copies of both the LICENSE and NOTICE files in any distribution of your work.
Note: Assets listed in the tables above may have their own licenses.