The MNIST dataset is a benchmark in the field of computer vision and machine learning. It contains 60,000 training images and 10,000 testing images of handwritten digits. Our goal is to develop models that accurately classify these digits. We explore Logistic Regression, a Multilayer Perceptron (MLP), and the LeNet-5 Convolutional Neural Network, all implemented using PyTorch.
Handwritten digit recognition is a fundamental problem in computer vision and machine learning. In this project, we develop and compare several PyTorch models that identify handwritten digits from the MNIST dataset. Our objective is to build digit classifiers that achieve high accuracy on the MNIST test set.
- Introduce the MNIST dataset and its significance in the field of machine learning.
- Explain the problem of handwritten digit recognition and its applications.
- Provide instructions for downloading and loading the MNIST dataset using PyTorch’s DataLoader.
- Preprocess the dataset, including normalization and data augmentation techniques.
The MNIST dataset consists of grayscale images of size 28x28 pixels, each representing a digit from 0 to 9. The dataset is split into:
- Training set: 60,000 images
- Test set: 10,000 images
Logistic Regression is a simple linear model for binary and multi-class classification problems. For MNIST, it treats each of the 784 pixels as an input feature and produces one score per digit class.
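
A minimal sketch of such a model in PyTorch is shown below. The class name `LogisticRegression` and the dummy input batch are illustrative assumptions. The model returns raw logits because `nn.CrossEntropyLoss` applies log-softmax internally during training.

```python
import torch
from torch import nn


class LogisticRegression(nn.Module):
    """Multinomial logistic regression: one linear layer from 784 pixels to 10 class scores."""

    def __init__(self, in_features=28 * 28, num_classes=10):
        super().__init__()
        self.flatten = nn.Flatten()  # (N, 1, 28, 28) -> (N, 784)
        self.linear = nn.Linear(in_features, num_classes)

    def forward(self, x):
        # Raw logits; apply softmax only when probabilities are needed.
        return self.linear(self.flatten(x))


model = LogisticRegression()
logits = model(torch.zeros(4, 1, 28, 28))  # dummy batch of 4 blank images
probs = torch.softmax(logits, dim=1)       # per-class probabilities for each image
```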
A Multilayer Perceptron (MLP) is a feedforward neural network with one or more hidden layers. For MNIST, our MLP model consists of:
- Input layer with 784 units (one for each pixel)
- One or more hidden layers with ReLU activation
- Output layer with 10 units and softmax activation
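
The layer layout above could be sketched in PyTorch as follows. The hidden sizes `(128, 64)` are an assumption chosen for illustration, since the text does not fix the number or width of the hidden layers. Softmax is left out of `forward` because `nn.CrossEntropyLoss` expects logits; it can be applied afterwards to obtain probabilities.

```python
import torch
from torch import nn


class MLP(nn.Module):
    """784 inputs -> hidden layers with ReLU -> 10 output logits."""

    def __init__(self, hidden_sizes=(128, 64), num_classes=10):
        super().__init__()
        layers = [nn.Flatten()]  # (N, 1, 28, 28) -> (N, 784)
        in_features = 28 * 28
        for hidden in hidden_sizes:
            layers += [nn.Linear(in_features, hidden), nn.ReLU()]
            in_features = hidden
        layers.append(nn.Linear(in_features, num_classes))  # output layer, 10 units
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)


mlp = MLP()
mlp_logits = mlp(torch.zeros(4, 1, 28, 28))    # dummy batch of 4 blank images
mlp_probs = torch.softmax(mlp_logits, dim=1)   # softmax over the 10 classes
```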
LeNet-5 is a classic CNN architecture proposed by Yann LeCun and colleagues. It consists of:
- Two convolutional layers
- Two subsampling (pooling) layers
- Two fully connected layers
- Output layer with softmax activation
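
One way to express this architecture in PyTorch is sketched below. Several details are assumptions rather than something stated here: `padding=2` in the first convolution adapts the network to 28x28 MNIST inputs (the original design used 32x32 inputs), and tanh activations with average pooling follow the classic formulation, though many modern variants use ReLU and max pooling instead. As with the other models, the network returns logits and softmax is applied by the loss function or at inference time.

```python
import torch
from torch import nn


class LeNet5(nn.Module):
    """LeNet-5 style CNN adapted to 1x28x28 inputs."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5, padding=2),  # 1x28x28 -> 6x28x28
            nn.Tanh(),
            nn.AvgPool2d(kernel_size=2),                # -> 6x14x14
            nn.Conv2d(6, 16, kernel_size=5),            # -> 16x10x10
            nn.Tanh(),
            nn.AvgPool2d(kernel_size=2),                # -> 16x5x5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                               # -> 400
            nn.Linear(16 * 5 * 5, 120),
            nn.Tanh(),
            nn.Linear(120, 84),
            nn.Tanh(),
            nn.Linear(84, num_classes),                 # output logits
        )

    def forward(self, x):
        return self.classifier(self.features(x))


lenet = LeNet5()
dummy = torch.zeros(2, 1, 28, 28)          # dummy batch of 2 blank images
feature_maps = lenet.features(dummy)
lenet_logits = lenet(dummy)
```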
Contributions are welcome! If you'd like to contribute to this project, please fork the repository and create a pull request with your improvements.