
Apply some common classifiers to the popular Fashion-MNIST dataset


dimcel/Fashion_MNIST_Playground


Fashion-MNIST Image Classification with Popular Classifiers

Overview

This repository demonstrates popular machine learning classifiers on Fashion-MNIST, a collection of grayscale images of clothing articles. Fashion-MNIST is widely used as a benchmark and as a drop-in replacement for the original MNIST handwritten digits, making it an excellent starting point for exploring various classifiers.

Classifiers Used

The following classifiers are implemented and compared in this repository:

  1. Logistic Regression: A linear classifier that is simple yet effective.

  2. k-Nearest Neighbors (k-NN): A non-parametric method based on the similarity of data points.

  3. Support Vector Machine (SVM): A margin-based classifier that, with a suitable kernel, handles both linearly and non-linearly separable data.

  4. Various Ensemble Methods: Models that combine many base learners, for example a multitude of decision trees.
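As a rough illustration of how these four families could be compared side by side, here is a minimal sketch using scikit-learn. The function names `build_classifiers` and `compare_classifiers`, the hyperparameter values, and the choice of a random forest to represent the ensemble family are assumptions for this example, not the notebooks' actual code.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC


def build_classifiers(random_state=0):
    """Return one untuned model per classifier family (hypothetical defaults)."""
    return {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "knn": KNeighborsClassifier(n_neighbors=5),
        "svm_rbf": SVC(kernel="rbf"),
        # Assumption: a random forest stands in for the ensemble methods.
        "random_forest": RandomForestClassifier(
            n_estimators=100, random_state=random_state
        ),
    }


def compare_classifiers(X_train, y_train, X_test, y_test):
    """Fit every classifier and return a {name: test accuracy} dict."""
    scores = {}
    for name, model in build_classifiers().items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores
```

Calling `compare_classifiers` with flattened image arrays of shape `(n_samples, 784)` returns one accuracy per model, which makes it easy to print or plot the comparison.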

Dataset

The Fashion-MNIST dataset consists of 28x28 pixel grayscale images of clothing articles in 10 classes (labels 0 through 9), split into 60,000 training and 10,000 test images. It is a common dataset for introducing image classification concepts.

Download the data: https://www.kaggle.com/datasets/zalando-research/fashionmnist
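Once downloaded, the data can be read into arrays. The sketch below assumes the Kaggle CSV layout: a header row, the label in the first column, and 784 pixel columns after it. The helper name `load_fashion_csv` is hypothetical. The class names follow the standard Fashion-MNIST label order.

```python
import numpy as np

# Standard Fashion-MNIST label order (index = label value).
CLASS_NAMES = (
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
)


def load_fashion_csv(path):
    """Load a CSV with a header row, label in column 0, pixels in the rest.

    Returns (pixels, labels) with pixels scaled from 0..255 to 0..1.
    """
    data = np.loadtxt(path, delimiter=",", skiprows=1, dtype=np.float32)
    data = np.atleast_2d(data)  # keep 2-D shape even for a one-row file
    labels = data[:, 0].astype(np.int64)
    pixels = data[:, 1:] / 255.0
    return pixels, labels
```

For example, `X_train, y_train = load_fashion_csv("fashion-mnist_train.csv")`, assuming that is the file name in your download. `np.loadtxt` keeps the example dependency-light, but `pandas.read_csv` is usually faster on files of this size.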

Getting Started

  1. Install Dependencies:
    pip install -r requirements.txt

Instructions in Each Notebook

Each notebook contains detailed instructions and explanations for the following steps:

  • Loading and preprocessing the Fashion-MNIST dataset.
  • Implementing and training the respective classifier.
  • Evaluating the model's performance.
  • Fine-tuning and optimizing parameters (where applicable).
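The preprocessing, training, evaluation, and tuning steps above can be sketched as a single scikit-learn pipeline wrapped in a grid search. This is a minimal sketch using k-NN as the example model. The function name `tune_knn`, the parameter grid, and the use of `StandardScaler` are assumptions, and the notebooks may proceed differently.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def tune_knn(X_train, y_train, cv=3):
    """Cross-validate a small k-NN grid and return the fitted search object."""
    pipeline = Pipeline([
        ("scale", StandardScaler()),      # preprocessing step
        ("knn", KNeighborsClassifier()),  # classifier being tuned
    ])
    # Hypothetical grid; widen or narrow it depending on available compute.
    param_grid = {
        "knn__n_neighbors": [1, 3, 5],
        "knn__weights": ["uniform", "distance"],
    }
    search = GridSearchCV(pipeline, param_grid, cv=cv, scoring="accuracy")
    search.fit(X_train, y_train)
    return search
```

After fitting, `search.best_params_` holds the selected settings, and `search.score(X_test, y_test)` evaluates the refitted best pipeline on held-out data. Putting the scaler inside the pipeline means it is refit on each training fold, so no validation-fold statistics leak into training.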

Results and Discussion

The results and comparative analysis of each classifier are provided in the notebooks. Feel free to experiment with different hyperparameters, preprocessing techniques, or even explore additional classifiers.

Outcome

In our analysis, the Support Vector Machine (SVM) with a radial basis function (RBF) kernel was the top-performing classifier, as expected. We also applied Principal Component Analysis (PCA) for dimensionality reduction before the SVM, keeping enough components to retain 95% of the variance in the data. This significantly reduced training time compared to using all features, showing that dimensionality reduction can make the SVM considerably cheaper to train.
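A PCA-then-SVM setup of this kind might look like the following sketch. In scikit-learn, passing a float between 0 and 1 as `n_components` makes PCA keep the smallest number of components whose cumulative explained variance exceeds that fraction. The function name `build_pca_svm` is hypothetical, the SVM uses library defaults rather than the notebooks' tuned values, and inputs are assumed to be already scaled (for example, pixels in 0..1).

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def build_pca_svm(variance=0.95):
    """Return a pipeline: PCA keeping `variance` of the variance, then RBF SVM."""
    return make_pipeline(
        PCA(n_components=variance),  # float in (0, 1) selects by variance
        SVC(kernel="rbf"),           # default C and gamma (an assumption)
    )
```

After `model = build_pca_svm().fit(X_train, y_train)`, the number of retained components is available as `model.named_steps["pca"].n_components_`, which shows how far the original 784 features were reduced.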

Conclusion

This repository serves as a practical guide for implementing and comparing various classifiers on the Fashion-MNIST dataset, offering hands-on experience with popular models.
