A Simple Vision Transformer (ViT) Implementation in Tinygrad

This repository contains a minimalist implementation of the Vision Transformer (ViT) model using tinygrad.

What is a ViT?

The Vision Transformer (ViT) is a model introduced by Google Research that applies the transformer architecture to image classification. Unlike traditional convolutional neural networks (CNNs), ViT splits an image into fixed-size patches and processes them as a sequence of tokens, much as a language model processes the words of a sentence.
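
To make the patch idea concrete, here is a minimal sketch of the splitting step. It uses plain Python lists so it runs without any dependencies; the function name `patchify` and the toy 4x4 image are illustrative assumptions and are not taken from this repository, whose code works on tinygrad tensors and may be organized differently.

```python
def patchify(image, patch_size):
    """Split an H x W x C image (nested lists) into flattened patches.

    Returns (H // p) * (W // p) patches in row-major order, each a flat
    list of p * p * C values. Hypothetical helper, for illustration only.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch: rows first, then pixels, then channels.
            patch = [
                value
                for row in image[top:top + patch_size]
                for pixel in row[left:left + patch_size]
                for value in pixel
            ]
            patches.append(patch)
    return patches


# A toy 4x4 single-channel image whose pixel values are 0..15.
image = [[[r * 4 + c] for c in range(4)] for r in range(4)]
patches = patchify(image, patch_size=2)
print(len(patches), len(patches[0]))  # 4 4
print(patches[0])                     # [0, 1, 4, 5]
```

In a full ViT, each flattened patch would then be linearly projected to an embedding vector and combined with a position embedding before entering the transformer encoder. The sketch stops at the splitting step.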

Installation

To get started, clone this repository and install the required dependencies:

git clone https://github.com/EthanBnntt/tinygrad-vit.git
cd tinygrad-vit
pip install -r requirements.txt

Usage

python train.py