vit

Here are 340 public repositories matching this topic...

lukas-blecher / LaTeX-OCR

pix2tex: Using a ViT to convert images of equations into LaTeX code.

python machine-learning ocr latex deep-learning image-processing pytorch dataset transformer vit image2text im2text im2latex im2markup math-ocr vision-transformer latex-ocr

Updated Dec 5, 2024
Python

cmhungsteve / Awesome-Transformer-Attention

A comprehensive paper list on Vision Transformer/Attention, including papers, code, and related websites

computer-vision deep-learning transformers transformer awesome-list vit papers attention-mechanism attention-mechanisms self-attention transformer-architecture transformer-models detr vision-transformer transformer-cv transformer-with-cv transformer-awesome visual-transformer

Updated Jul 30, 2024

towhee-io / towhee

Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

machine-learning computer-vision pipeline image-processing embeddings transformer video-processing feature-extraction convolutional-networks vit feature-vector image-retrieval unstructured-data embedding-vectors milvus vision-transformer towhee llm

Updated Oct 18, 2024
Python

hila-chefer / Transformer-Explainability

[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer based networks.

deep-learning vit bert perturbation attention-visualization bert-model explainability attention-matrix vision-transformer transformer-interpretability visualize-classifications cvpr2021

Updated Jan 24, 2024
Jupyter Notebook

open-compass / VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks

computer-vision evaluation pytorch gemini openai vqa vit gpt multi-modal clip claude openai-api gpt4 large-language-models llm chatgpt llava qwen gpt-4v

Updated Dec 20, 2024
Python

roboflow / inference

A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.

Updated Dec 21, 2024
Python

BR-IDL / PaddleViT

🤖 PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+

computer-vision deep-learning detection cv transformer gan classification segmentation object-detection mlp vit semantic-segmentation encoder-decoder paddlepaddle

Updated Sep 7, 2022
Python

yitu-opensource / T2T-ViT

ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

vit vision-transformer t2t-transformer

Updated Oct 27, 2023
Jupyter Notebook

Yangzhangcst / Transformer-in-Computer-Vision

A paper list of some recent Transformer-based CV works.

awesome computer-vision deep-learning transformer vit papers detr transformer-cv transformer-awesome

Updated Dec 21, 2024

sail-sg / Adan

Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models

Updated Jul 2, 2024
Python

v-iashin / video_features

Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.

Updated Oct 26, 2024
Python

chinhsuanwu / mobilevit-pytorch

A PyTorch implementation of "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer"

vit mobilenetv2 vision-transformer mobilevit

Updated Jan 16, 2022
Python

zgcr / SimpleAICV_pytorch_training_examples

SimpleAICV: PyTorch training and testing examples.

sam pytorch resnet vit darknet dino mae kd retinanet deeplabv3plus yolact fcos centernet u2net solov2 ttfnet repvgg regnetx segment-anything

Updated Nov 25, 2024
Jupyter Notebook

vatz88 / FFCSonTheGo

FFCS course registration made hassle free for VITians. Search courses and visualize the timetable on the go!

javascript timetable ffcs vit hacktoberfest vellore

Updated Nov 11, 2024
JavaScript

gupta-abhay / pytorch-vit

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

transformers image-classification image-recognition vit vision-transformer hybrid-vit

Updated Oct 1, 2021
Python

PaddlePaddle / PASSL

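PASSL includes self-supervised image learning algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, SimSiam, SwAV, BEiT, and MAE, as well as foundational vision algorithms such as Vision Transformer, DeiT, Swin Transformer, CvT, T2T-ViT, MLP-Mixer, XCiT, ConvNeXt, and PVTv2.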

deep-learning vit clip paddle pvt mae moco self-supervised-learning cvt simclr beit vision-transformer deit pixpro moco-v2 swav swin-transformer xcit convnext

Updated Aug 1, 2023
Python

i. A practical application of the Transformer (ViT) to 2-D physiological signal (EEG) classification tasks; it could also be tried with EMG, EOG, ECG, etc. ii. Includes attention over both the spatial dimension (channel attention) and the temporal dimension. iii. Common spatial pattern (CSP), an efficient feature enhancement method, implemented in Python.

deep-learning eeg transformer attention vit attention-mechanism physiological-signals common-spatial-pattern eeg-classification