
Fastformer Implementation in PyTorch

This project implements the Fastformer architecture proposed in the paper Fastformer: Additive Attention Can Be All You Need. Fastformer is designed as a more efficient alternative to the standard Transformer: it replaces pairwise self-attention, which is quadratic in the sequence length, with additive attention, which summarizes the sequence into global query and key vectors and runs in linear time.
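
As a rough illustration of the core idea, below is a minimal single-head sketch of additive attention in PyTorch. The module and layer names are illustrative choices, not the exact code in the notebook.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdditiveAttention(nn.Module):
    """Single-head Fastformer-style additive attention (illustrative sketch)."""

    def __init__(self, dim):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)
        self.w_q = nn.Linear(dim, 1)  # scores each query for the global summary
        self.w_k = nn.Linear(dim, 1)  # scores each query-aware key
        self.out = nn.Linear(dim, dim)

    def forward(self, x):
        # x: (batch, seq_len, dim)
        q, k, v = self.to_q(x), self.to_k(x), self.to_v(x)
        scale = q.size(-1) ** 0.5

        # Global query: a softmax-weighted sum over the sequence, O(N) not O(N^2).
        alpha = F.softmax(self.w_q(q) / scale, dim=1)      # (B, N, 1)
        global_q = (alpha * q).sum(dim=1, keepdim=True)    # (B, 1, D)

        # Fold the global query into every key by element-wise product,
        # then summarize the result into a global key the same way.
        p = k * global_q                                   # (B, N, D)
        beta = F.softmax(self.w_k(p) / scale, dim=1)       # (B, N, 1)
        global_k = (beta * p).sum(dim=1, keepdim=True)     # (B, 1, D)

        # Fold the global key into the values, transform, and add the
        # query residual, as in the paper.
        u = v * global_k                                   # (B, N, D)
        return self.out(u) + q
```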

Contents

  1. Data Preparation: Preprocessing and tokenization of the AG_NEWS dataset using TorchText (see the data-loading sketch after this list).

  2. Model Architecture: Implementation of the Fastformer and traditional Transformer models.

  3. Training: Training loop, including hyperparameter settings and optimization routines (a minimal loop is sketched after this list).

  4. Evaluation: Performance metrics and comparisons between Fastformer and traditional Transformer.
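
A hedged sketch of the data-loading step: torchtext's dataset API has changed across versions, so this follows the 0.12-style iterable datasets, and the helper names (`yield_tokens`, `encode`) are illustrative rather than the notebook's exact code.

```python
import torch
from torchtext.datasets import AG_NEWS
from torchtext.data.utils import get_tokenizer
from torchtext.vocab import build_vocab_from_iterator

tokenizer = get_tokenizer("basic_english")

def yield_tokens(data_iter):
    # AG_NEWS yields (label, text) pairs.
    for _, text in data_iter:
        yield tokenizer(text)

vocab = build_vocab_from_iterator(
    yield_tokens(AG_NEWS(split="train")),
    specials=["<unk>", "<pad>"],
)
vocab.set_default_index(vocab["<unk>"])

def encode(text, max_len=128):
    # Tokenize, map tokens to ids, then truncate/pad to a fixed length.
    ids = vocab(tokenizer(text))[:max_len]
    ids += [vocab["<pad>"]] * (max_len - len(ids))
    return torch.tensor(ids)
```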

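The training step follows the usual PyTorch pattern; here is a minimal sketch of one epoch, with hypothetical names (`train_one_epoch`, a loader yielding `(token_ids, labels)`) standing in for the notebook's own loop.

```python
import torch.nn as nn

def train_one_epoch(model, loader, optimizer, device):
    criterion = nn.CrossEntropyLoss()
    model.train()
    total_loss = 0.0
    for ids, labels in loader:  # loader is assumed to yield (token_ids, labels)
        ids, labels = ids.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(ids), labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    return total_loss / len(loader)  # mean batch loss for the epoch
```
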
Results

The notebook compares the Fastformer against the traditional Transformer model. Preliminary results suggest that Fastformer runs slightly faster, though epoch-to-epoch training times vary.
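
One simple way to compare the two models' speed (a sketch, not the notebook's benchmarking code) is to time the forward pass directly:

```python
import time
import torch

def time_forward(model, batch, n_iters=50):
    # Average wall-clock time of a forward pass, in milliseconds.
    model.eval()
    with torch.no_grad():
        model(batch)  # warm-up
        start = time.perf_counter()
        for _ in range(n_iters):
            model(batch)
    return (time.perf_counter() - start) / n_iters * 1e3
```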
