This project aims to implement the Fastformer architecture as proposed in the paper Fastformer: Additive Attention Can Be All You Need. Fastformer is designed as a more efficient alternative to the traditional Transformer: it replaces the quadratic pairwise self-attention with additive attention, whose cost grows linearly with sequence length.
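To make the mechanism concrete, the single-head module below is a minimal sketch of Fastformer-style additive attention: queries are pooled into one global query via learned scalar scores, mixed element-wise into the keys, pooled again into a global key, and mixed into the values. The class and parameter names (`AdditiveAttention`, `q_score`, `k_score`) are illustrative, not the notebook's; the actual implementation may add multiple heads and weight sharing.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdditiveAttention(nn.Module):
    """Minimal single-head sketch of Fastformer-style additive attention.

    The sequence is summarized with learned scalar scores instead of pairwise
    query-key dot products, so the cost is linear in sequence length.
    """

    def __init__(self, dim: int):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)
        self.q_score = nn.Linear(dim, 1)   # scalar attention score per query
        self.k_score = nn.Linear(dim, 1)   # scalar attention score per key
        self.out = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        q, k, v = self.to_q(x), self.to_k(x), self.to_v(x)

        # Pool queries into a single global query vector (linear in seq_len).
        q_weights = F.softmax(self.q_score(q), dim=1)        # (B, L, 1)
        global_q = (q_weights * q).sum(dim=1, keepdim=True)  # (B, 1, D)

        # Mix the global query into each key, then pool into a global key.
        p = k * global_q                                     # (B, L, D)
        k_weights = F.softmax(self.k_score(p), dim=1)
        global_k = (k_weights * p).sum(dim=1, keepdim=True)  # (B, 1, D)

        # Mix the global key into each value; residual connection to the queries.
        u = v * global_k                                     # (B, L, D)
        return self.out(u) + q
```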
- Data Preparation: Preprocessing and tokenization of the AG_NEWS dataset using TorchText (see the data-pipeline sketch after this list).
- Model Architecture: Implementation of the Fastformer and traditional Transformer models.
- Training: Training loop, including hyperparameter settings and optimization routines.
- Evaluation: Performance metrics and comparisons between the Fastformer and the traditional Transformer.
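As a reference for the data-preparation step, the snippet below sketches a typical AG_NEWS pipeline using TorchText's tokenizer and vocabulary utilities. The exact calls depend on the TorchText version used in the notebook, and the helper names (`yield_tokens`, `text_pipeline`, `label_pipeline`) are illustrative.

```python
import torch
from torchtext.datasets import AG_NEWS
from torchtext.data.utils import get_tokenizer
from torchtext.vocab import build_vocab_from_iterator

tokenizer = get_tokenizer("basic_english")

def yield_tokens(data_iter):
    # AG_NEWS yields (label, text) pairs; tokenize only the text.
    for _, text in data_iter:
        yield tokenizer(text)

# Build the vocabulary from the training split, reserving an <unk> token.
vocab = build_vocab_from_iterator(yield_tokens(AG_NEWS(split="train")),
                                  specials=["<unk>"])
vocab.set_default_index(vocab["<unk>"])

def text_pipeline(text: str) -> torch.Tensor:
    # Convert raw text into a tensor of token indices.
    return torch.tensor(vocab(tokenizer(text)), dtype=torch.long)

def label_pipeline(label: int) -> int:
    # AG_NEWS labels are 1..4; shift them to 0..3 for cross-entropy loss.
    return label - 1
```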
The notebook includes a comparison between the Fastformer and traditional Transformer models. Preliminary results suggest that Fastformer runs slightly faster, although epoch-to-epoch training times vary between runs.
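For context on how such per-epoch timings can be measured, the helper below is a rough sketch rather than the notebook's actual benchmarking code. It assumes a dataloader yielding padded `(texts, labels)` batches and two already-constructed models (hypothetical names `fastformer_model` and `transformer_model`).

```python
import time
import torch

def time_one_epoch(model, dataloader,
                   device="cuda" if torch.cuda.is_available() else "cpu"):
    # Rough wall-clock timing of a single training epoch (hypothetical helper).
    model = model.to(device).train()
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    start = time.perf_counter()
    for texts, labels in dataloader:
        texts, labels = texts.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(texts), labels)
        loss.backward()
        optimizer.step()
    if device == "cuda":
        torch.cuda.synchronize()  # wait for queued GPU work before stopping the clock
    return time.perf_counter() - start

# Example: compare per-epoch time of the two models on the same dataloader.
# fast_time = time_one_epoch(fastformer_model, train_loader)
# base_time = time_one_epoch(transformer_model, train_loader)
# print(f"Fastformer: {fast_time:.1f}s  Transformer: {base_time:.1f}s")
```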