The accompanying code for "Memory-efficient Transformers via Top-k Attention" (Ankit Gupta, Guy Dar, Shaya Goodman, David Ciprut, Jonathan Berant. SustaiNLP 2021).

ag1988/top_k_attention

Memory-efficient Transformers via Top-k Attention

This repository contains the accompanying code for the paper:

"Memory-efficient Transformers via Top-k Attention." Ankit Gupta, Guy Dar, Shaya Goodman, David Ciprut, Jonathan Berant. In SustaiNLP, 2021. [PDF]

Structure

The repository contains:

  • our implementation and benchmarks of top-k attention (in the nocache_attention directory)
  • UnifiedQA/T5 fine-tuning and inference with our top-k attention applied at the feed-forward layers (in the unifiedqa directory)
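
For orientation, here is a minimal sketch of the idea behind top-k attention. It assumes the usual reading: for each query, only the k largest query-key scores enter the softmax, and every other key gets zero weight. The function name `top_k_attention` and its list-based interface are hypothetical and chosen for illustration. This is not the code in nocache_attention, which may differ in batching, scaling, and memory handling.

```python
import heapq
import math


def top_k_attention(queries, keys, values, k):
    """Illustrative top-k attention over plain Python lists (hypothetical helper).

    queries: list of query vectors, each of length d
    keys:    list of key vectors, each of length d
    values:  list of value vectors, one per key
    k:       number of highest-scoring keys kept per query
    """
    k = min(k, len(keys))
    outputs = []
    for q in queries:
        # Dot-product score of this query against every key.
        scores = [sum(qi * ki for qi, ki in zip(q, key)) for key in keys]
        # Indices of the k largest scores; all other keys are ignored.
        top = heapq.nlargest(k, range(len(keys)), key=scores.__getitem__)
        # Numerically stable softmax restricted to the selected keys.
        max_score = max(scores[i] for i in top)
        weights = [math.exp(scores[i] - max_score) for i in top]
        total = sum(weights)
        # Weighted sum of the selected value vectors.
        out = [0.0] * len(values[0])
        for w, i in zip(weights, top):
            for j, v in enumerate(values[i]):
                out[j] += (w / total) * v
        outputs.append(out)
    return outputs


queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[1.0], [2.0], [3.0]]
print(top_k_attention(queries, keys, values, k=1))  # [[1.0]]
```

With k equal to the number of keys, the sketch reduces to ordinary softmax attention (without the 1/sqrt(d) scaling, which is omitted here for brevity).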

Coming Soon:

  • BERT QA model with top-k attention
  • T5 multi-head attention layers with top-k attention (the current code covers only the feed-forward layers)

Citation

@inproceedings{gupta2021memoryefficient,
  title = {Memory-efficient Transformers via Top-k Attention},
  author = {Ankit Gupta and Guy Dar and Shaya Goodman and David Ciprut and Jonathan Berant},
  booktitle = {Proceedings of SustaiNLP: Workshop on Simple and Efficient Natural Language Processing},
  year = {2021},
  publisher = {Association for Computational Linguistics}
}
