Personal re-implementations of machine learning papers. These re-implementations may use different hyper-parameters, datasets, or settings than the original papers.
Current re-implementations include:
Paper | Code | Blog |
---|---|---|
**Natural language processing** | | |
A Watermark for Large Language Models | Code | |
Attention Is All You Need | Code | Coming soon |
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | Code | Coming soon |
Language Modeling Is Compression | Code | |
Language Models are Few-Shot Learners | Code | Coming soon |
**Computer vision** | | |
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | Code | Blog |
Denoising Diffusion Probabilistic Models | Code | Blog |
Density estimation using Real NVP | Code | |
Idempotent Generative Network | Code | Blog |
ViR: Vision Retention Networks | Code | Blog |
**Reinforcement learning** | | |
Proximal Policy Optimization Algorithms | Code | Blog |
Playing Atari with Deep Reinforcement Learning | Code | Coming soon |
**Others** | | |
Everything is Connected: Graph Neural Networks | Code | |
Fast Feedforward Networks | Code | |
While this repo is a personal attempt to familiarize myself with these ideas down to the nitty-gritty details, contributions are welcome for re-implementations already in the repository. In particular, I am open to discussing doubts, questions, suggestions for improving the code, and spotted mistakes or bugs. If you would like to contribute, simply raise an issue before submitting a pull request.
The code is released under the MIT license.