Fully open reproduction of DeepSeek-R1

Python 19,623 1,665 Updated Feb 13, 2025

Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]

Python 533 29 Updated Dec 9, 2024

A replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data

Python 2,621 197 Updated Feb 7, 2025

Qwen2.5-Coder is the code version of Qwen2.5, the large language model series developed by Qwen team, Alibaba Cloud.

Python 4,418 357 Updated Jan 17, 2025

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Python 15 12 Updated Jan 31, 2025

100 days of building CUDA kernels!

Cuda 190 13 Updated Feb 13, 2025

This repository is a curated collection of resources, tutorials, and practical examples designed to guide you through the journey of mastering CUDA programming. Whether you're just starting or look…

226 19 Updated Feb 12, 2025

Problem statements on System Design and Software Architecture as part of Arpit's System Design Masterclass

Python 2,258 489 Updated Dec 8, 2023

🔥Highlighting the top ML papers every week.

10,821 654 Updated Feb 10, 2025

Explanations of key concepts in ML

7,453 589 Updated Feb 13, 2025

Distribute and run LLMs with a single file.

C++ 21,687 1,135 Updated Jan 30, 2025

A small self-contained alternative to readline and libedit

C 3,874 676 Updated Aug 12, 2024

Create beautiful terminal-based code tutorials with syntax highlighting and interactive navigation.

Python 478 8 Updated Feb 12, 2025

Unofficial implementation of Titans, SOTA memory for transformers, in PyTorch

Python 1,047 84 Updated Feb 12, 2025

Official repository for our work on micro-budget training of large-scale diffusion models.

Python 1,231 48 Updated Jan 12, 2025

A reinforcement learning agent that learns how to solve maze missions in Minecraft.

Python 242 8 Updated Aug 3, 2023

Minimalistic 4D-parallelism distributed training framework for educational purposes

Python 715 54 Updated Feb 11, 2025
Jupyter Notebook 416 27 Updated Oct 18, 2024

What would you do with 1000 H100s...

Jupyter Notebook 995 61 Updated Jan 10, 2024

The repository will contain a list of projects which we will work on while reading the books of Natural Language Processing & Transformers.

Jupyter Notebook 71 19 Updated Nov 12, 2023

LLM inference in C/C++

C++ 74,167 10,701 Updated Feb 13, 2025

Minimalistic large language model 3D-parallelism training

Python 1,445 147 Updated Feb 12, 2025

All the resources you need to get to Senior Engineer and beyond

14,418 1,301 Updated Dec 31, 2024

Building GPT ...

Jupyter Notebook 17 1 Updated Dec 1, 2024

My Digital Palace - A Personal Journal for Reflection - A place to store all my thoughts

Python 48 11 Updated Feb 13, 2025

LLM101n: Let's build a Storyteller

31,674 1,722 Updated Aug 1, 2024

DevOps Roadmap for 2025, with learning resources

13,588 2,210 Updated Feb 12, 2025

Modeling, training, eval, and inference code for OLMo

Python 5,169 539 Updated Feb 14, 2025