
Tutorials on Tinygrad

View on GitHub | View on Website

A series of tutorials and study notes that help you understand the internals of tinygrad and equip you to start contributing to it. The quickstart guide and abstraction guide are great resources, but they may not be beginner friendly; the following might be more digestible.

Fundamentals (best read in order):

  1. High level overview of the abstractions

  2. How kernel fusion works - ScheduleItem

  3. The intermediate representation (IR)

  4. How GPU code is generated - backends and runtime

Miscellaneous topics:

  1. Shapetracker allows for zero cost movement ops

  2. How dimension merging works

  3. How to profile a run and tune the performance

  4. How JIT and cache enable faster compilation - TinyJit

  5. Command queue is a better way to handle job dispatch

  6. Multi GPU training

  7. Adding custom accelerator

  8. Code generator and details on common UOPS

  9. Tensor core support part 1

  10. Loop unrolling (upcast) and the underlying Symbolic library

  11. Interpreting colors and numbers in kernel names

Bounty explanations:

  1. Symbolic mean

What is tinygrad?

Tinygrad is a deep learning framework, akin to PyTorch, XLA, and ArrayFire, but it sets itself apart by being more user-friendly, faster, and less presumptive about the specifics of your hardware.

Like PyTorch, tinygrad offers a friendly frontend, but it speeds up model training and inference through lazy evaluation on the GPU: your model is compiled into highly optimized GPU code that can scale across multiple devices, saving both time and money.
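To make the laziness concrete, here is a minimal sketch (assuming a recent tinygrad release, where Tensor is importable from the top-level package): operations only build a computation graph, and nothing executes on the device until the result is actually requested.

```python
from tinygrad import Tensor

a = Tensor.rand(4, 4)    # no device work happens yet
b = Tensor.rand(4, 4)
c = (a @ b).relu()       # still lazy: only a computation graph is built

# The graph is compiled into fused GPU kernels and executed only now,
# when the concrete values are requested:
print(c.numpy())
```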

It also decouples the machine learning software from the computing hardware. Many ML frameworks are designed primarily for CUDA, implying an expectation of execution on Nvidia GPUs; that assumption can make it hard to move to alternative hardware later. With numerous GPU manufacturers advancing rapidly and offering comparable computing power at lower cost, keeping your software stack hardware-agnostic is an essential strategy for future-proofing.

This is where tinygrad truly shines. Our approach is to compile machine learning models into a highly optimized Intermediate Representation (IR), which we then translate directly into GPU-specific instructions. The goal is to drill down to the lowest possible level of instruction: PTX for Nvidia, KFD for AMD, and Metal for Apple devices. Targeting the foundational layers of the stack not only improves compatibility across hardware platforms but also unlocks significant performance gains, and it leads to better system stability and less ongoing maintenance.
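As a quick way to observe this pipeline yourself, the sketch below (import paths and the DEBUG environment variable are assumptions based on recent tinygrad versions) prints the default backend and forces a computation through scheduling and code generation; running it with a high debug level, e.g. `DEBUG=4 python script.py`, makes tinygrad print the kernels it generates for your device.

```python
from tinygrad import Tensor, Device

print(Device.DEFAULT)   # e.g. METAL, CUDA, or GPU, depending on your machine

out = (Tensor.rand(8, 8) + 1).sum()
out.realize()           # triggers scheduling, codegen, and execution
```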
