Documentation
Matthew Nicely edited this page Dec 11, 2022 · 9 revisions
CUTLASS is described in the following documents and the accompanying Doxygen documentation.
- Quick Start Guide - build and run CUTLASS
- Functionality - summarizes functionality available in CUTLASS
- Efficient GEMM in CUDA - describes how GEMM kernels may be implemented efficiently in CUDA
- GEMM API - describes the CUTLASS GEMM model and C++ template concepts
- Implicit GEMM Convolution - describes 2-D and 3-D convolution in CUTLASS
- Code Organization - describes the organization and contents of the CUTLASS project
- Terminology - describes terms used in the code
- Programming Guidelines - guidelines for writing efficient modern CUDA C++
- Fundamental types - describes basic C++ classes used in CUTLASS to represent numeric quantities and arrays
- Layouts - describes layouts of matrices and tensors in memory
- Tile Iterators - describes C++ concepts for iterating over tiles of matrices in memory
- CUTLASS Profiler - command-line-driven profiling application
- CUTLASS Utilities - additional templates used to facilitate rapid development