Dissecting Tensor Cores via Microbenchmarks: Latency, Throughput and Numeric Behaviors

Code repository for the paper "Dissecting Tensor Cores via Microbenchmarks: Latency, Throughput and Numeric Behaviors".

TODO: Provide better instructions and explanations

Setup

Add the CUDA path to your environment, e.g.:

export CUDA_PATH=/usr/local/cuda-11.0
export PATH=$CUDA_PATH/bin:$PATH
export CUDACXX=$CUDA_PATH/bin/nvcc
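As a quick sanity check on the exports above, the sketch below confirms that CUDA_PATH points at a directory containing an executable nvcc. The function name check_cuda_env is a hypothetical helper, not part of this repository.

```shell
# Hypothetical helper (not part of this repo): verify that CUDA_PATH
# points at a toolkit directory containing an executable nvcc.
check_cuda_env() {
    if [ -z "${CUDA_PATH:-}" ]; then
        echo "CUDA_PATH is not set" >&2
        return 1
    fi
    if [ ! -x "$CUDA_PATH/bin/nvcc" ]; then
        echo "nvcc not found or not executable at $CUDA_PATH/bin/nvcc" >&2
        return 1
    fi
    echo "Using nvcc at $CUDA_PATH/bin/nvcc"
}
```

Calling check_cuda_env after the exports prints the nvcc location on success and returns a non-zero status otherwise.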

Configure the compiler target architecture (SM)

export TargetSM=80  # for A100

export TargetSM=70  # for V100

export TargetSM=75  # for Turing
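Only one of the TargetSM values above should be exported for a given run. A minimal sketch of selecting it by GPU name follows; target_sm_for is a hypothetical helper that covers only the three targets listed in this README.

```shell
# Hypothetical helper (not part of this repo): map a GPU name to the
# TargetSM value listed in this README. Other GPUs are not covered.
target_sm_for() {
    case "$1" in
        A100)   echo 80 ;;
        V100)   echo 70 ;;
        Turing) echo 75 ;;
        *)      echo "unknown GPU: $1" >&2; return 1 ;;
    esac
}
```

For example, `TargetSM=$(target_sm_for A100) && export TargetSM` sets the A100 target and leaves TargetSM unexported if the name is not recognised.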

Run script

cd microbench
sh run_all.sh

The script is expected to produce xxx-ILPx.log files.

Note: static_assert error messages will appear while the scripts run, because some of the code contains static_assert() checks that fail for larger ILP values. These error messages can be ignored.
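Because the static_assert messages make the console output noisy, it can help to confirm afterwards that log files were actually written. The sketch below lists files matching the xxx-ILPx.log pattern; list_ilp_logs is a hypothetical helper, and the directory the logs land in is an assumption, so pass the directory where you ran the script.

```shell
# Hypothetical helper (not part of this repo): list files matching
# the *-ILP*.log pattern in a directory (default: current directory).
list_ilp_logs() {
    dir="${1:-.}"
    for f in "$dir"/*-ILP*.log; do
        # Skip the unexpanded pattern when nothing matches.
        if [ -e "$f" ]; then
            echo "$f"
        fi
    done
    return 0
}
```

Running `list_ilp_logs .` from the directory where run_all.sh was executed prints one log path per line, or nothing if no logs were produced.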

References

Some code is borrowed from Accel-Sim.

Citation

@ARTICLE{9931992,
  author={Sun, Wei and Li, Ang and Geng, Tong and Stuijk, Sander and Corporaal, Henk},
  journal={IEEE Transactions on Parallel and Distributed Systems}, 
  title={Dissecting Tensor Cores via Microbenchmarks: Latency, Throughput and Numeric Behaviors}, 
  year={2023},
  volume={34},
  number={1},
  pages={246-261},
  doi={10.1109/TPDS.2022.3217824}}
