To build the Docker container, follow these steps:
- Open a terminal and navigate to the Nsight_Compute_Tutorial/docker directory.
- Run the following command to build the container without caching:
docker build -t cuda_nsight:v0.1 --no-cache --rm --file Dockerfile.nsight .
To run the Docker container, execute the following command:
cd Nsight_Compute_Tutorial
docker run -it --rm --gpus all --cap-add=SYS_ADMIN --volume="$PWD:/workspace" cuda_nsight:v0.1
Make sure to replace cuda_nsight:v0.1 with the appropriate image name and version.
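To avoid editing the image name in two places, the build and run commands above can be wrapped in a small script. This is only a sketch: the IMAGE and DRY_RUN variables and the run, build_image, and start_container helpers are names made up for illustration, and by default the script only prints the commands instead of executing them.

```shell
IMAGE="cuda_nsight:v0.1"     # assumption: replace with your own name:tag
DRY_RUN="${DRY_RUN:-1}"      # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Call from Nsight_Compute_Tutorial/docker, where Dockerfile.nsight lives.
build_image() {
  run docker build -t "$IMAGE" --no-cache --rm --file Dockerfile.nsight .
}

# Call from Nsight_Compute_Tutorial so that $PWD is mounted at /workspace.
start_container() {
  run docker run -it --rm --gpus all --cap-add=SYS_ADMIN --volume="$PWD:/workspace" "$IMAGE"
}

build_image
start_container
```

Setting DRY_RUN=0 before running the script executes the commands for real; each helper still has to be called from the directory named in its comment.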
Basic commands:
- Simple analysis:
nvprof ./your_cuda_application
- Export to a .nvprof file for further analysis with nvprof or nvvp:
nvprof --export-profile my_profile.nvprof ./your_cuda_application
- Export to a .nvvp file for visual analysis in NVIDIA Visual Profiler (nvvp):
nvprof --export-profile my_timeline_report.nvvp ./your_cuda_application
Events and metrics:
- All events:
nvprof --events all ./your_cuda_application
- Specific events:
nvprof --events event1,event2 ./your_cuda_application
- Specific metrics, exported to a .nvvp file:
nvprof --metrics metric1,metric2 --export-profile my_timeline_report.nvvp ./your_cuda_application
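Before relying on the nvprof commands above, it helps to check the GPU generation: as far as I know, nvprof and Visual Profiler do not support devices with compute capability 8.0 and higher, where Nsight Systems (nsys) and Nsight Compute (ncu) are the replacements. The sketch below makes that choice from a compute capability string. The pick_profiler helper is a made-up name, and the nvidia-smi compute_cap query field is assumed to be available, which requires a reasonably recent driver.

```shell
# pick_profiler CC: print the suggested tool family for a compute capability such as "7.5".
pick_profiler() {
  major="${1%%.*}"                 # part before the first dot, e.g. "7"
  case "$major" in
    ''|*[!0-9]*) echo "unknown"; return 1 ;;   # empty or not a number: give up
  esac
  if [ "$major" -ge 8 ]; then
    echo "nsys/ncu"                # nvprof and nvvp are not expected to work here
  else
    echo "nvprof"
  fi
}

# Ask nvidia-smi for the first GPU's compute capability, if the tool is present.
# Assumption: the installed driver supports the compute_cap query field.
if command -v nvidia-smi >/dev/null 2>&1; then
  cc="$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader 2>/dev/null | head -n 1)" || cc=""
  echo "Compute capability ${cc:-?}: suggested profiler $(pick_profiler "$cc")"
fi
```

The helper can also be called directly, for example pick_profiler 7.5 or pick_profiler 8.6, when the capability is already known from the GPU's data sheet.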
Nsight Systems user guide: https://docs.nvidia.com/nsight-systems/UserGuide/index.html
- Simple analysis:
nsys profile ./your_cuda_application
- Simple analysis with summary statistics printed after the run:
nsys profile --stats=true ./your_cuda_application
- Save the report as my_profile:
nsys profile --stats=true -o my_profile ./your_cuda_application
- Trace CUDA only:
nsys profile --stats=true --trace=cuda -o my_profile ./your_cuda_application
- Trace OS runtime, CUDA, and NVTX, and show the application's output:
nsys profile --stats=true --trace=osrt,cuda,nvtx --show-output=true --output=my_profile ./your_cuda_application
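A saved report can be summarized again later with nsys stats, without re-running the application. The report extension depends on the Nsight Systems version (.nsys-rep in newer releases, .qdrep in older ones), so the sketch below looks for either. The report_file helper is a made-up name, and my_profile is the report name used in the commands above.

```shell
# report_file BASE: print the first existing report for BASE, trying both known extensions.
report_file() {
  base="$1"
  for ext in nsys-rep qdrep; do
    if [ -f "$base.$ext" ]; then
      echo "$base.$ext"
      return 0
    fi
  done
  return 1                          # no report with either extension
}

# Summarize the report only when both nsys and the report are available.
if command -v nsys >/dev/null 2>&1 && rep="$(report_file my_profile)"; then
  nsys stats "$rep"
else
  echo "nsys or a my_profile report was not found; nothing to summarize"
fi
```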
Nsight Compute documentation: https://docs.nvidia.com/nsight-compute/NsightCompute/index.html#
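- Save the report as my_profile:
ncu --export=my_profile ./your_cuda_application
- Profile all processes with the full section set and write the report to /workspace/report/sgemm, overwriting any existing file:
ncu --target-processes=all --export=/workspace/report/sgemm --set=full --force-overwrite ./sgemm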
- Show the profiling result in the terminal:
ncu --import my_profile.ncu-rep --page=details
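Profiling with --set=full replays every kernel many times and can be slow. One way to shorten a run is to restrict ncu to a single kernel, a limited number of launches, and one section. The sketch below only prints such a command so it can be reviewed first. The kernel name, the report path, and the ncu_focused_cmd helper are assumptions made up for illustration; the real kernel name inside ./sgemm may differ.

```shell
KERNEL="sgemm"                              # assumption: kernel name inside ./sgemm
LAUNCHES=1                                  # number of matching launches to profile
REPORT="/workspace/report/sgemm_focused"    # hypothetical report path

# Print an ncu command limited to one kernel, a few launches, and one section.
ncu_focused_cmd() {
  echo "ncu --kernel-name $KERNEL --launch-count $LAUNCHES --section SpeedOfLight --export=$REPORT --force-overwrite ./sgemm"
}

ncu_focused_cmd
```

Copy the printed line into the container's terminal once the kernel name has been adjusted.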
- Solution: add --cap-add=SYS_ADMIN to the docker run command.
- Download and install the CUDA Toolkit version that matches the one on the device: https://developer.nvidia.com/cuda-11-8-0-download-archive
- Download JRE 1.8: https://www.java.com/zh-TW/download/
- Search for Visual Profiler on the host and launch it.
- Download and install the CUDA Toolkit version that matches the one on the device: https://developer.nvidia.com/cuda-11-8-0-download-archive (Nsight Systems is included).
- Download JRE 1.8: https://www.java.com/zh-TW/download/
- Download the latest Nsight Compute: https://developer.nvidia.com/nsight-compute
- On the device:
- Install Nsight Systems: https://docs.nvidia.com/nsight-systems/InstallationGuide/index.html
- Install Nsight Compute: https://developer.nvidia.com/blog/using-nsight-compute-in-containers/
CUDA Profiler
CUDA Programming Guide
Learn-CUDA-Programming
CUDA Parallel Reduction