Merge branch 'master' of github.com:RRZE-HPC/likwid
Showing 56 changed files with 15,008 additions and 6,136 deletions.
@@ -0,0 +1,28 @@
## Build & Install

```bash
export ROCM_HOME=/opt/rocm
make
make install
```

## Test

Build

```bash
cd test
# make clean
make test-topology-gpu-rocm
make test-rocmon-triad
make test-rocmon-triad-marker
```

Run

```bash
export LD_LIBRARY_PATH=/home/users/kraljic/likwid-rocmon/install/lib:/opt/rocm/hip/lib:/opt/rocm/hsa/lib:/opt/rocm/rocprofiler/lib:$LD_LIBRARY_PATH
export ROCP_METRICS=/opt/rocm/rocprofiler/lib/metrics.xml # for rocmon test
export HSA_TOOLS_LIB=librocprofiler64.so.1 # allows rocmon to intercept hsa commands
./gpu-test-topology-gpu-rocm
```
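Before launching the test binary, it can help to confirm that the variables exported in the Run step are actually set. The following is a minimal sketch and not part of LIKWID: the helper name `missing_rocmon_env` is hypothetical, and the list of variables is taken only from the Run step above.

```python
import os

# Variables exported in the Run step above.
REQUIRED_VARS = ("LD_LIBRARY_PATH", "ROCP_METRICS", "HSA_TOOLS_LIB")


def missing_rocmon_env(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_rocmon_env()` with no argument inspects the current process environment; an empty list means all three variables are present. The check says nothing about whether the paths they contain exist.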
@@ -0,0 +1,13 @@
SHORT GDS Instructions

EVENTSET
ROCM0 ROCP_SQ_INSTS_GDS
ROCM1 ROCP_SQ_WAVES

METRICS
GPU GDS rw insts per work-item ROCM0/ROCM1

LONG
--
The average number of GDS read or GDS write instructions executed
per work item (affected by flow control).
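A METRICS formula such as `ROCM0/ROCM1` binds the counter names from the EVENTSET section to the measured values. The sketch below illustrates that substitution in Python. It is not how LIKWID evaluates formulas internally; the function name `evaluate_metric` and the counter readings are hypothetical.

```python
def evaluate_metric(formula, counters):
    """Evaluate a metric formula with counter names bound to values."""
    # Only the counter values and max() are visible to the expression.
    allowed = {"max": max}
    allowed.update(counters)
    return eval(formula, {"__builtins__": {}}, allowed)


# Hypothetical counter readings, chosen only for illustration.
readings = {"ROCM0": 512.0, "ROCM1": 128.0}
print(evaluate_metric("ROCM0/ROCM1", readings))  # prints 4.0
```

`eval` keeps the sketch short, but it is only appropriate for trusted formula strings such as those shipped in group files.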
@@ -0,0 +1,16 @@
SHORT Memory utilization

EVENTSET
ROCM0 ROCP_TA_TA_BUSY
ROCM1 ROCP_GRBM_GUI_ACTIVE
ROCM2 ROCP_SE_NUM

METRICS
GPU memory utilization 100*max(ROCM0,16)/ROCM1/ROCM2

LONG
--
The percentage of GPUTime the memory unit is active. The result includes
the stall time (MemUnitStalled). This is measured with all extra fetches
and writes and any cache or memory effects taken into account.
Value range: 0% to 100% (fetch-bound).
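As a worked example of the formula `100*max(ROCM0,16)/ROCM1/ROCM2`, the sketch below reads `max` as an ordinary two-argument maximum. That reading is an assumption: the group file does not say how `max(ROCM0,16)` is interpreted. The function name, the zero-division guard, and the sample values are also illustrative.

```python
def memory_utilization(ta_busy, gui_active, se_num):
    """Literal reading of 100*max(ROCM0,16)/ROCM1/ROCM2."""
    if gui_active == 0 or se_num == 0:
        # Assumption: report 0 when nothing was measured instead of dividing by zero.
        return 0.0
    return 100.0 * max(ta_busy, 16) / gui_active / se_num


# Hypothetical readings: TA busy cycles, GPU active cycles, shader engines.
print(memory_utilization(2000, 1000, 4))  # prints 50.0
```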
@@ -0,0 +1,18 @@
SHORT PCI Transfers

EVENTSET
ROCM0 RSMI_PCI_THROUGHPUT_SENT
ROCM1 RSMI_PCI_THROUGHPUT_RECEIVED


METRICS
Runtime time
PCI sent ROCM0
PCI received ROCM1
PCI send bandwidth 1E-6*ROCM0/time
PCI recv bandwidth 1E-6*ROCM1/time

LONG
--
Currently not usable: each RSMI_PCI_THROUGHPUT_* event takes one second
per call, so reading both of them takes two seconds.
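The two bandwidth metrics share the form `1E-6*ROCMx/time`. A minimal sketch of that calculation follows. The function name is hypothetical, and the unit is an assumption: if the raw counters are byte counts and `time` is in seconds, the result is in MB/s, but the group file does not state the units.

```python
def pci_bandwidth(raw_count, runtime_s):
    """Apply 1E-6*ROCMx/time from the METRICS section."""
    if runtime_s <= 0:
        raise ValueError("runtime must be positive")
    return 1e-6 * raw_count / runtime_s


# Hypothetical reading: 2.5e9 units transferred during a 2 second run.
print(pci_bandwidth(2_500_000_000, 2.0))
```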
@@ -0,0 +1,17 @@
SHORT Power, temperature and voltage

EVENTSET
ROCM0 RSMI_POWER_AVE[0]
ROCM1 RSMI_TEMP_EDGE
ROCM2 RSMI_VOLT_VDDGFX


METRICS
Power average 1E-6*ROCM0
Edge temperature 1E-3*ROCM1
Voltage 1E-3*ROCM2

LONG
--
Gets the current average power consumption in watts, the
temperature in degrees Celsius and the voltage in volts.
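The scale factors in the METRICS section, together with the units named in the LONG text, imply that the raw readings arrive in microwatts, millidegrees Celsius and millivolts. That inference is an assumption, not something the group file states. The sketch below applies the three factors; the function name and sample readings are hypothetical.

```python
def convert_power_group(power_raw, temp_raw, volt_raw):
    """Apply the scale factors from the METRICS section."""
    return {
        "Power average": 1e-6 * power_raw,    # assumed microwatts -> watts
        "Edge temperature": 1e-3 * temp_raw,  # assumed millidegrees -> degrees Celsius
        "Voltage": 1e-3 * volt_raw,           # assumed millivolts -> volts
    }


# Hypothetical raw readings, chosen only for illustration.
for name, value in convert_power_group(150_000_000, 45_000, 900).items():
    print(f"{name}: {value:.2f}")
```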
@@ -0,0 +1,13 @@
SHORT SALU Instructions

EVENTSET
ROCM0 ROCP_SQ_INSTS_SALU
ROCM1 ROCP_SQ_WAVES

METRICS
GPU SALU insts per work-item ROCM0/ROCM1

LONG
--
The average number of scalar ALU instructions executed per work-item
(affected by flow control).
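All group files in this commit share the same layout: a SHORT line, then EVENTSET, METRICS and LONG sections. The sketch below splits such a file into those parts. It is an illustration, not LIKWID's own parser; the function name `parse_group` is hypothetical, and it assumes each formula is the last whitespace-separated token on its METRICS line.

```python
def parse_group(text):
    """Split a group file into SHORT, EVENTSET, METRICS and LONG parts."""
    group = {"short": "", "eventset": {}, "metrics": {}, "long": []}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if section is None and line.startswith("SHORT "):
            group["short"] = line[len("SHORT "):].strip()
        elif line in ("EVENTSET", "METRICS", "LONG"):
            section = line
        elif section == "EVENTSET":
            counter, event = line.split(None, 1)
            group["eventset"][counter] = event
        elif section == "METRICS":
            # Assumption: the formula contains no spaces.
            name, formula = line.rsplit(None, 1)
            group["metrics"][name] = formula
        elif section == "LONG" and line != "--":
            group["long"].append(line)
    return group


SALU_GROUP = """SHORT SALU Instructions

EVENTSET
ROCM0 ROCP_SQ_INSTS_SALU
ROCM1 ROCP_SQ_WAVES

METRICS
GPU SALU insts per work-item ROCM0/ROCM1

LONG
--
The average number of scalar ALU instructions executed per work-item
(affected by flow control).
"""

print(parse_group(SALU_GROUP)["metrics"])
```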
@@ -0,0 +1,13 @@
SHORT SFetch Instructions

EVENTSET
ROCM0 ROCP_SQ_INSTS_SMEM
ROCM1 ROCP_SQ_WAVES

METRICS
GPU SFETCH insts per work-item ROCM0/ROCM1

LONG
--
The average number of scalar fetch instructions from the video memory
executed per work-item (affected by flow control).
@@ -0,0 +1,17 @@
SHORT ALU stalled by LDS

EVENTSET
ROCM0 ROCP_SQ_WAIT_INST_LDS
ROCM1 ROCP_SQ_WAVES
ROCM2 ROCP_GRBM_GUI_ACTIVE

METRICS
GPU ALU stalled 100*ROCM0*4/ROCM1/ROCM2

LONG
--
The percentage of GPUTime ALU units are stalled by the LDS input queue
being full or the output queue being not ready. If there are LDS bank
conflicts, reduce them. Otherwise, try reducing the number of LDS
accesses if possible.
Value range: 0% (optimal) to 100% (bad).
@@ -0,0 +1,16 @@
SHORT GPU utilization

EVENTSET
ROCM0 ROCP_GRBM_COUNT
ROCM1 ROCP_GRBM_GUI_ACTIVE


METRICS
GPU utilization 100*ROCM1/ROCM0


LONG
--
This group reproduces the 'GPUBusy' metric provided by RocProfiler.
The GPUBusy metric can also be selected directly; the calculation is then
done internally, which stays correct if the metric formula changes.