add interleave documentation #2105

Merged
merged 1 commit into from
Dec 2, 2018
Binary file added doc/img/interleave.png
54 changes: 42 additions & 12 deletions doc/tuning.md
@@ -10,6 +10,7 @@
* [Choose `intensity` and `worksize`](#choose-intensity-and-worksize)
* [Add more GPUs](#add-more-gpus)
* [Two Threads per GPU](two-threads-per-gpu)
  * [Interleave Tuning](#interleave-tuning)
* [disable comp_mode](#disable-comp_mode)
* [change the scratchpad memory pattern](change-the-scratchpad-memory-pattern)
* [Increase Memory Pool](#increase-memory-pool)
@@ -83,13 +84,13 @@ If you are unsure of either GPU or platform index value, you can use `clinfo` to
```
"gpu_threads_conf" :
[
{
"index" : 0, "intensity" : 1000, "worksize" : 8, "affine_to_cpu" : false,
"strided_index" : true, "mem_chunk" : 2, "unroll" : 8, "comp_mode" : true
{ "index" : 0, "intensity" : 1000, "worksize" : 8, "affine_to_cpu" : false,
"strided_index" : true, "mem_chunk" : 2, "unroll" : 8, "comp_mode" : true,
"interleave" : 40
},
{
"index" : 1, "intensity" : 1000, "worksize" : 8, "affine_to_cpu" : false,
"strided_index" : true, "mem_chunk" : 2, "unroll" : 8, "comp_mode" : true
{ "index" : 1, "intensity" : 1000, "worksize" : 8, "affine_to_cpu" : false,
"strided_index" : true, "mem_chunk" : 2, "unroll" : 8, "comp_mode" : true,
"interleave" : 40
},
],

@@ -107,19 +108,48 @@ Therefore adjust your intensity by hand.
```
"gpu_threads_conf" :
[
{
"index" : 0, "intensity" : 768, "worksize" : 8, "affine_to_cpu" : false,
"strided_index" : true, "mem_chunk" : 2, "unroll" : 8, "comp_mode" : true
{ "index" : 0, "intensity" : 1000, "worksize" : 8, "affine_to_cpu" : false,
"strided_index" : true, "mem_chunk" : 2, "unroll" : 8, "comp_mode" : true,
"interleave" : 40
},
{
"index" : 0, "intensity" : 768, "worksize" : 8, "affine_to_cpu" : false,
"strided_index" : true, "mem_chunk" : 2, "unroll" : 8, "comp_mode" : true
{ "index" : 0, "intensity" : 1000, "worksize" : 8, "affine_to_cpu" : false,
"strided_index" : true, "mem_chunk" : 2, "unroll" : 8, "comp_mode" : true,
"interleave" : 40
},
],

"platform_index" : 0,
```

### Interleave Tuning

Interleave controls when a worker thread starts calculating a batch of hashes
when two worker threads share one GPU.
This option has no effect if only one worker thread is used per GPU.

![Interleave](img/interleave.png)

Interleave defines how long a thread must wait before starting its next hash calculation, relative to the start of the most recently started worker thread.
Choosing an interleave value larger than 50% makes no sense because then the GPU will not be utilized well enough.
In most cases the default of 40 is a good value, but on some systems, e.g. the Linux ROCm 1.9.1 driver with RX5XX, you need to adjust the value.
If you get many interleave messages in a row (for over 1 minute), you should adjust the value.

```
OpenCL Interleave 0|1: 642/2400.50 ms - 30.1
OpenCL Interleave 0|0: 355/2265.05 ms - 30.2
OpenCL Interleave 0|1: 221/2215.65 ms - 30.2
```

Description of the message format:
```
<gpu id>|<thread id on the gpu>: <last delay>/<average calculation per hash bunch> ms - <interleave value>

```
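
To watch these messages over time it can help to split each line into its fields. The following is a minimal sketch in Python, assuming the log lines look exactly like the three examples above; the names `InterleaveMessage` and `parse_interleave_line` are hypothetical helpers for illustration and are not part of the miner.

```python
import re
from typing import NamedTuple, Optional


class InterleaveMessage(NamedTuple):
    """One parsed interleave message (field names are illustrative)."""
    gpu_id: int
    thread_id: int
    last_delay_ms: float
    avg_runtime_ms: float
    interleave: float


# Assumed layout, taken from the example lines in this section:
# OpenCL Interleave <gpu id>|<thread id>: <last delay>/<average> ms - <interleave value>
_PATTERN = re.compile(
    r"OpenCL Interleave (\d+)\|(\d+): "
    r"(\d+(?:\.\d+)?)/(\d+(?:\.\d+)?) ms - (\d+(?:\.\d+)?)"
)


def parse_interleave_line(line: str) -> Optional[InterleaveMessage]:
    """Return the parsed fields, or None if the line is not an interleave message."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    gpu, thread, delay, avg, value = match.groups()
    return InterleaveMessage(int(gpu), int(thread), float(delay), float(avg), float(value))
```

Feeding the miner output through `parse_interleave_line` line by line yields the `last_delay_ms` series per GPU thread; lines that are not interleave messages return `None` and can be skipped.
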
`last delay` should slowly go to 0.
If it goes down and then jumps to a very large value multiple times within a minute, you should reduce the intensity by 5.
The `intensity value` will automatically go up and down within a range of ±5% to adjust for kernel run-time fluctuations.
If `last delay` goes down to 10 ms and the messages stop, then reappear from time to time with delays of up to 15 ms, you already have a good value.
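
The rules of thumb above can be written down as a small check. This is only a sketch: the function `classify_delays` is hypothetical, it expects the `last delay` values (in ms) seen for one GPU thread within roughly one minute, and the `jump_factor` used to decide what counts as a "very large" jump is an assumption rather than a value taken from the miner. Only the 15 ms bound comes from the text.

```python
from typing import Sequence


def classify_delays(delays_ms: Sequence[float],
                    settled_ms: float = 15.0,
                    jump_factor: float = 2.0) -> str:
    """Classify roughly one minute of `last delay` values for one GPU thread.

    settled_ms  : delays at or below this count as settled (15 ms, from the text).
    jump_factor : assumed threshold; a delay more than this many times the
                  previous one (and above settled_ms) counts as a jump.
    """
    if not delays_ms:
        return "no messages"
    # Count how often the delay jumps sharply upward from one message to the next.
    jumps = sum(
        1
        for prev, cur in zip(delays_ms, delays_ms[1:])
        if cur > settled_ms and cur > prev * jump_factor
    )
    if jumps >= 2:  # "multiple times within a minute"
        return "reduce intensity"
    if all(d <= settled_ms for d in delays_ms):
        return "good"
    return "keep watching"
```

A steadily falling series such as the three example messages is reported as "keep watching", while a series that repeatedly drops and then spikes is reported as "reduce intensity".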

### disable comp_mode

`comp_mode` means compatibility mode. Disabling it removes some checks in the compute kernel that ensure the miner can be used on a wide range of AMD/OpenCL GPU devices.