
Onednn backend #1558

Merged — 32 commits merged into LeelaChessZero:master on Jun 16, 2021

Conversation


@borg323 borg323 commented May 11, 2021

This is the long-awaited onednn backend (formerly dnnl, formerly mkl-dnn). To build, pass `-Donednn=true -Ddnnl_dir=path/to/dnnl/library` to meson. It works with both cpu and (intel) gpu. For gpu you will need to build the dnnl library yourself (or ask me for a dll).
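A full build invocation might look like the following. This is a sketch, assuming a standard meson/ninja setup as used elsewhere in lc0; `path/to/dnnl/library` is a placeholder for wherever your dnnl install lives:

```shell
# Sketch of building lc0 with the onednn backend enabled.
# The dnnl_dir value is a placeholder, not a real path.
meson setup builddir --buildtype=release -Donednn=true -Ddnnl_dir=path/to/dnnl/library
ninja -C builddir
```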

For best performance, use just one search thread when running on a cpu, as a second search thread interferes with the onednn computing threads (onednn on a cpu uses all available cores by default).

There are several backend options; the most important are:

| option | values | default | comment |
| --- | --- | --- | --- |
| gpu | empty or integer | empty | Select the gpu to use; empty for cpu. |
| winograd | empty, true or false | empty | Set to true to use Winograd 3x3 convolution on cpu, false for direct convolution, or empty to let the library choose. Currently dnnl after v2.0.0 may get this wrong, and Winograd is not supported on all processors, so it may exit with a not very informative error. |
| fp16 | true or false | true on gpu, false on cpu | Use fp16 (or bf16 if your cpu supports it) for computation. |
| batch | integer | 64 for fp16, 32 for fp32 | The minimum batch size to use, as the library prepares kernels on startup. Set to 0 for dynamic kernel recompilation on every batch (not recommended). |
| steps | integer | 2 | Prepare kernels for this many multiples of the batch size. |
| threads | empty or integer | empty | Number of cpu threads to use; empty to let the library decide. |
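Putting a few of these together, cpu and gpu runs might look like this. This is a sketch: it assumes lc0 accepts comma-separated key=value pairs in `--backend-opts` (as with other backends), and the option values shown are illustrative, not recommendations:

```shell
# Illustrative invocations only; tune option values for your hardware.
# CPU: single search thread (as advised above), direct convolution, 4 onednn threads.
./lc0 benchmark -w 703810.pb.gz --threads=1 --backend=onednn --backend-opts=winograd=false,threads=4
# Intel GPU 0 with fp16 and a larger minimum batch size.
./lc0 benchmark -w 703810.pb.gz --backend=onednn --backend-opts=gpu=0,fp16=true,batch=128
```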


borg323 commented May 11, 2021

This is dnnl 1.8.0 compiled with both gpu and cpu support: dnnl.dll.zip

Example command line for benchmark:

```
./lc0 benchmark -w 703810.pb.gz --threads=1 --backend=onednn --backend-opts=gpu=0
```

@borg323 borg323 merged commit 6b1f83e into LeelaChessZero:master Jun 16, 2021
@borg323 borg323 deleted the onednn_back branch June 16, 2021 11:42

aochoam commented Mar 12, 2022

@borg323 Result of sanity checking the dx12 driver:

```
       _
|   _ | |
|_ |_ |_| v0.28.2 built Dec 13 2021
Detected 4 core(s) and 8 thread(s) in 1 group(s).
Group 0 has 4 core(s) and 8 thread(s).
Found pb network file: C:\Arena\Engines\lc0_dx12/771479.pb.gz
Creating backend [check]...
Working backend set to dx12.
Reference backend set to eigen.
Creating backend [dx12]...
Creating backend [eigen]...
Using Eigen version 3.3.7
Eigen max batch size is 256.
Check mode: check only with relative tolerance 1.0e-04, absolute tolerance 5.0e-01.
Check rate: 100%.

Position: 1/1 rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
*** ERROR check failed for a batch of 1 policy incorrect (but value ok).
*** ERROR check failed for a batch of 19 policy incorrect (but value ok).
Benchmark time 6218 ms, 20 nodes, 21 nps, move b1c3
*** ERROR check failed for a batch of 20 policy incorrect (but value ok).
Benchmark time 6260 ms, 21 nodes, 21 nps, move b1a3
Benchmark time 10007 ms, 21 nodes, 4 nps, move b1a3
bestmove b1a3
*** ERROR check failed for a batch of 144 both value and policy incorrect.
*** ERROR check failed for a batch of 256 both value and policy incorrect.

===========================
Total time (ms) : 18655
Nodes searched : 421
Nodes/second : 23
Press any key to continue . . .
```


aochoam commented Mar 12, 2022

This version does not work for me.
