ThreadSanitizer is an effective approach to locate data races in concurrent code bases. This is an experimental extension of the OCaml compiler to add ThreadSanitizer support to OCaml 5.0.
-
OCaml ThreadSanitizer support inherits the portability limitations of ThreadSanitizer: it is not available on Windows.
-
TSan investigates a particular execution and therefore will not detect races in code paths that are not visited.
-
TSan may still report false positives in some rare cases (see section 6.4 of the WBIA ’09 paper below).
-
Currently, ThreadSanitizer detection only works for native x86_64 executables; not bytecode programs, nor other architectures.
For Linux, you first need to install libunwind stack unwinding library on your system. Most Linux distributions provide a package.
Then you will need to work in a dedicated opam switch:
opam update opam switch create 5.0.0+tsan
Building the switch may take a bit longer than usual due to some unavoidable ThreadSanitizer-incurred slowdowns in parts of the process. Expect a few minutes of build time.
On this switch, your programs will now be compiled with ThreadSanitizer
instrumentation, including the packages you may install with opam
.
If you link against C files, it is best to instrument them to detect data races
in them as well, by passing -fsanitize=thread
to GCC or clang. Note that if
you use Dune, this setting can be added to a Dune profile by adding e.g. the
following to your dune-workspace
:
(env (tsan (c_flags -fsanitize=thread)))
Then you only need to select this profile, e.g. dune exec --profile tsan
myexecutable.exe
or dune runtest --profile tsan
.
The simplest of data races can be created by having two accesses to the same mutable location, one of them being a write:
type t = { mutable x : int }
let v = { x = 0 }
let () =
let t1 = Domain.spawn (fun () -> v.x <- 10; Unix.sleep 1) in
let t2 = Domain.spawn (fun () -> ignore v.x; Unix.sleep 1) in
Domain.join t1;
Domain.join t2
Running this program with ThreadSanitizer instrumentation will output a report looking like this:
================== WARNING: ThreadSanitizer: data race (pid=4170290) Read of size 8 at 0x7f072bbfe498 by thread T4 (mutexes: write M0): #0 camlSimpleRace__fun_524 /tmp/simpleRace.ml:7 (simpleRace.exe+0x43dc9d) #1 camlStdlib__Domain__body_696 /home/olivier/.opam/5.0.0+tsan/.opam-switch/build/ocaml-variants.5.0.0+tsan/stdlib/domain.ml:202 (simpleRace.exe+0x47b5dc) #2 caml_start_program ??:? (simpleRace.exe+0x4f51c3) #3 caml_callback_exn /home/olivier/.opam/5.0.0+tsan/.opam-switch/build/ocaml-variants.5.0.0+tsan/runtime/callback.c:168 (simpleRace.exe+0x4c2b93) #4 caml_callback /home/olivier/.opam/5.0.0+tsan/.opam-switch/build/ocaml-variants.5.0.0+tsan/runtime/callback.c:256 (simpleRace.exe+0x4c36e3) #5 domain_thread_func /home/olivier/.opam/5.0.0+tsan/.opam-switch/build/ocaml-variants.5.0.0+tsan/runtime/domain.c:1093 (simpleRace.exe+0x4c6ad1) Previous write of size 8 at 0x7f072bbfe498 by thread T1 (mutexes: write M1): #0 camlSimpleRace__fun_520 /tmp/simpleRace.ml:6 (simpleRace.exe+0x43dc45) #1 camlStdlib__Domain__body_696 /home/olivier/.opam/5.0.0+tsan/.opam-switch/build/ocaml-variants.5.0.0+tsan/stdlib/domain.ml:202 (simpleRace.exe+0x47b5dc) #2 caml_start_program ??:? (simpleRace.exe+0x4f51c3) #3 caml_callback_exn /home/olivier/.opam/5.0.0+tsan/.opam-switch/build/ocaml-variants.5.0.0+tsan/runtime/callback.c:168 (simpleRace.exe+0x4c2b93) #4 caml_callback /home/olivier/.opam/5.0.0+tsan/.opam-switch/build/ocaml-variants.5.0.0+tsan/runtime/callback.c:256 (simpleRace.exe+0x4c36e3) #5 domain_thread_func /home/olivier/.opam/5.0.0+tsan/.opam-switch/build/ocaml-variants.5.0.0+tsan/runtime/domain.c:1093 (simpleRace.exe+0x4c6ad1) Mutex M0 (0x000000567ad8) created at: #0 pthread_mutex_init /home/olivier/other_projects/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1316 (libtsan.so.0+0x3cafb) [...] SUMMARY: ThreadSanitizer: data race /tmp/simpleRace.ml:7 in camlSimpleRace__fun_524 ================== ThreadSanitizer: reported 1 warnings
If the mutable field is replaced with an Atomic
reference, the warning
disappears:
let v = Atomic.make 0
let () =
let t1 = Domain.spawn (fun () -> Atomic.set v 10; Unix.sleep 1) in
let t2 = Domain.spawn (fun () -> ignore (Atomic.get v); Unix.sleep 1) in
Domain.join t1;
Domain.join t2
Synchronizing the two accesses above by busy-waiting on an atomic boolean will be detected by ThreadSanitizer and no data race will be reported:
type t = { mutable x : int }
let v = { x = 0 }
let v_modified = Atomic.make false
let () =
let t1 =
Domain.spawn (fun () ->
v.x <- 10;
Atomic.set v_modified true;
Unix.sleep 1)
in
let t2 =
Domain.spawn (fun () ->
while not (Atomic.get v_modified) do () done;
ignore v.x;
Unix.sleep 1)
in
Domain.join t1;
Domain.join t2
More efficiently, such synchronization can be implemented using a
Mutex.t
with the same result.
There are two components to ThreadSanitizer (TSan): 1. A runtime library to track accesses to shared data and report races 2. A compiler instrumentation feature that inserts calls to that runtime library.
Internally the runtime library associates with each word of application memory at least 2 "shadow words". Each shadow word contains information about a recent memory access to that word, including a "scalar clock". Those clocks serve to establish a happens-before (HB) relation, i.e. an event orderings that we are certain of.
This information is maintained as a "shadow state" in a separate memory region, and updated at every (instrumented) memory access. A data race is reported every time two memory accesses are made to overlapping memory regions, and: - one of them is a write, and - there is no established happens-before relation between them. More information about TSan’s algorithm on their wiki.
ThreadSanitizer for OCaml incurs a slowdown and increases memory consumption. Preliminary benchmarks show a slowdown in the range 7x-13x and a memory consumption increase in the range 2x-7x.
OCaml-TSan is still considered experimental: although tested on large projects, there is a slight chance that it introduces crashes in your code. If you encounters problems using it, please do let us know and we will do our best to improve things.
-
My program crashed when compiled on your ThreadSanitizer switch.
Try to check whether your program pushes a lot of stack frames on the stack. A call tree of depth >64k can cause a buffer overflow in the ThreadSanitizer runtime. This is a ThreadSanitizer limitation that we are trying to work around. -
TSan warns me about this data race, but it’s benign / I don’t want to deal with it for now.
You can use a suppressions file to silence TSan warnings.
-
Google Sanitizer wiki:
-
TSan C/C++ Manual: https://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual
-
TSan Algorithm: https://github.com/google/sanitizers/wiki/ThreadSanitizerAlgorithm
-
-
A good talk about TSan’s internals: ""go test -race" Under the Hood" by Kavya Joshi
-
More information about the project status and technical details: https://github.com/ocaml-multicore/ocaml-tsan/wiki/Status-of-ThreadSanitizer-for-OCaml