Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(stdune): fix TSAN warning in wait4_stubs #10554

Merged
merged 2 commits into from
May 23, 2024
Merged

Conversation

emillon
Copy link
Collaborator

@emillon emillon commented May 21, 2024

Fixes #10553

Quoting @jmid, using a local variable without the runtime lock in place, is against the rules. For integer values, sometimes the rules are bent, but this is not a good idea. See ocaml/ocaml#12737.

Fixes ocaml#10553

Quoting @jmid, using a local variable without the runtime lock in place,
is against the rules. For integer values, sometimes the rules are bent,
but this is not a good idea. See ocaml/ocaml#12737.

Signed-off-by: Etienne Millon <me@emillon.org>
Copy link

@OlivierNicole OlivierNicole left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect this to remove the TSan warning.

@emillon
Copy link
Collaborator Author

emillon commented May 21, 2024

Yes, I'm double checking it does indeed.

@emillon
Copy link
Collaborator Author

emillon commented May 21, 2024

It's displaying another warning:

WARNING: ThreadSanitizer: data race (pid=310426)
  Atomic write of size 8 at 0x7fff7f467ce0 by thread T4 (mutexes: write M0, write M1):
    #0 oldify_one runtime/minor_gc.c:250 (dune.exe+0xe7153d) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #1 caml_do_local_roots runtime/roots.c:64 (dune.exe+0xe767f6) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #2 caml_thread_scan_roots /tmp/tsan/_opam/.opam-switch/build/ocaml-variants.5.2.0+options/otherlibs/systhreads/st_stubs.c:179 (dune.exe+0xe21bad) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #3 caml_empty_minor_heap_promote runtime/minor_gc.c:615 (dune.exe+0xe72861) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #4 caml_stw_empty_minor_heap_no_major_slice runtime/minor_gc.c:746 (dune.exe+0xe72c9d) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #5 caml_stw_empty_minor_heap runtime/minor_gc.c:777 (dune.exe+0xe72f72) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #6 caml_try_run_on_all_domains_with_spin_work runtime/domain.c:1576 (dune.exe+0xe4dedd) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #7 caml_try_empty_minor_heap_on_all_domains runtime/minor_gc.c:808 (dune.exe+0xe730d1) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #8 caml_empty_minor_heaps_once runtime/minor_gc.c:829 (dune.exe+0xe730d1)
    #9 caml_poll_gc_work runtime/domain.c:1745 (dune.exe+0xe4d760) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #10 caml_handle_gc_interrupt runtime/domain.c:1778 (dune.exe+0xe4e595) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #11 caml_do_pending_actions_exn runtime/signals.c:341 (dune.exe+0xe7c423) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #12 caml_process_pending_actions_with_root_exn runtime/signals.c:386 (dune.exe+0xe7c423)
    #13 caml_process_pending_actions_with_root runtime/signals.c:395 (dune.exe+0xe7c537) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #14 caml_process_pending_actions runtime/signals.c:406 (dune.exe+0xe7c537)
    #15 caml_garbage_collection runtime/signals_nat.c:71 (dune.exe+0xe8996c) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #16 caml_call_gc <null> (dune.exe+0xe87551) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #17 camlDune_engine__Scheduler.run_2600 src/dune_engine/scheduler.ml:592 (dune.exe+0xbc5502) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #18 camlDune_engine__Scheduler.f_1696 src/dune_engine/scheduler.ml:97 (dune.exe+0xbc262d) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #19 camlThread.fun_768 /tmp/tsan/_opam/.opam-switch/build/ocaml-variants.5.2.0+options/otherlibs/systhreads/thread.ml:48 (dune.exe+0xd4d800) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #20 caml_start_program <null> (dune.exe+0xe878c3) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #21 caml_callback_exn runtime/callback.c:201 (dune.exe+0xe48b83) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #22 caml_thread_start /tmp/tsan/_opam/.opam-switch/build/ocaml-variants.5.2.0+options/otherlibs/systhreads/st_stubs.c:634 (dune.exe+0xe23547) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)

  Previous read of size 8 at 0x7fff7f467ce0 by main thread:
    #0 caml_unix_mkdir /tmp/tsan/_opam/.opam-switch/build/ocaml-variants.5.2.0+options/otherlibs/unix/mkdir.c:37 (dune.exe+0xe2461f) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #1 caml_c_call <null> (dune.exe+0xe877a7) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #2 camlStdune__Fpath.mkdir_inner_1135 otherlibs/stdune/src/fpath.ml:19 (dune.exe+0xce91e2) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #3 camlStdune__Fpath.mkdir_p_772 otherlibs/stdune/src/fpath.ml:31 (dune.exe+0xce9353) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #4 camlStdune__Path.mkdir_p_5475 otherlibs/stdune/src/path.ml:598 (dune.exe+0xcd015d) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #5 camlDune_engine__Build_system.maybe_async_rule_file_op_3529 src/dune_engine/build_system.ml:324 (dune.exe+0xb2162d) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #6 camlDune_engine__Build_system.f_6677 src/dune_engine/build_system.ml:500 (dune.exe+0xb23b18) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #7 caml_apply2 <null> (dune.exe+0x5d9317) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #8 camlFiber__Scheduler.advance_1194 vendor/fiber/src/scheduler.ml:194 (dune.exe+0xc5cdf8) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #9 camlFiber.loop_393 vendor/fiber/src/fiber.ml:15 (dune.exe+0xc5261b) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #10 camlDune_engine__Scheduler.run_3178 vendor/fiber/src/fiber.ml:17 (dune.exe+0xbc7fd6) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #11 caml_startup_common runtime/startup_nat.c:132 (dune.exe+0xe87070) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #12 caml_startup_common runtime/startup_nat.c:88 (dune.exe+0xe87070)
    #13 caml_startup_exn runtime/startup_nat.c:139 (dune.exe+0xe87117) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #14 caml_startup runtime/startup_nat.c:144 (dune.exe+0xe87117)
    #15 caml_main runtime/startup_nat.c:151 (dune.exe+0xe87117)
    #16 main runtime/main.c:37 (dune.exe+0x5cdbe5) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)

  Location is stack of main thread.

  Location is global '<null>' at 0x000000000000 ([stack]+0x1ece0)

  Mutex M0 (0x720c000003f0) created at:
    #0 pthread_mutex_init ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1315 (libtsan.so.2+0x550bf) (BuildId: 4696c58f898e5d5cf4fe08e45868ac4fc5702473)
    #1 sync_mutex_create runtime/sync_posix.h:46 (dune.exe+0xe7eeba) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #2 caml_ml_mutex_new runtime/sync.c:77 (dune.exe+0xe7eeba)
    #3 caml_c_call <null> (dune.exe+0xe877a7) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #4 camlDune_engine__Scheduler.init_2603 src/dune_engine/scheduler.ml:598 (dune.exe+0xbc55fb) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #5 camlDune_engine__Scheduler.prepare_3124 src/dune_engine/scheduler.ml:934 (dune.exe+0xbc7728) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #6 camlDune_engine__Scheduler.go_inner_5417 src/dune_engine/scheduler.ml:1249 (dune.exe+0xbca8c8) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #7 camlCmdliner_term.fun_639 vendor/cmdliner/src/cmdliner_term.ml:24 (dune.exe+0xd31242) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #8 camlCmdliner_eval.run_parser_576 vendor/cmdliner/src/cmdliner_eval.ml:34 (dune.exe+0xd28d05) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #9 camlCmdliner_eval.eval_value_inner_1700 vendor/cmdliner/src/cmdliner_eval.ml:202 (dune.exe+0xd2a584) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #10 camlMain.entry bin/main.ml:106 (dune.exe+0x5d9b29) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #11 caml_program <null> (dune.exe+0x5d15b9) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #12 caml_start_program <null> (dune.exe+0xe878c3) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #13 caml_startup_common runtime/startup_nat.c:132 (dune.exe+0xe87070) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #14 caml_startup_common runtime/startup_nat.c:88 (dune.exe+0xe87070)
    #15 caml_startup_exn runtime/startup_nat.c:139 (dune.exe+0xe87117) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #16 caml_startup runtime/startup_nat.c:144 (dune.exe+0xe87117)
    #17 caml_main runtime/startup_nat.c:151 (dune.exe+0xe87117)
    #18 main runtime/main.c:37 (dune.exe+0x5cdbe5) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)

  Mutex M1 (0x561dd37603c8) created at:
    #0 pthread_mutex_init ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1315 (libtsan.so.2+0x550bf) (BuildId: 4696c58f898e5d5cf4fe08e45868ac4fc5702473)
    #1 caml_plat_mutex_init runtime/platform.c:57 (dune.exe+0xe75aa2) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #2 caml_init_domains runtime/domain.c:943 (dune.exe+0xe4ca7e) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #3 caml_init_gc runtime/gc_ctrl.c:353 (dune.exe+0xe5aa5f) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #4 caml_startup_common runtime/startup_nat.c:111 (dune.exe+0xe86f5f) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #5 caml_startup_common runtime/startup_nat.c:88 (dune.exe+0xe86f5f)
    #6 caml_startup_exn runtime/startup_nat.c:139 (dune.exe+0xe87117) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #7 caml_startup runtime/startup_nat.c:144 (dune.exe+0xe87117)
    #8 caml_main runtime/startup_nat.c:151 (dune.exe+0xe87117)
    #9 main runtime/main.c:37 (dune.exe+0x5cdbe5) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)

  Thread T4 (tid=310432, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1022 (libtsan.so.2+0x564a6) (BuildId: 4696c58f898e5d5cf4fe08e45868ac4fc5702473)
    #1 st_thread_create /tmp/tsan/_opam/.opam-switch/build/ocaml-variants.5.2.0+options/otherlibs/systhreads/st_pthreads.h:57 (dune.exe+0xe236b5) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #2 caml_thread_new /tmp/tsan/_opam/.opam-switch/build/ocaml-variants.5.2.0+options/otherlibs/systhreads/st_stubs.c:685 (dune.exe+0xe236b5)
    #3 caml_c_call <null> (dune.exe+0xe877a7) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #4 camlThread.create_288 /tmp/tsan/_opam/.opam-switch/build/ocaml-variants.5.2.0+options/otherlibs/systhreads/thread.ml:45 (dune.exe+0xd4d779) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #5 camlDune_engine__Scheduler.spawn_1693 src/dune_engine/scheduler.ml:102 (dune.exe+0xbc2595) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #6 camlDune_engine__Scheduler.init_2603 src/dune_engine/scheduler.ml:605 (dune.exe+0xbc5680) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #7 camlDune_engine__Scheduler.prepare_3124 src/dune_engine/scheduler.ml:934 (dune.exe+0xbc7728) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #8 camlDune_engine__Scheduler.go_inner_5417 src/dune_engine/scheduler.ml:1249 (dune.exe+0xbca8c8) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #9 camlCmdliner_term.fun_639 vendor/cmdliner/src/cmdliner_term.ml:24 (dune.exe+0xd31242) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #10 camlCmdliner_eval.run_parser_576 vendor/cmdliner/src/cmdliner_eval.ml:34 (dune.exe+0xd28d05) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #11 camlCmdliner_eval.eval_value_inner_1700 vendor/cmdliner/src/cmdliner_eval.ml:202 (dune.exe+0xd2a584) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #12 camlMain.entry bin/main.ml:106 (dune.exe+0x5d9b29) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #13 caml_program <null> (dune.exe+0x5d15b9) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #14 caml_start_program <null> (dune.exe+0xe878c3) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #15 caml_startup_common runtime/startup_nat.c:132 (dune.exe+0xe87070) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #16 caml_startup_common runtime/startup_nat.c:88 (dune.exe+0xe87070)
    #17 caml_startup_exn runtime/startup_nat.c:139 (dune.exe+0xe87117) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)
    #18 caml_startup runtime/startup_nat.c:144 (dune.exe+0xe87117)
    #19 caml_main runtime/startup_nat.c:151 (dune.exe+0xe87117)
    #20 main runtime/main.c:37 (dune.exe+0x5cdbe5) (BuildId: 690605e4ac77a02bdcda3b1ee7b57cd7f73853b2)

SUMMARY: ThreadSanitizer: data race runtime/minor_gc.c:250 in oldify_one
==================
ThreadSanitizer: reported 1 warnings                
make: *** [Makefile:55: dev] Error 66

This seems to come from ocaml itself:

https://github.com/ocaml/ocaml/blob/5.2.0/otherlibs/unix/mkdir.c#L36-L38

@jmid
Copy link
Contributor

jmid commented May 21, 2024

Good catch! I initially found one in Sys.mkdir: https://github.com/ocaml/ocaml/blob/trunk/runtime/sys.c#L376

emillon added a commit to emillon/ocaml that referenced this pull request May 22, 2024
As discussed in ocaml#12737, using `Int_val` inside blocking sections can
cause data races and is now seen as a bad idea.

(this causes a TSAN warning when using Dune, see ocaml/dune#10554)
@emillon
Copy link
Collaborator Author

emillon commented May 22, 2024

See ocaml/ocaml#13188

emillon added a commit to emillon/ocaml that referenced this pull request May 22, 2024
As discussed in ocaml#12737, using `Int_val` inside blocking sections can
cause data races and is now seen as a bad idea.

(this causes a TSAN warning when using Dune, see ocaml/dune#10554)
@emillon emillon merged commit f998110 into ocaml:main May 23, 2024
25 of 28 checks passed
@emillon emillon deleted the tsan-wait4-pid branch May 23, 2024 09:19
@emillon emillon mentioned this pull request May 24, 2024
13 tasks
emillon added a commit to emillon/dune that referenced this pull request May 24, 2024
Fixes ocaml#10553

Quoting @jmid, using a local variable without the runtime lock in place,
is against the rules. For integer values, sometimes the rules are bent,
but this is not a good idea. See ocaml/ocaml#12737.

Signed-off-by: Etienne Millon <me@emillon.org>
emillon added a commit to emillon/dune that referenced this pull request May 24, 2024
Fixes ocaml#10553

Quoting @jmid, using a local variable without the runtime lock in place,
is against the rules. For integer values, sometimes the rules are bent,
but this is not a good idea. See ocaml/ocaml#12737.

Signed-off-by: Etienne Millon <me@emillon.org>
emillon added a commit that referenced this pull request May 24, 2024
Fixes #10553

Quoting @jmid, using a local variable without the runtime lock in place,
is against the rules. For integer values, sometimes the rules are bent,
but this is not a good idea. See ocaml/ocaml#12737.

Signed-off-by: Etienne Millon <me@emillon.org>
emillon added a commit to emillon/opam-repository that referenced this pull request May 24, 2024
CHANGES:

### Fixed

- Fix interpretation of `exists_if` predicate in `META` files of installed
  libraries containing more than one element. (ocaml/dune#10564, fixes ocaml/dune#10563, @dbuenzli,
  @nojb)

- Fix TSAN warning in wait4 stubs (ocaml/dune#10554, fixes ocaml/dune#10553, @emillon)
emillon added a commit to emillon/ocaml that referenced this pull request May 29, 2024
As discussed in ocaml#12737, using `Int_val` inside blocking sections can
cause data races and is now seen as a bad idea.

(this causes a TSAN warning when using Dune, see ocaml/dune#10554)
MA0010 pushed a commit to MA0010/dune that referenced this pull request Jun 5, 2024
Fixes ocaml#10553

Quoting @jmid, using a local variable without the runtime lock in place,
is against the rules. For integer values, sometimes the rules are bent,
but this is not a good idea. See ocaml/ocaml#12737.

Signed-off-by: Etienne Millon <me@emillon.org>
avsm pushed a commit to avsm/opam-repository that referenced this pull request Sep 5, 2024
CHANGES:

### Fixed

- Fix interpretation of `exists_if` predicate in `META` files of installed
  libraries containing more than one element. (ocaml/dune#10564, fixes ocaml/dune#10563, @dbuenzli,
  @nojb)

- Fix TSAN warning in wait4 stubs (ocaml/dune#10554, fixes ocaml/dune#10553, @emillon)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failure to build with TSan
3 participants