Skip to content

Commit

Permalink
Try to close race condition in FileWatching tests (#38407) (#39978)
Browse files Browse the repository at this point in the history
* Try to close race condition in FreeBSD tests

We're seeing frequent test failures in the FileWatching test on
FreeBSD. Here's my theory of what happens:
 - Both the timer callback and the poll callback execute on the same libuv loop
 - They each schedule their respective tasks
 - Whichever task gets scheduled first first determines the result

However, in the test, we expect that, if the poll callback ran, (which we know
because we know there was an event pending), then that result does
actually get delivered to the toplevel task. This PR tries to close
this hole by adding the following condition:

If the task is no longer waiting on the file watcher (because libuv already
scheduled it), then wait for the task to run to completion, independent
of any timeout. I believe this should close the above race condition
and hopefully fix the test.

* Add another super-short timeout to try to trigger the same-tick issue

(cherry picked from commit 9a8a675)

Co-authored-by: Keno Fischer <keno@juliacomputing.com>
  • Loading branch information
2 people authored and staticfloat committed Dec 22, 2022
1 parent 08b711d commit e2690a0
Show file tree
Hide file tree
Showing 2 changed files with 15 additions and 5 deletions.
18 changes: 14 additions & 4 deletions stdlib/FileWatching/src/FileWatching.jl
Original file line number Diff line number Diff line change
Expand Up @@ -254,6 +254,10 @@ mutable struct _FDWatcher
end
end

function iswaiting(fwd::_FDWatcher, t::Task)
return fwd.notify.waitq === t.queue
end

mutable struct FDWatcher
watcher::_FDWatcher
readable::Bool
Expand Down Expand Up @@ -653,10 +657,10 @@ function poll_fd(s::Union{RawFD, Sys.iswindows() ? WindowsRawSocket : Union{}},
try
if timeout_s >= 0
result::FDEvent = FDEvent()
timer = Timer(timeout_s) do t
notify(wt)
end
@async begin
t = @async begin
timer = Timer(timeout_s) do t
notify(wt)
end
try
result = wait(fdw, readable=readable, writable=writable)
catch e
Expand All @@ -666,6 +670,12 @@ function poll_fd(s::Union{RawFD, Sys.iswindows() ? WindowsRawSocket : Union{}},
notify(wt)
end
wait(wt)
# It's possible that both the timer and the poll fired on the same
# libuv loop. In that case, which event we see here first depends
# on task schedule order. If we can see that the task isn't waiting
# on the file watcher anymore, just let it finish so we can see
# the modification to `result`
iswaiting(fdw, t) || wait(t)
return result
else
return wait(fdw, readable=readable, writable=writable)
Expand Down
2 changes: 1 addition & 1 deletion stdlib/FileWatching/test/runtests.jl
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ using Base: uv_error, Experimental
# Writable ends are always tested for write-ability before a write

n = 20
intvls = [2, .2, .1, .005]
intvls = [2, .2, .1, .005, .00001]

pipe_fds = fill((Base.INVALID_OS_HANDLE, Base.INVALID_OS_HANDLE), n)
for i in 1:n
Expand Down

0 comments on commit e2690a0

Please sign in to comment.