idleworker() function. #14736
I'm not sure how to proceed with writing test cases for this.
see edit: sorry, I could've sworn we used to have it set up so parallel.jl waited until other test workers were done then ran at the end, but it looks like that might not be the case any more. hard to tell
It looks like
CI passing now. There may well be a more elegant way to implement Any comments on the interface before I continue to move in the direction described here: #12943 (comment) @StefanKarpinski, @JeffBezanson, @jakebolewski, @amitmurthy?
w = workers()
@test length(w) == 3
t1 = now()
The format here is very weird.
Hi @yuyichao, I apologise if this looks weird to you.
I found that putting the temporal comments and assertions into a separate column made it easier to understand.
Please follow the code formatting guideline.
Hi @amitmurthy, Thanks for the feedback.
Re: 1. I suspect that with all-to-all communication there is no way to identify an idle worker. There would always be a race condition where some other node makes the "idle" node busy right after it is identified as idle. What if this instead of being called
Re: 2. I suppose that it wouldn't be too hard to have a flag to keep track of which workers are busy with locally originated asynchronous
I wonder whether it might not be better to encourage use of the blocking
In any case, what I'm aiming for here is a function to call when I want to ask for "a worker that I haven't already given a job to." I'm open to suggestions about what the function should be called.
@amitmurthy, another thing to note is that this PR does not export
Perhaps we can leave the design of a public "get me a worker" API for later.
Wouldn't it be simpler to run off a Q at an application level? A contrived example:
Hi @amitmurthy, using a shared queue to keep track of available workers makes a lot of sense. What do you think of having a built-in default worker queue, something like this...
type WorkerPool
    channel::RemoteChannel{Channel{Int}}
end
function WorkerPool(workers::Vector{Int})
    # Create a shared queue of workers...
    pool = RemoteChannel(()->Channel{Int}(128))
    # Check that workers are not already part of a pool...
    check = () -> if :_worker_pool in names(Main)
        error("Worker $(myid()) already in a WorkerPool!")
    end
    foreach(fetch, [@spawnat w check() for w in workers])
    # Put each worker into the pool...
    for w in workers
        put!(pool, w)
        @spawnat w global _worker_pool = pool
    end
    WorkerPool(pool)
end
WorkerPool(n::Integer) = WorkerPool(addprocs(n))
WorkerPool() = WorkerPool(addprocs())
Base.take!(pool::WorkerPool) = take!(pool.channel)
function Base.remotecall_fetch(f, pool::WorkerPool, args...)
    l = (args...)->try f(args...) finally put!(_worker_pool, myid()) end
    remotecall_fetch(l, take!(pool), args...)
end
default_worker_pool() = _default_worker_pool
global _default_worker_pool = WorkerPool(workers())
function Base.remotecall_fetch(f, args...)
    remotecall_fetch(f, default_worker_pool(), args...)
end
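For comparison, the Distributed standard library in current Julia has a `WorkerPool` type with a similar shape: `remotecall_fetch(f, pool, args...)` takes a free worker from the pool, runs `f` there, and returns the worker to the pool afterwards. The following is a minimal sketch of that usage, assuming two local workers started with `addprocs(2)`; it illustrates the pooled-call pattern only and is not the code proposed above.

```julia
using Distributed

# Start two local worker processes for the demonstration (an assumption of
# this sketch; any set of worker ids would do).
added = addprocs(2)

# Distributed.WorkerPool keeps the ids of its free workers in a channel.
pool = WorkerPool(added)

# Each call takes a free worker from the pool, runs the function there, and
# puts the worker back afterwards. asyncmap keeps several calls in flight,
# so a call blocks only while every worker in the pool is taken.
squares = asyncmap(x -> remotecall_fetch(abs2, pool, x), 1:6)

# Ask which worker served each of four requests.
served_by = asyncmap(_ -> remotecall_fetch(myid, pool), 1:4)

# Shut the demonstration workers down again.
rmprocs(added)
```

Because the caller never names a worker id, this gives the "a worker that I haven't already given a job to" behaviour discussed in this thread without exposing an idle/busy query.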
A worker pool will definitely be useful. I suspect there will be some debate about whether to include it in Base or have it as part of an external package. Since we do not yet have a "standard library" or "standard packages", I am OK with having it in Base for now.
Superseded by #15073
idleworker() returns a worker that is not currently busy (not being waited for by remotecall_fetch or remotecall_wait).
If there are no idle workers, idleworker() blocks until a worker is available.
This is intended as a more general interface for the type of dynamic scheduling provided by pmap.
e.g. pmap might eventually be a simple combination of amap() and idleworker() as described in #14843...
Implementation notes:
- RemoteValue.waitingfor field to identify busy workers.
- remotecall_fetch and remotecall_wait to set rv.waitingfor back to 0 after waiting.
- ProcessGroup.worker_is_idle::Condition to enable waiting for multiple busy workers.
- idleworker() waits on worker_is_idle.
- remotecall_fetch and remotecall_wait notify worker_is_idle just after setting rv.waitingfor = 0.
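The wait/notify scheme in these notes can be modelled in a single process. The sketch below is only an illustration under stated assumptions: `IdleTracker` and `run_and_release` are hypothetical names, the "workers" are plain integer ids, and the jobs are local tasks rather than remote calls. It also differs from the PR in one respect: `idleworker` here marks the worker busy before returning it, so that two callers cannot be handed the same id.

```julia
# Hypothetical single-process model of the notes above: "workers" are plain
# integer ids and jobs run as local tasks, so no remote calls are involved.
struct IdleTracker
    busy::Dict{Int,Bool}        # stands in for the per-worker waitingfor flag
    worker_is_idle::Condition   # notified whenever a worker becomes idle
end

IdleTracker(ids) = IdleTracker(Dict(id => false for id in ids), Condition())

# Return an idle worker id, blocking until one is available. Unlike the PR,
# this sketch marks the worker busy before returning it, so that two callers
# can never be handed the same id.
function idleworker(t::IdleTracker)
    while true
        for id in sort!(collect(keys(t.busy)))
            if !t.busy[id]
                t.busy[id] = true
                return id
            end
        end
        wait(t.worker_is_idle)
    end
end

# Run f "on" worker id, then clear the flag and wake any waiting caller.
function run_and_release(f, t::IdleTracker, id::Int)
    try
        return f()
    finally
        t.busy[id] = false
        notify(t.worker_is_idle)
    end
end

tracker = IdleTracker([1, 2])
used = Int[]
@sync for job in 1:4
    w = idleworker(tracker)     # blocks while both ids are busy
    @async run_and_release(tracker, w) do
        push!(used, w)
        sleep(0.01)             # pretend to do some work
    end
end
```

With two ids and four jobs, the third call to `idleworker` has to wait until one of the first two tasks finishes and notifies the condition, which is the blocking behaviour the description asks for.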