Jk/in parallel #3
I prefer

    group(parallel: true, threads: 10) do
      group(serial: true) do
      end
    end

Also, is there value in this approach, or does it make more sense to have

    remote_file '/var/foo' do
      source '...'
      async true
      notifies :restart, 'service[bar]'
    end

I think people are also more familiar with the word "async". |
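To make the `threads: 10` idea concrete, here is a minimal sketch of a bounded worker pool in plain Ruby. `run_group` is a hypothetical helper, not something Chef provides; it only assumes that each step is a callable and that `threads` caps how many run at once.

```ruby
# Hypothetical helper, not a Chef API: run each step with at most
# `threads` workers and return the results in the original order.
def run_group(steps, threads: 10)
  queue = Queue.new
  steps.each_with_index { |step, index| queue << [index, step] }
  queue.close # once closed and drained, pop returns nil

  results = Array.new(steps.size)
  workers = Array.new([threads, steps.size].min) do
    Thread.new do
      while (job = queue.pop)
        index, step = job
        results[index] = step.call
      end
    end
  end
  workers.each(&:join) # the group is finished only when every step is
  results
end
```

With `threads: 1` the same helper degenerates into a serial group, which is roughly what nesting `group(serial: true)` inside a parallel group would need.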
I like the
|
@jkeiser they are different use cases (I think):

    template 'foo'
    group(parallel: true) do
      thing_1
      thing_2
    end
    template 'bar'

In my mind, this says: create the foo template, run (thing_1 and thing_2) in parallel, create the bar template. Whereas:

    template 'foo'
    thing_1 do
      async true
    end
    thing_2 do
      async true
    end
    template 'bar'

says: create the foo template, schedule thing_1 at your earliest convenience, schedule thing_2 at your earliest convenience, create the bar template. |
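The difference between the two models can be sketched with plain Ruby threads. `parallel_group` and `async` below are hypothetical stand-ins rather than Chef APIs, and the `sleep` calls only simulate slow work.

```ruby
# Hypothetical stand-ins for the two models; neither is a Chef API.

# Group: start every step in its own thread, then wait for all of them.
def parallel_group(steps)
  steps.map { |step| Thread.new(&step) }.each(&:join)
end

# Async: start the step in the background and return immediately.
def async(&step)
  Thread.new(&step)
end

def run_with_group
  log = Queue.new
  log << 'template foo'
  parallel_group([-> { log << 'thing_1' }, -> { log << 'thing_2' }])
  log << 'template bar' # always recorded after thing_1 and thing_2
  Array.new(log.size) { log.pop }
end

def run_with_async
  log = Queue.new
  log << 'template foo'
  pending = [
    async { sleep 0.05; log << 'thing_1' },
    async { sleep 0.05; log << 'thing_2' }
  ]
  log << 'template bar' # recorded without waiting for thing_1 or thing_2
  pending.each(&:join)  # joined here only so the sketch can report the order
  Array.new(log.size) { log.pop }
end
```

In the group version the barrier sits at the end of the group; in the async version there is no barrier at all unless something explicitly waits.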
So we're clear, under an

    template 'foo'
    thing_1 do
      async true
    end
    template 'bar'
    thing_2 do
      async true
    end
|
Depends on what you mean by "parallel". If you mean "at the same exact time", no. If you mean "in a non-blocking thread", yes. |
It makes more sense with

    remote_file '/var/file' do
      source 'really-really-big-file'
      async true
      notifies :restart, 'service[foo]', :immediately
    end
    package 'foo'
    package 'bar'

This will queue the remote file for download, but it won't wait for the download to complete. That runs in a background thread and then executes notifications upon completion. |
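A rough sketch of that behaviour in plain Ruby, assuming the notification simply fires from the background thread once the action finishes. `AsyncResource` and `demo_async_download` are hypothetical names for illustration, not part of Chef.

```ruby
# Hypothetical stand-in for an async resource; not Chef's implementation.
class AsyncResource
  attr_reader :name

  def initialize(name, &action)
    @name = name
    @action = action
    @notifications = []
  end

  # Register a callback to run after the action has completed.
  def notifies(&callback)
    @notifications << callback
    self
  end

  # Start the action in a background thread and return at once.
  def run
    @thread = Thread.new do
      @action.call
      @notifications.each(&:call)
    end
    self
  end

  # Block until the action and its notifications are done.
  def wait
    @thread.join
    self
  end
end

def demo_async_download
  events = Queue.new
  download = AsyncResource.new('/var/file') { sleep 0.05; events << 'download finished' }
  download.notifies { events << 'restart service[foo]' }
  download.run                # queued, not awaited
  events << 'package foo'     # the run carries on immediately
  events << 'package bar'
  download.wait               # only so the sketch can collect every event
  Array.new(events.size) { events.pop }
end
```

The two package steps are recorded before the simulated download completes, and the restart runs only after it does.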
OK, now I see what you mean by async. They are indeed different use cases. There is a use for parallel_group (sometimes you want to create your three database servers before you create any web servers), and I can see uses for async too. I'll leave async for a later time, but I think it's a great idea. |
Jotting down a couple of the thoughts we chatted about during ChefConf, to see if any resonate:
a) Support "m out of n" semantics. Sample use case: provision a cluster of 30 nodes, but accept a first go where only 20 are successful (and possibly converge in subsequent runs towards the desired 30).
b) Allow for compensating activities. Use case: if you provision 10 nodes, but wanted 30, define a block that will handle whatever partial successes have already been performed. [1]
[1] Compensating transactions: https://en.wikipedia.org/wiki/Compensating_transaction |
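Both ideas can be sketched together in plain Ruby. `provision_at_least` and its `compensate:` parameter are hypothetical names used only for illustration; nothing here is proposed syntax.

```ruby
# Hypothetical helper: run all tasks in parallel and accept the outcome
# if at least `minimum` of them succeed. Otherwise hand the partial
# successes to `compensate` and raise.
def provision_at_least(minimum, tasks, compensate:)
  threads = tasks.map do |task|
    Thread.new do
      begin
        [:ok, task.call]
      rescue StandardError => e
        [:failed, e]
      end
    end
  end
  results = threads.map(&:value) # waits for every task to finish

  succeeded = results.select { |status, _| status == :ok }.map(&:last)
  return succeeded if succeeded.size >= minimum

  compensate.call(succeeded) # deal with whatever was already created
  raise "only #{succeeded.size} of #{tasks.size} succeeded, needed #{minimum}"
end
```

A later run could call the same helper again with only the missing nodes, which is one way to converge towards the desired count.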
Per conversation, I'm 👍 on some kind of parallelization, because I'd really like my recipes to execute in parallel when possible. |
@jkeiser - this RFC still relevant? I think I'm +1 on both this and on async. |
Just realized this needs to be in a community meeting to merge @adamhjk |
I was excited to find and read this RFC. Currently remote_file resources take up the bulk of the time for my runs, and making them run in parallel would be a huge speedup. The specific use case I thought about was using async on my remote_file resources so they would start downloading at the start of the run. When the chef-client gets to a remote_file resource in a recipe, it checks whether that resource has completed. If it has, the chef-client continues; if it has not, the chef-client halts and waits for the resource to complete. This would parallelize all remote_file resources regardless of which recipe they are in and would not break dependencies on those resources. From my understanding, the currently proposed version of async would start the resource when it comes to it in the recipe, and then the chef-client would continue on to the next resource. The only way to depend on the async resource would be to have a notifies, which would not work in my use cases since I need the dependent resource to run every time. The group format looks useful, but only for resources in the same recipe. Generally I only have one remote_file per recipe, which would prevent me from taking advantage of the group format. |
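The "start early, block when reached" behaviour described here is close to a future. A minimal sketch in plain Ruby, where `EagerDownloads`, `start`, and `await` are hypothetical names and the blocks stand in for real downloads:

```ruby
# Hypothetical sketch: begin every download at the start of the run and
# block only when the run reaches the resource that needs it.
class EagerDownloads
  def initialize
    @threads = {}
  end

  # Called up front: begin fetching in a background thread.
  def start(path, &fetch)
    @threads[path] = Thread.new(&fetch)
    self
  end

  # Called when the recipe reaches the resource: returns at once if the
  # download has finished, otherwise waits for it. Raises KeyError for a
  # path that was never started.
  def await(path)
    @threads.fetch(path).value
  end
end
```

Because `await` always returns the finished result, a dependent resource can run on every pass without relying on a notification.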
@Tech356 That's my interpretation as well; however, I think that's a necessary constraint. There are plenty of legitimate reasons you couldn't start immediately (perhaps the download directory, or the user that will own the file, is created via a recipe or package install partway through the chef run). You can't expect a set of assumptions to be figured out magically for you; you need to optimise explicitly. I would have suggested running the remote_file resources during compile time to get them to trigger early, while still keeping the logic with the relevant cookbooks, but how would running remote_file resources at compile time work? Is parallelism or async still possible at that point? |
@jeremyolliver I agree that there are constraints that are necessary, but in my case the remote files only depend on node attributes which are resolved during compile. I was thinking more along the lines of adding a couple options to
The last one is like what is currently proposed. The first one is like what I am looking for. Maybe there are other options that might be useful to others. |
👍 |
I won't be here until next week :) |
This PR is currently on the agenda for our next IRC developers' meeting. Please let me know if it gets merged or otherwise closed before then so that the agenda can be updated. Thanks! |
@jkeiser can you please add the appropriate copyright notice to this RFC before our meeting tomorrow?
|
👍 |
Once @jkeiser has added a clarification that "future things" will require a new RFC, this is approved for merge @chef/rfc-editors |
Squashed and merged in 5e13978. This was accepted as RFC044. |