state: Introduce DependsOn for N-to-1 job dependencies #1021
I really like the reduced nesting when enqueuing jobs!
I think I've found two things that need further inspection.
Thank you for fixing/addressing my concerns!
I'm still a bit unsure about the order/dependencies of some jobs.
id, err = idx.jobStore.EnqueueJob(job.Job{
if parseId != "" {
ids, err := idx.collectReferences(mcHandle, refCollectionDeps)
I didn't notice this before: why are we calling collectReferences here, before parsing variables, while in the walker we're calling it much later?
The ordering of scheduling is not significant - the *.tfvars parsing and *.tf references should be independent of each other, so it doesn't matter in what order they run. We just happen to schedule them in a different order in the two places; DependsOn and Defer (if any) decide in what order they actually get executed.
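To illustrate why the enqueue order does not matter, here is a minimal sketch of a toy, single-goroutine queue that only dispatches a job once everything listed in its DependsOn has finished. Queue, Job, JobID, Enqueue and Run are hypothetical names made up for this sketch, as are the job IDs; this is not the actual job store API.

```go
package main

import "fmt"

// JobID identifies a scheduled job (hypothetical type for this sketch).
type JobID string

// Job is a simplified stand-in for a queued job.
type Job struct {
	ID        JobID
	DependsOn []JobID      // every listed job must finish before this one is dispatched
	Func      func() error // the work itself; may be nil in this sketch
}

// Queue is a toy scheduler; it is an assumption, not the real job store.
type Queue struct {
	pending []Job
	done    map[JobID]bool
}

func NewQueue() *Queue { return &Queue{done: make(map[JobID]bool)} }

// Enqueue only records the job; it says nothing about when the job runs.
func (q *Queue) Enqueue(j Job) JobID {
	q.pending = append(q.pending, j)
	return j.ID
}

// ready reports whether all dependencies of j have finished.
func (q *Queue) ready(j Job) bool {
	for _, dep := range j.DependsOn {
		if !q.done[dep] {
			return false
		}
	}
	return true
}

// Run dispatches ready jobs until nothing is pending and returns the
// order in which the jobs were executed.
func (q *Queue) Run() ([]JobID, error) {
	var order []JobID
	for len(q.pending) > 0 {
		progressed := false
		var rest []Job
		for _, j := range q.pending {
			if !q.ready(j) {
				rest = append(rest, j) // try again in the next pass
				continue
			}
			if j.Func != nil {
				_ = j.Func() // the outcome is not passed on to dependents
			}
			q.done[j.ID] = true
			order = append(order, j.ID)
			progressed = true
		}
		if !progressed {
			return order, fmt.Errorf("%d job(s) have unsatisfiable dependencies", len(rest))
		}
		q.pending = rest
	}
	return order, nil
}

func main() {
	q := NewQueue()
	// The dependent job is enqueued first, yet it still runs last.
	q.Enqueue(Job{ID: "collect-refs", DependsOn: []JobID{"parse-tf", "obtain-schema"}})
	q.Enqueue(Job{ID: "parse-tfvars"})
	q.Enqueue(Job{ID: "obtain-schema"})
	q.Enqueue(Job{ID: "parse-tf"})

	order, err := q.Run()
	fmt.Println(order, err)
	// prints "[parse-tfvars obtain-schema parse-tf collect-refs] <nil>"
}
```

Independent jobs (parse-tfvars here) run whenever they are reached, while the job with two dependencies waits for both regardless of where it was enqueued.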
Okay! Thanks for the explanation!
We could align the ordering of scheduling, but there are also other differences - e.g. that when we parse submodules here, we assume that the modules are not initialised by themselves (given the context) and don't attempt to parse module manifest or obtain schema - both of which would otherwise be two more jobs which reference collection should depend on.
Co-authored-by: Daniel Banck <dbanck@users.noreply.github.com>
Background
Prior to this PR, the only way we could chain jobs was via Defer. This had one unfortunate limitation: a job could only depend on exactly one other job, i.e. there was just a 1-to-1 dependency.

As part of introducing a new job type for parsing provider versions in #1014, it became clear that the new job would need to rely on 2+ other jobs:
- required_providers
- required_providers of any submodules

I originally attempted to solve it in a hacky way by introducing
WaitForJobs() inside Defer, but that created a few other problems: at least two goroutines need to run, so that one can be blocked (waiting) while the other is dispatching, and I ran into some race conditions, so I binned that hacky approach.

A nice side-effect demonstrated in the 2nd commit is that using
DependsOn instead of Defer makes the code (IMHO) clearer and more readable, as there's less nesting, and with the right variable names for job IDs it becomes much more obvious what the relationships are.

There are unfortunately still a few remaining (valid) use cases for
Defer though - specifically because we cannot schedule any jobs for module paths parsed from a module manifest until that manifest is parsed, so those few lines of scheduling code still need to run as part of Defer.

The only downside of
DependsOn compared to Defer is that the job we depend on may be long gone/finished by the time the dependent one is dispatched, which means that we don't have direct access to its error/outcome. However, I plan to address that problem more holistically as part of #1006 - TL;DR we can still run the jobs and return early if we find that the data we were expecting the previous job to provide isn't there.