Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

state: Introduce DependsOn for N-to-1 job dependencies #1021

Merged
merged 5 commits into from
Aug 4, 2022

Conversation

radeksimko
Copy link
Member

@radeksimko radeksimko commented Aug 1, 2022

Background

Prior to this PR, the only way we could chain jobs was via Defer. This had one unfortunate limit, which is that a job can only depend on exactly one other job, i.e. there was just 1-to-1 dependency.

As part of introducing a new job type for parsing provider versions in #1014 it became clear that the new job will need to rely on 2+ other jobs:

  • parsed files
  • decoded metadata (via earlydecoder) - so that we have required_providers
  • parsed module manifest (+ any/all modules decoded) - so that we have required_providers of any submodules

I originally attempted to solve it in a hacky way by introducing WaitForJobs() inside Defer, but that created a few other problems - need to run at least two go routines, so that one can be blocked (waiting) and the other one can be dispatching + I ran into some race conditions, so I binned that hacky approach.

A nice side-effect demonstrated in the 2nd commit is that using DependsOn instead of Defer makes the code (IMHO) clearer and more readable, as there's less nesting and with the right variable names for job IDs it makes it much more obvious what the relationships are.

There is unfortunately still a few remaining (valid) use cases for Defer though - specifically because we cannot schedule any jobs for module paths parsed from a module manifest, until that manifest is parsed - so those few lines of code to do the scheduling still need to run as part of Defer.

The only downside of DependsOn compared to Defer is that the job which we depend on may be long gone/finished by the time the dependent one is dispatched, which means that we don't have direct access to its error/outcome. However I plan to address that problem more holistically as part of #1006 - TL;DR we can still run the jobs and return early if we find out that the data which we were expecting the previous job to provide aren't there.

@radeksimko radeksimko added the enhancement New feature or request label Aug 1, 2022
@radeksimko radeksimko self-assigned this Aug 1, 2022
@radeksimko radeksimko force-pushed the f-jobs-depends-on branch 2 times, most recently from acf7f67 to cb47858 Compare August 1, 2022 19:44
@radeksimko radeksimko added this to the v0.29.0 milestone Aug 1, 2022
@radeksimko radeksimko marked this pull request as ready for review August 1, 2022 19:58
@radeksimko radeksimko requested a review from a team as a code owner August 1, 2022 19:58
Copy link
Member

@dbanck dbanck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the reduced nesting when enqueuing jobs!

I think, I've found two things which need further inspection

internal/state/jobs.go Outdated Show resolved Hide resolved
internal/indexer/walker.go Show resolved Hide resolved
@radeksimko radeksimko requested a review from dbanck August 3, 2022 14:53
Copy link
Member

@dbanck dbanck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for fixing/addressing my concerns!

I'm still a bit unsure about the order/dependencies of some jobs.

internal/indexer/module_calls.go Show resolved Hide resolved
internal/indexer/module_calls.go Outdated Show resolved Hide resolved
internal/indexer/walker.go Show resolved Hide resolved

id, err = idx.jobStore.EnqueueJob(job.Job{
if parseId != "" {
ids, err := idx.collectReferences(mcHandle, refCollectionDeps)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't notice this before: why are we calling collectReferences here, before parsing variables? While in the walker, we're calling it much later?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ordering of scheduling is not significant - the *.tfvars parsing and *.tf references should be independent of each other so it doesn't matter in what order do they run - we just happen to schedule them in different order in the two places, DependsOn and Defer (if any) decide in what order they actually get executed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay! Thanks for the explanation!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could align the ordering of scheduling, but there are also other differences - e.g. that when we parse submodules here, we assume that the modules are not initialised by themselves (given the context) and don't attempt to parse module manifest or obtain schema - both of which would otherwise be two more jobs which reference collection should depend on.

Co-authored-by: Daniel Banck <dbanck@users.noreply.github.com>
@radeksimko radeksimko merged commit 48ab782 into main Aug 4, 2022
@radeksimko radeksimko deleted the f-jobs-depends-on branch August 4, 2022 06:40
@github-actions
Copy link

github-actions bot commented Sep 4, 2022

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 4, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants