MTL-1819 Resolve Failed Wait Condition #40
Merged
Summary and Scope
Issue Type
Currently, netboots using `metal.no-wipe=1` may or may not succeed. This PR fixes a race condition where our k8s-master and k8s-worker modules never exit while waiting for the wipe function to end.

`metal-md-disks.sh` had an oversight: when `metal.no-wipe=1` was set, the `/tmp/metalpave.done` file was never created. This prevented the `metal_paved` function from correctly reporting whether the pave had actually finished or had been skipped. As a result, the dependent modules running on k8s-masters and k8s-workers never exited properly when `metal.no-wipe=1` was set on netboots.

Lastly, there are a few minor tweaks in this commit:
- Uses `-gt` instead of `-ge`.
- Adds a `metal.wipe-delay` kernel argument for changing the delay time from 5 seconds to anything between 2 and 60 seconds.
- Moves `/tmp/metalpave.done` into a global variable set in `metal-lib.sh`, making it less prone to typos.

Prerequisites
Idempotency
Risks and Mitigations
This removes the risk of dependent dracut modules running indefinitely and stalling the boot.
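
The behavior described in this PR can be illustrated with a minimal shell sketch. It is not the code from `metal-md-disks.sh` or `metal-lib.sh`: the variable name `METAL_DONE_FILE`, the `pave` and `wipe_delay` helpers, the inclusive 2 to 60 bounds, and the fallback to 5 seconds for out-of-range values are all assumptions made for illustration. Only the marker path `/tmp/metalpave.done`, the `metal_paved` name, and the `metal.no-wipe` and `metal.wipe-delay` arguments come from the description above.

```shell
#!/bin/sh
# Illustrative sketch only; helper and variable names are hypothetical.

# Assumed name for the global that holds the marker path. The PR states the
# path /tmp/metalpave.done is now defined once in metal-lib.sh.
METAL_DONE_FILE="${METAL_DONE_FILE:-/tmp/metalpave.done}"

# Hypothetical pave entry point. $1 stands in for the metal.no-wipe value.
pave() {
    no_wipe="${1:-0}"
    if [ "$no_wipe" -eq 1 ]; then
        # The fix: create the marker even when the wipe is skipped,
        # so dependent modules stop waiting.
        : > "$METAL_DONE_FILE"
        return 0
    fi
    # ... the actual wipe would run here ...
    : > "$METAL_DONE_FILE"
}

# Succeeds once the marker exists, whether the pave ran or was skipped.
metal_paved() {
    [ -f "$METAL_DONE_FILE" ]
}

# Hypothetical validation for metal.wipe-delay. Assumes inclusive bounds and
# that non-numeric or out-of-range values fall back to the 5 second default.
wipe_delay() {
    delay="${1:-5}"
    case "$delay" in
        *[!0-9]*) delay=5 ;;
    esac
    if [ "$delay" -lt 2 ] || [ "$delay" -gt 60 ]; then
        delay=5
    fi
    echo "$delay"
}
```

A dependent module could then poll `metal_paved` in a loop and exit as soon as it succeeds, regardless of whether `metal.no-wipe=1` was set.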