-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Batch dry-run apply validation #781
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
approach looks reasonable to me :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this is a lot less code than I expected, a few questions / style thoughts added
@@ -128,17 +132,15 @@ def deploy_resources(resources, prune: false, verify:, record_summary: true) | |||
end | |||
end | |||
|
|||
def apply_all(resources, prune) | |||
def apply_all(resources, prune, dry_run: false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens for resources that use create
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are simply included in the apply_all
call. From what I observed, it works as expected, and the actual dry-run=false
invocation will use the appropriate deploy method. This seemed like the best way to capture all resources in batch validation. One thing this won't catch is resources that exclusively use :create
(without, e.g. a fallback to :replace
), since that will pass validation but fail on deploy if the resource already exists in cluster. IMO that's a bug in their config, not Krane, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How hard is it to special case this e.g. do the batch and then dry run the create only resources too? This seems like a nice to catch (though we'd only catch that today for secrets).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened #787 to keep track of this refinement.
8255286
to
ec9ae57
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 style thing and one question, everything else lgtm
output_is_sensitive = resources.any?(&:sensitive_template_content?) | ||
global_mode = resources.all?(&:global?) | ||
out, err, st = kubectl.run(*command, log_failure: false, output_is_sensitive: output_is_sensitive, | ||
attempts: 2, use_namespace: !global_mode) | ||
|
||
tags = statsd_tags + (dry_run ? ['dry_run:true'] : ['dry_run:false']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're going to need to remember to update dashboards
@@ -128,17 +132,15 @@ def deploy_resources(resources, prune: false, verify:, record_summary: true) | |||
end | |||
end | |||
|
|||
def apply_all(resources, prune) | |||
def apply_all(resources, prune, dry_run: false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How hard is it to special case this e.g. do the batch and then dry run the create only resources too? This seems like a nice to catch (though we'd only catch that today for secrets).
What are you trying to accomplish with this PR?
Perf testing 1.18 highlighted the fact that running
kubectl
per-resource (as we do when running dry-run for resource validation in Phase 1) is markedly slower than simply batching all resources and dry-run applying en masse. To that end, this PR tweaks the resource validation ofDeployTask
by running kubectl validation on all resources at once.How is this accomplished?
Add a
dry_run
flag toResourceDeployer#apply_all
to hook into available logic and leverage it inDeployTask
(this PR is only limited to that task). In the event the batch apply fails, we fall back to previous behaviour of checking resource-by-resource. I tried using only the batch apply, and while it does do a good job of suppressing sensitive information (except in a small number of test cases), it is too much of a yak-shave to fix all the tests for the differences in log-matching used by the tests. If, over time, we want to move forward to exclusive batch applying, we can do a dedicated PR.What could go wrong
The only real danger would be batch apply is successful but there's an error that the resource-by-resource apply would catch; this doesn't seem possible, though. In the case of failure, we simply fallback to current behaviour so there's no risk there, either.
Would appreciate some feedback on the tests I had to write, since they are serial (by virtue of containing expectations). This also led me to think that we should separate out unit/serial/parallel tests into separate (parallel) buildkite steps.