Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batch dry-run apply validation #781

Merged
merged 21 commits into from
Jan 20, 2021
Merged

Conversation

timothysmith0609
Copy link
Contributor

@timothysmith0609 timothysmith0609 commented Jan 13, 2021

What are you trying to accomplish with this PR?

Perf testing 1.18 highlighted the fact that running kubectl per-resource (as we do when running dry-run for resource validation in Phase 1) is markedly slower than simply batching all resources and dry-run applying en masse. To that end, this PR tweaks the resource validation of DeployTask by running kubectl validation on all resources at once.

How is this accomplished?

Add a dry_run flag to ResourceDeployer#apply_all to hook into available logic and leverage it in DeployTask (this PR is only limited to that task). In the event the batch apply fails, we fall back to previous behaviour of checking resource-by-resource. I tried using only the batch apply, and while it does do a good job of suppressing sensitive information (except in a small number of test cases), it is too much of a yak-shave to fix all the tests for the differences in log-matching used by the tests. If, over time, we want to move forward to exclusive batch applying, we can do a dedicated PR.

What could go wrong

  • The only real danger would be batch apply is successful but there's an error that the resource-by-resource apply would catch; this doesn't seem possible, though. In the case of failure, we simply fallback to current behaviour so there's no risk there, either.

  • Would appreciate some feedback on the tests I had to write, since they are serial (by virtue of containing expectations). This also led me to think that we should separate out unit/serial/parallel tests into separate (parallel) buildkite steps.

@timothysmith0609 timothysmith0609 marked this pull request as ready for review January 14, 2021 17:10
@timothysmith0609 timothysmith0609 requested a review from a team as a code owner January 14, 2021 17:10
Copy link
Contributor

@karanthukral karanthukral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approach looks reasonable to me :)

Copy link
Contributor

@dturn dturn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this is a lot less code than I expected, a few questions / style thoughts added

lib/krane/deploy_task.rb Outdated Show resolved Hide resolved
lib/krane/deploy_task.rb Outdated Show resolved Hide resolved
@@ -128,17 +132,15 @@ def deploy_resources(resources, prune: false, verify:, record_summary: true)
end
end

def apply_all(resources, prune)
def apply_all(resources, prune, dry_run: false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens for resources that use create ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are simply included in the apply_all call. From what I observed, it works as expected, and the actual dry-run=false invocation will use the appropriate deploy method. This seemed like the best way to capture all resources in batch validation. One thing this won't catch is resources that exclusively use :create (without, e.g. a fallback to :replace), since that will pass validation but fail on deploy if the resource already exists in cluster. IMO that's a bug in their config, not Krane, though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How hard is it to special case this e.g. do the batch and then dry run the create only resources too? This seems like a nice to catch (though we'd only catch that today for secrets).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #787 to keep track of this refinement.

lib/krane/deploy_task.rb Outdated Show resolved Hide resolved
lib/krane/resource_deployer.rb Show resolved Hide resolved
Copy link
Contributor

@dturn dturn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 style thing and one question, everything else lgtm

lib/krane/deploy_task.rb Outdated Show resolved Hide resolved
output_is_sensitive = resources.any?(&:sensitive_template_content?)
global_mode = resources.all?(&:global?)
out, err, st = kubectl.run(*command, log_failure: false, output_is_sensitive: output_is_sensitive,
attempts: 2, use_namespace: !global_mode)

tags = statsd_tags + (dry_run ? ['dry_run:true'] : ['dry_run:false'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're going to need to remember to update dashboards

@@ -128,17 +132,15 @@ def deploy_resources(resources, prune: false, verify:, record_summary: true)
end
end

def apply_all(resources, prune)
def apply_all(resources, prune, dry_run: false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How hard is it to special case this e.g. do the batch and then dry run the create only resources too? This seems like a nice to catch (though we'd only catch that today for secrets).

@timothysmith0609 timothysmith0609 merged commit 4e9bbb6 into master Jan 20, 2021
@timothysmith0609 timothysmith0609 deleted the batch_apply_validations branch January 20, 2021 15:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants