Pruneable CRs #312

Merged: 2 commits merged into master on Aug 3, 2018
Conversation

@dturn (Contributor) commented Jul 6, 2018:

Follow-up to #306 to support prunable CRDs.

Thoughts / questions:

  • Should custom_resource_definitions be a class method on CRD?
  • prunable? checks the labels as well as the annotations because the annotation wasn't sticking between deploys
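
For concreteness, a minimal sketch of the dual check mentioned in the second bullet, assuming the parsed resource manifest is available as `@definition` (the annotation and label keys here are illustrative, not necessarily the PR's final names):

```ruby
# Sketch only: treat a resource as prunable if either the annotation or the
# label opts in, since the annotation reportedly wasn't persisting between
# deploys. Key names are assumptions for illustration.
def prunable?
  annotation = @definition.dig("metadata", "annotations", "kubernetes-deploy.shopify.io/prunable")
  label = @definition.dig("metadata", "labels", "kubernetes-deploy.shopify.io/prunable")
  annotation == "true" || label == "true"
end
```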

definition: r_def, statsd_tags: @namespace_tags)

unless KubernetesDeploy.const_defined?(crd.kind)
  KubernetesDeploy.const_set crd.kind, Class.new(KubernetesDeploy::CustomResource)
@dturn (author) commented:

Can we avoid making classes? This could cause problems when used as a library against two different clusters.

Also, this function should be moved to its own class.
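
One way to avoid the constant mutation, sketched purely as an assumption (this is not code from the PR): keep the generated classes in an instance-level registry so nothing global is defined:

```ruby
# Hypothetical alternative to KubernetesDeploy.const_set: cache generated
# CR classes per runner instance instead of defining global constants,
# so two clusters' CRDs of the same kind cannot collide.
def class_for_cr_kind(kind)
  @cr_classes ||= {}
  @cr_classes[kind] ||= Class.new(KubernetesDeploy::CustomResource)
end
```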

name: widgets.api.foobar.com
annotations:
  kubernetes-deploy.shopify.io/metadata:
    prunable: "true"
@dturn (author) commented:

Is this malformed? (Is that why I needed the label?)
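
Likely yes: Kubernetes annotation values must be plain strings (metadata.annotations is a map[string]string), so nesting a prunable: key under the annotation key makes the manifest invalid. If a single metadata annotation were kept, the value would need to be a JSON string, roughly like this (a sketch, not necessarily the PR's final form):

```yaml
metadata:
  name: widgets.api.foobar.com
  annotations:
    kubernetes-deploy.shopify.io/metadata: '{"prunable": true}'
```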

@stefanmb left a review:

This approach looks good. I think not querying the API to obtain all resources is reasonable, though it limits us to expressing deploy dependencies only between explicitly modelled resources (i.e. ones that correspond to classes we defined).

If it were up to me I would keep the DAG representation of the deploy list; I think it's more conceptually correct and a better way to represent the problem than linearly walking through the list. The list may be sufficient for our immediate purposes, but it seems like a gross simplification to only save a bit of code.

@@ -27,6 +27,8 @@
bucket
stateful_set
cron_job
custom_resource_definition
custom_resource
Review comment:

In the initial PR (https://github.com/Shopify/kubernetes-deploy/pull/188/files) this static list was removed in favour of a DAG representation. Do we plan to reintroduce the graph-based representation?

Some advantages of the DAG:

  • Only expresses ordering between dependencies that actually matter (e.g. multiple solutions to topological sort are valid)
  • Can express dependencies between custom resources; a list that considers all CRs to be of equal deploy priority cannot.
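
As a rough illustration of the suggestion (not code from this PR or #188), Ruby's stdlib TSort can derive a valid deploy order from explicit dependency edges:

```ruby
require "tsort"

# Illustrative sketch of the DAG idea: deploy order falls out of a
# topological sort, and only the edges that actually matter are expressed.
class DeployGraph
  include TSort

  # dependencies maps each kind to the kinds it must be deployed after,
  # e.g. { "custom_resource" => ["custom_resource_definition"] }
  def initialize(dependencies)
    @dependencies = dependencies
  end

  def tsort_each_node(&block)
    @dependencies.each_key(&block)
  end

  def tsort_each_child(node, &block)
    (@dependencies[node] || []).each(&block)
  end
end

graph = DeployGraph.new(
  "custom_resource_definition" => [],
  "custom_resource" => ["custom_resource_definition"],
)
graph.tsort # => ["custom_resource_definition", "custom_resource"]
```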

Reply:

Are you thinking of the predeploy sequence? I agree that it would be awesome to use a DAG, but this PR doesn't touch the predeploy list. (now anyway... maybe a past version I didn't look at did?)

end

def crds(sync_mediator)
  sync_mediator.get_all(CustomResourceDefinition.kind).map do |r_def|
Review comment:

Since we are not walking the entire API tree we can only express dependencies between resources that are explicitly modelled in KubernetesDeploy (i.e. there is a class defined for them in this project) or CRDs.

@dturn commented Jul 12, 2018:

> If it were up to me I would keep the DAG representation of the deploy list; I think it's more conceptually correct and a better way to represent the problem than linearly walking through the list. The list may be sufficient for our immediate purposes, but it seems like a gross simplification to only save a bit of code.

The goal of this PR is to be the minimal set of features to make pruning work. I think the DAG is enough code to justify its own PR, if/when we decide to take it on.


@@ -0,0 +1,14 @@
# frozen_string_literal: true
module KubernetesDeploy
  class CustomResource < KubernetesResource
Review comment:

This isn't used, is it?

end

def prunable?
  json_annotation = @definition.dig("metadata", "annotations", "kubernetes-deploy.shopify.io/metadata")
Review comment:

What are your thoughts (@stefanmb too) on one big annotation vs. many smaller ones? Or an in-between option that groups related items (e.g. predeploy/dependencies, success/failure/timeout)?

@dturn (author) replied:

I'd prefer to use smaller ones (e.g. `kubernetes-deploy.shopify.io/prunable` and `kubernetes-deploy.shopify.io/predeploy`) that don't require us to store JSON strings.
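
For contrast, a sketch of both styles under discussion (the JSON-blob annotation key appears in the diff above; the reader methods themselves are illustrative):

```ruby
require "json"

# One big JSON annotation: every reader must parse and validate JSON.
def prunable_from_blob?
  blob = @definition.dig("metadata", "annotations", "kubernetes-deploy.shopify.io/metadata")
  !!blob && JSON.parse(blob)["prunable"] == true
rescue JSON::ParserError
  false
end

# Individual annotations: a plain string comparison, no parsing step.
def prunable_from_flag?
  @definition.dig("metadata", "annotations", "kubernetes-deploy.shopify.io/prunable") == "true"
end
```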

@context = context
@logger = logger
@namespace_tags = namespace_tags
@cr_mapping = {}
Review comment:

this seems unused


def crds(sync_mediator)
  sync_mediator.get_all(CustomResourceDefinition.kind).map do |r_def|
    KubernetesResource.build(namespace: @namespace, context: @context, logger: @logger,
Review comment:

I'd just do CustomResourceDefinition.new directly. build is basically just a class lookup mechanism.
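
A sketch of the suggested simplification, assuming CustomResourceDefinition.new accepts the same keyword arguments that build forwards:

```ruby
# Sketch: the kind is already known here, so skip build's class lookup and
# instantiate directly. Keyword names mirror the excerpt above.
def crds(sync_mediator)
  sync_mediator.get_all(CustomResourceDefinition.kind).map do |r_def|
    CustomResourceDefinition.new(namespace: @namespace, context: @context, logger: @logger,
      definition: r_def, statsd_tags: @namespace_tags)
  end
end
```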

names:
  kind: Mail
  listKind: MailList
  plural: mails
Review comment:

no need for the example to carry over our incorrect plural, lol 😄

@@ -0,0 +1,13 @@
apiVersion: apiextensions.k8s.io/v1beta1
Review comment:

Adding this to hello-cloud (which strictly speaking is correct--I usually want all supported resources to be in this set) makes me want to build a real demo directory like we've been asked to, so that we don't encourage this gem to be used for globals.

class ResourceDiscoveryTest < KubernetesDeploy::IntegrationTest
  def test_non_prunable_crd
    assert_deploy_success(deploy_fixtures("resource-discovery", subset: ["shredders.yml"]))
    assert_deploy_success(deploy_fixtures("resource-discovery", subset: ["shredders_cr.yml"]))
Review comment:

Are these two separate deploys because CRDs aren't in the priority list?

@dturn replied Jul 19, 2018:

No, the CRD has to be added to the cluster before the CR can pass validation. Since we do validation before we deploy even pre-deploy resources, we need two deploys.

Reply:

Ah, that makes sense.

@@ -1006,4 +1007,19 @@ def test_raise_on_yaml_missing_kind
" datapoint: value1"
], in_order: true)
end

def test_crd_can_be_successful
assert_deploy_success(deploy_fixtures("hello-cloud", subset: ["crd.yml"]))
Review comment:

I don't think this test adds anything over the full_hello test you modified above.

@dturn replied Jul 19, 2018:

It doesn't, but this file is actually from the base branch (https://github.com/Shopify/kubernetes-deploy/pull/306/files#diff-43c27b92ee93912981b34f7d8b622714R1011), where it is now a legit test. I'll fix all of this up once that PR merges.

@dturn force-pushed the add-crd branch 3 times, most recently from 182b829 to 04734e9, on July 21, 2018
@dturn changed the base branch from add-crd to master on July 31, 2018
@dturn force-pushed the pruneable-cr branch 3 times, most recently from ac762f4 to 426b480, on July 31, 2018
@dturn commented Jul 31, 2018:

This is ready for 👀 again.

@KnVerey left a review:

Please add README docs for this to the PR :)


def crds(sync_mediator)
  sync_mediator.get_all(CustomResourceDefinition.kind).map do |r_def|
    CustomResourceDefinition.build(namespace: @namespace, context: @context, logger: @logger,
Review comment:

why build instead of new?

@namespace_tags = namespace_tags
end

def crds(sync_mediator)
Review comment:

Did you consider using KubeClient for this? I'm thinking the sync mediator doesn't buy us anything special in this case, and it adds complexity.

@dturn (author) replied:

Are you worried about the sync mediator's cache? I'd assumed we wanted to discourage direct calls to kubeclient.

Reply:

Not really worried about the cache, since we clear that at the beginning of every sync cycle. We definitely want to discourage direct calls to kubeclient from resource classes--they should only be making calls during the sync cycle, and therefore should only need the sync mediator (and can all benefit from its cache if they do). But what we're doing here has nothing to do with the sync cycle and the mediator gives us no benefit. EjsonSecretProvisioner is an example of another piece of independent functionality that also takes care of its own calls.

@dturn replied Aug 1, 2018:

We can use kubeclient directly, but it would just be duplication of fetch_by_kind. This didn't seem like using the wrong abstraction, so I didn't duplicate the code.
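
For reference, the direct-kubeclient route being weighed would look roughly like this (a sketch only: it assumes an apiextensions_client pointed at the apiextensions.k8s.io/v1beta1 group, and relies on kubeclient deriving get_custom_resource_definitions from API discovery):

```ruby
# Hypothetical sketch, not the PR's code: fetch CRDs without the sync
# mediator, duplicating what fetch_by_kind already does.
def crds_via_kubeclient
  apiextensions_client.get_custom_resource_definitions.map do |r_def|
    CustomResourceDefinition.new(namespace: @namespace, context: @context,
      logger: @logger, definition: r_def.to_h, statsd_tags: @namespace_tags)
  end
end
```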

@@ -0,0 +1,12 @@
---
Review comment:

Can we enhance the CRD fixture set you added earlier rather than making another?

kind: Widget
metadata:
  name: my-first-widget
status: "ok"
Review comment:

Why include status? Once it is a proper subresource (1.11+), this will be ignored.

require 'test_helper'

class ResourceDiscoveryTest < KubernetesDeploy::IntegrationTest
  def test_non_prunable_crd
Review comment:

I'd recommend merging these two tests into a single test that:

  • deploys CRDs of both kinds and asserts that all is well
  • deploys the rest of the fixture set (CRs of all kinds + some non-CR) and asserts that results look good
  • deploys the non-CRs and asserts that the correct CRs were pruned/left alone

Technically that could easily cover the existing CRD success test as well. Since these tests are slow and can be flakey (inherently, because they hit a live minikube), I try to keep tests that are 100% overlapped by others to a minimum. I think we should move those CRD tests to this file regardless.
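
A rough sketch of that consolidated test, reusing helpers and fixture names that appear elsewhere in this thread (the widgets fixture filenames and assert_logs_match are assumptions):

```ruby
# Sketch of the merged test: both CRDs, then all CRs plus a non-CR,
# then a prune-triggering deploy. Widget fixture names are hypothetical.
def test_prunable_and_non_prunable_crs
  # Deploy CRDs of both kinds and assert that all is well.
  assert_deploy_success(deploy_fixtures("resource-discovery", subset: %w(shredders.yml widgets.yml)))
  # Deploy the rest of the fixture set: CRs of both kinds plus a non-CR.
  assert_deploy_success(deploy_fixtures("resource-discovery",
    subset: %w(shredders_cr.yml widgets_cr.yml configmap-data.yml)))
  # Redeploy only the non-CR: the prunable CR should be pruned,
  # the non-prunable one left alone.
  assert_deploy_success(deploy_fixtures("resource-discovery", subset: %w(configmap-data.yml)))
  assert_logs_match("The following resources were pruned: widget \"my-first-widget\"")
  refute_logs_match("shredder \"my-first-shredder\"")
end
```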

assert_deploy_success(deploy_fixtures("hello-cloud", subset: ["configmap-data.yml"]))

refute_logs_match("The following resources were pruned: widget \"my-first-shredder\"")
end
Review comment:

don't forget to clean up the CRDs. I'm thinking we should make test teardown do this generically by listing and deleting CRDs. That could just be in this suite if we move the other CRD tests over.
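
A sketch of that generic teardown, assuming the suite exposes a kubectl helper (the helper name and its flags here are assumptions):

```ruby
# Hypothetical teardown: delete every CRD so later tests start from a
# clean cluster. Deleting a CRD removes its CRs along with it.
def teardown
  kubectl.run("delete", "customresourcedefinition", "--all")
  super
end
```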

assert_deploy_success(deploy_fixtures("resource-discovery", subset: ["shredders.yml"]))
assert_deploy_success(deploy_fixtures("resource-discovery", subset: ["shredders_cr.yml"]))
# Deploy any other non-priority (predeployable) resource to trigger pruning
assert_deploy_success(deploy_fixtures("hello-cloud", subset: ["configmap-data.yml"]))
Review comment:

let's add a configmap (or whatever) to the fixture set we're using instead of deploying an unrelated one. The fixture sets are intended to represent apps, so deploying a different one to the same ns is semantically odd.

metadata:
  name: widgets.api.foobar.com
  annotations:
    kubernetes-deploy.shopify.io/prunable: "true"
Review comment:

Should we have a third one that has the annotation, but set to "false"?

@dturn (author) replied:

I think it's overkill; we already test the case where it's not pruned.

@KnVerey left a review:

🚢 after one last test fix

# Deploy any other non-priority (predeployable) resource to trigger pruning
assert_deploy_success(deploy_fixtures("crd", subset: %w(configmap-data.yml configmap2.yml)))

refute_logs_match("The following resources were pruned: widget \"my-first-mail\"")
Review comment:

wouldn't it say mail "my-first-mail" if it had been? This brings up that we should add an assertion that actually confirms the thing is still there by looking at the namespace.
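
Something along these lines could back the log check with a direct lookup (a sketch; the kubectl helper and its return shape are assumptions):

```ruby
# Hypothetical assertion: confirm the CR survived the prune by querying
# the namespace instead of trusting log output alone.
_out, _err, status = kubectl.run("get", "mail", "my-first-mail")
assert(status.success?, "expected mail/my-first-mail to still exist after pruning")
```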

@KnVerey commented Aug 2, 2018:

In case you missed it on my previous review, we'll need docs for this feature. That can be a follow-up if you prefer.

Update readme and changelog
@dturn merged commit 7e19140 into master on Aug 3, 2018
@dturn deleted the pruneable-cr branch on August 3, 2018 at 19:02
@dturn mentioned this pull request on Aug 8, 2018