Forbid health checks longer than kill time #1498

PtrTeixeira · 2017-04-10T19:21:37Z

Singularity automatically kills tasks that remain nonresponsive for a
certain amount of time after startup
(`killAfterTasksDoNotRunDefaultSeconds`, set in the config file). If a
task is configured so that it won't start accepting health checks until
after that interval has passed, Singularity kills it without ever
sending a health check. With this change, Singularity rejects such a
task configuration ahead of time.
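
For illustration, the check described above might look roughly like the following. This is a minimal sketch, not the actual `SingularityValidator` code: the class name, the `validateStartupDelay` method, the 600-second kill window, and the use of `IllegalArgumentException` in place of Singularity's own bad-request handling are all assumptions made so the example is self-contained.

```java
import java.util.Optional;

public class HealthcheckDelayCheck {
  // Hypothetical stand-in for the validator's checkBadRequest helper.
  // This sketch throws IllegalArgumentException; the real helper may differ.
  static void checkBadRequest(boolean condition, String message) {
    if (!condition) {
      throw new IllegalArgumentException(message);
    }
  }

  // Rejects a startup delay that is not strictly shorter than the kill window.
  // Both parameters are illustrative; the real values come from the deploy
  // and from the Singularity configuration.
  static void validateStartupDelay(Optional<Integer> startupDelaySeconds, long killAfterSeconds) {
    if (!startupDelaySeconds.isPresent()) {
      return; // no startup delay configured, nothing to validate
    }
    int startUpDelay = startupDelaySeconds.get();
    checkBadRequest(startUpDelay < killAfterSeconds,
        String.format("Health check startup delay time must be less than %s seconds (was %s seconds)",
            killAfterSeconds, startUpDelay));
  }

  public static void main(String[] args) {
    // Assumed kill window of 600 seconds, chosen only for the example.
    validateStartupDelay(Optional.of(30), 600); // accepted, no exception
    try {
      validateStartupDelay(Optional.of(900), 600); // delay outlasts the kill window
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The comparison is strict, so a delay equal to the kill window is also rejected, which matches the `startUpDelay < defaultKillAfterNotHealthySeconds` condition in the diff.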

ssalinas · 2017-04-10T19:57:43Z

SingularityService/src/main/java/com/hubspot/singularity/data/SingularityValidator.java

+      int startUpDelay = deploy.getHealthcheck().get().getStartupDelaySeconds().get();
+
+      checkBadRequest(startUpDelay < defaultKillAfterNotHealthySeconds,
+          String.format("Health check startup delay time must be less than kill after wait time %s (was %s)", defaultKillAfterNotHealthySeconds, startUpDelay));


It's probably not important for the user to know that we call it the 'kill after wait time'. Maybe just: `Health check startup delay time must be less than {time} seconds (was {time} seconds)`

Other than that LGTM

There were two fields on the new deploy form that were labeled "HC startup delay." This renames one of them to "HC startup timeout." They both functioned correctly and pointed to the correct fields in the deploy JSON; the label on one of them was just mixed up.

The error message was previously giving perhaps too much information to the user (i.e., that the upper limit came from how long we would wait before killing a task that didn't appear to be starting). Now it just reflects the maximum amount of time that you are allowed to put down.

matush-v · 2017-04-12T14:39:16Z

SingularityService/src/test/java/com/hubspot/singularity/data/ValidatorTest.java

+
+    WebApplicationException exn = (WebApplicationException) catchThrowable(() -> validator.checkDeploy(request, deploy));
+    assertThat((String) exn.getResponse().getEntity())
+        .contains("Health check startup delay");


whoa, this catch is very cool
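
For readers unfamiliar with the pattern in that test: AssertJ's `catchThrowable` runs a lambda and returns whatever it throws (or `null` if nothing is thrown), so the test can then assert on the exception like any other value. The sketch below imitates that behavior with a hand-rolled helper so it runs without AssertJ on the classpath; the class name and the exception message are assumptions for illustration only.

```java
public class CatchThrowableSketch {
  // Mirrors the shape of AssertJ's ThrowingCallable: a block that may throw.
  @FunctionalInterface
  interface ThrowingCallable {
    void call() throws Throwable;
  }

  // Hand-rolled stand-in for AssertJ's catchThrowable:
  // returns the thrown Throwable, or null if the block completes normally.
  static Throwable catchThrowable(ThrowingCallable callable) {
    try {
      callable.call();
    } catch (Throwable t) {
      return t;
    }
    return null;
  }

  public static void main(String[] args) {
    // Illustrative failure; in the PR's test the lambda calls validator.checkDeploy.
    Throwable thrown = catchThrowable(() -> {
      throw new IllegalStateException("Health check startup delay too long");
    });
    // The captured exception can now be inspected instead of using try/catch in the test.
    System.out.println(thrown.getMessage().contains("Health check startup delay"));
  }
}
```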

ssalinas reviewed Apr 10, 2017

PtrTeixeira added 2 commits April 11, 2017 09:00

Modify error message

4936769

matush-v reviewed Apr 12, 2017

PtrTeixeira added the hs_staging label Apr 14, 2017

ssalinas modified the milestone: 0.15.0 Apr 19, 2017

PtrTeixeira added hs_qa labels Apr 21, 2017

ssalinas merged commit d9515d8 into master Apr 26, 2017

ssalinas deleted the forbid-impossible-healthcheck-delay branch April 26, 2017 13:08

ssalinas removed hs_qa labels Apr 26, 2017
