
Prevent new host overloading #1822

Merged: 29 commits into master, Aug 16, 2018

Conversation

@pschoenfelder (Contributor):

No description provided.

@ssalinas (Member) left a comment:

I think the overall strategy of refreshing metrics during offer scoring is good. We definitely want to make sure we only do it when the metrics are too old (or maybe also after some number of additional tasks or percent of resources has been allocated since the last collection?), and that we determine what kind of impact this has on scheduling speed.
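For example, the staleness gate could be a small helper along these lines (a hedged sketch: getTimestamp() on the usage object appears later in this PR, while getMaxUsageAgeMillis is a hypothetical config accessor):

    // Hypothetical staleness check gating the metrics refresh
    private boolean isStale(SingularitySlaveUsage usage) {
      long ageMillis = System.currentTimeMillis() - usage.getTimestamp();
      return ageMillis > configuration.getMaxUsageAgeMillis(); // assumed config value
    }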

  utilizationPerRequestId.values().forEach(usageManager::saveRequestUtilization);

  if (configuration.isShuffleTasksForOverloadedSlaves()) {
    shuffleTasksOnOverloadedHosts(overLoadedHosts);
  }
}

public CompletableFuture<Void> getSlaveUsage(SingularitySlave slave) {
  return usageCollectionSemaphore.call(() ->
@ssalinas (Member):
For the individual method, I'm not sure we want to make it completely async like the larger ones. This method will likely not be called from the same context as the poller itself, so it should probably fall under a different semaphore (e.g. the offer scoring one) if we want it to be async.
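A rough sketch of that suggestion, assuming a hypothetical offerScoringSemaphore field analogous to the usageCollectionSemaphore above (usageExecutor and collectSlaveUsage are likewise names assumed from this discussion, not copied from the diff):

    public CompletableFuture<Void> getSlaveUsage(SingularitySlave slave) {
      // Offer-context callers go through the offer scoring semaphore instead
      return offerScoringSemaphore.call(() ->
          CompletableFuture.runAsync(() -> collectSlaveUsage(slave), usageExecutor));
    }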

        && t.getMesosTask().getSlaveId().getValue().equals(offerHolder.getSlaveId()))) {
  Optional<SingularitySlave> maybeSlave = slaveManager.getSlave(offerHolder.getSlaveId());
  if (maybeSlave.isPresent()) {
    usagePoller.getSlaveUsage(maybeSlave.get());
@ssalinas (Member):
We will probably want to put this in currentSlaveUsagesBySlaveId after it's calculated; calling this alone won't update the underlying values we pass to the scoring functions.
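Roughly, as a sketch (the map name comes from this comment; getId and the future's value type are assumptions):

    usagePoller.getSlaveUsage(maybeSlave.get())
        .thenAccept(newUsage -> currentSlaveUsagesBySlaveId.put(maybeSlave.get().getId(), newUsage));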

@ssalinas (Member):
Another note: this would be a good thing to stick inside the completable future below. It's a good candidate to make async, since we are now IO bound on an API call and CPU bound on the calculations.

@ssalinas (Member) left a comment:
Added a few comments on the async bits. The main point I'm not sure about is whether we are kicking off metric collection for use in a future offer scoring run, or gathering metrics synchronously for use in the current run. I could see the first being faster for scoring, but we need to be careful that we aren't kicking off a bunch of collections in a row for the same slave (i.e. because the first call hasn't finished yet). The second is more reliable in terms of making sure we have the most up-to-date metrics, but is slower, since we have to wait for them.
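One hedged sketch of guarding the fire-and-forget variant against duplicate kickoffs for the same slave (all names here are hypothetical; only CompletableFuture and ConcurrentHashMap are real APIs):

    private final ConcurrentHashMap<String, CompletableFuture<Void>> inFlightBySlaveId = new ConcurrentHashMap<>();

    CompletableFuture<Void> refreshSlaveUsage(SingularitySlave slave) {
      // computeIfAbsent keeps at most one collection in flight per slave id
      return inFlightBySlaveId.computeIfAbsent(slave.getId(), id ->
          CompletableFuture
              .runAsync(() -> collectSlaveUsage(slave), usageExecutor)
              .whenComplete((ignored, t) -> inFlightBySlaveId.remove(id)));
    }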

@@ -240,6 +238,59 @@ public SingularityMesosOfferScheduler(MesosConfiguration mesosConfiguration,
    return offerHolders.values();
  }

  private Void buildScoringFuture(
@ssalinas (Member):
Nit on naming: this method isn't actually building the future; it's a synchronous method that does the actual scoring. I'd either rename it to calculateScore (or something like that), or move the supplyAsync inside this method and have it return the actual CompletableFuture<Void>.
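The second option might look roughly like this (a sketch; scoreOfferHolder and offerScoringExecutor are stand-in names, not the PR's):

    private CompletableFuture<Void> buildScoringFuture(SingularityOfferHolder offerHolder) {
      // The async boundary now lives inside the method, so the name matches the behavior
      return CompletableFuture.runAsync(() -> scoreOfferHolder(offerHolder), offerScoringExecutor);
    }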

      }
    });
  }
  return null;
@ssalinas (Member):
So, I want to mention the choice here; we could go either way on this. Currently it looks like the newly updated slave metrics will not be taken into account, because we are returning here. Wouldn't we want to continue on to the scoring, since we've gathered new metrics and put them in the map that is fed to calculateScore?

@pschoenfelder (Contributor, Author):
Yes, I'll go with the latter option. For some reason I was thinking only the slave usage was necessary, but it's probably safer to update the score too.

@@ -121,6 +129,10 @@ SingularitySlaveUsage getSlaveUsage() {
    return diskInUseScore;
  }

  long getTimestamp() {
@ssalinas (Member):
You should be able to just do getSlaveUsage().getTimestamp() instead of having to store it in two places.
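i.e., a delegating getter rather than a second stored field:

    long getTimestamp() {
      return getSlaveUsage().getTimestamp(); // delegate instead of duplicating state
    }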

  utilizationPerRequestId.values().forEach(usageManager::saveRequestUtilization);

  if (configuration.isShuffleTasksForOverloadedSlaves()) {
    shuffleTasksOnOverloadedHosts(overLoadedHosts);
  }
}

public CompletableFuture<SingularitySlaveUsage> getSlaveUsage(SingularitySlave slave) {
@ssalinas (Member):
The only thing I find to be a code smell here is that the offer scoring flow will now rely on the usage semaphore and executor having enough permits/threads. Since within offer scoring we are already in a block that is executed async, it may be worth calling collectSlaveUsage directly when used from the offer context, to avoid an additional layer of async work.

} catch (Throwable t) {
  String message = String.format("Could not get slave usage for host %s", slave.getHost());
  LOG.error(message, t);
  exceptionNotifier.notify(message, t);
}
return null; // TODO: is this really okay?
@ssalinas (Member):
I don't think this method is called anywhere else that expects a return value. You could always wrap it in an Optional to make it more explicit that the result might not be there.
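A minimal sketch of the Optional variant, reusing the catch block above (usageExecutor is an assumed executor field, and collectSlaveUsage is the method named elsewhere in this review):

    public CompletableFuture<Optional<SingularitySlaveUsage>> getSlaveUsage(SingularitySlave slave) {
      return usageCollectionSemaphore.call(() ->
          CompletableFuture.supplyAsync(() -> {
            try {
              return Optional.of(collectSlaveUsage(slave));
            } catch (Throwable t) {
              String message = String.format("Could not get slave usage for host %s", slave.getHost());
              LOG.error(message, t);
              exceptionNotifier.notify(message, t);
              return Optional.<SingularitySlaveUsage>empty();
            }
          }, usageExecutor));
    }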

usage.get().getTimestamp()
));
} else {
throw new RuntimeException(throwable);
@ssalinas (Member):
Where is the handling for this runtime exception? We currently aren't calling a get or join on the future created here, which causes two issues for us (see the sketch below):

  • calculateScore below may end up being called before the metrics recollection has had a chance to run
  • If a RuntimeException is thrown here, it is lost to us, since it will not propagate out of the future and no catch block currently logs it
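A minimal, self-contained sketch of both failure modes and one way to surface them (none of this is the PR's actual code; it only uses the standard library):

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.CompletionException;

    public class LostExceptionSketch {
      public static void main(String[] args) {
        // Lost: no join/get and no handler, so this exception silently vanishes
        CompletableFuture.runAsync(() -> { throw new RuntimeException("recollection failed"); });

        CompletableFuture<Void> future = CompletableFuture
            .runAsync(() -> { throw new RuntimeException("recollection failed"); })
            .whenComplete((ignored, t) -> {
              if (t != null) {
                System.err.println("Usage recollection failed: " + t); // surfaced
              }
            });
        try {
          future.join(); // blocking here guarantees scoring runs only after recollection
        } catch (CompletionException e) {
          // join rethrows the wrapped failure; handle or log as appropriate
        }
      }
    }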

new ConcurrentHashMap<>(),
usageManager.getRequestUtilizations(),
new ConcurrentHashMap<>(),
new AtomicLong(),
@ssalinas (Member):
Given that these arguments will be the same each time, does it make sense to create an overloaded method in UsagePoller to handle those bits instead?
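Something like this hypothetical overload, with the always-identical arguments supplied in one place (parameter shapes are guesses based on the visible call site, not the actual signature):

    public CompletableFuture<SingularitySlaveUsage> getSlaveUsage(SingularitySlave slave) {
      return getSlaveUsage(
          slave,
          new ConcurrentHashMap<>(),
          usageManager.getRequestUtilizations(),
          new ConcurrentHashMap<>(),
          new AtomicLong());
    }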

@@ -180,7 +191,8 @@ public SingularityMesosOfferScheduler(MesosConfiguration mesosConfiguration,
  mesosConfiguration.getScoreUsingSystemLoad(),
  getMaxProbableUsageForSlave(activeTaskIds, requestUtilizations, offerHolders.get(usageWithId.getSlaveId()).getSanitizedHost()),
  mesosConfiguration.getLoad5OverloadedThreshold(),
- mesosConfiguration.getLoad1OverloadedThreshold()
+ mesosConfiguration.getLoad1OverloadedThreshold(),
+ usageWithId.getTimestamp()
@ssalinas (Member):
Second thoughts about the setup: would it make sense to instead collect the additional usages in this block? I'm realizing that the loop below will be called for each pending task. If we hit a case where collecting a particular slave's usage is throwing exceptions or timing out, we will continue to recheck it for every pending task. Whereas, if we check in this block instead, we can just omit that slave up front and leave the block below as it was previously.

If we move the usage collection here, we'll likely want to convert this from a parallelStream to a list of CompletableFutures, like below, to have better control over the concurrency.
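The parallelStream-to-futures conversion could look roughly like this (a sketch under assumed names; slavesToCheck, collectSlaveUsage, and usageExecutor are not from the diff):

    List<CompletableFuture<Void>> usageFutures = slavesToCheck.stream()
        .map(slave -> CompletableFuture.runAsync(() -> collectSlaveUsage(slave), usageExecutor))
        .collect(Collectors.toList());
    // Concurrency is now bounded by usageExecutor's pool rather than the common ForkJoinPool
    CompletableFuture.allOf(usageFutures.toArray(new CompletableFuture[0])).join();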

@ssalinas changed the title from "WIP: Prevent new host overloading" to "Prevent new host overloading" on Aug 6, 2018
@ssalinas (Member) commented on Aug 6, 2018:

🚢

@pschoenfelder (Contributor, Author) commented:

🚢

if (slaveMetricsSnapshot != null) {
  memoryMbReservedOnSlave = (long) slaveMetricsSnapshot.getSlaveMemUsed();
  cpuReservedOnSlave = slaveMetricsSnapshot.getSlaveCpusUsed();
  diskMbReservedOnSlave = (long) slaveMetricsSnapshot.getSlaveDiskUsed();
@pschoenfelder (Contributor, Author):
Are these actually mapped correctly? I know we said /slave/* maps to "reserved", but it also says "used".

@pschoenfelder (Contributor, Author):
Also, we're casting doubles to longs so we don't have to change the usage POJOs? Seems smelly to me.

@ssalinas (Member):
Open to using doubles everywhere. I can't remember at which level the longs were required.

}

SingularityTaskUsage latestUsage = getUsage(taskUsage);
memoryBytesUsedOnSlave += latestUsage.getMemoryTotalBytes();
@ssalinas (Member):
Will these end up being any different than systemMemTotalBytes - systemMemFreeBytes? I'm wondering if there isn't actually a need to total everything up for these cases. getUsage is still a ZK call, and it'd be nice to eliminate it if we can. The totaled-up per-task values and the slave-reported totals seem fairly similar. I'm not certain why we have both in the POJO, TBH; I'd have to look through the commit history. (Or maybe @darcatron knows, since he wrote the original versions of usage collection?)

SingularitySlaveUsage slaveUsage = new SingularitySlaveUsage(
    cpuReservedOnSlave, cpuReservedOnSlave, cpusTotal,
    memoryMbReservedOnSlave, memoryMbReservedOnSlave, memoryMbTotal,
    diskMbReservedOnSlave, diskMbReservedOnSlave, diskMbTotal,
@pschoenfelder (Contributor, Author):
It seems pretty funky to pass in the same variables twice, but that's what reducing everything to use the metric snapshot resulted in (see the plain collectSlaveUsage below to compare).

@ssalinas (Member):
I guess the 'clean' way to do it would be to have a separate POJO with a more minimal set of fields, which gets used in scheduling. The fuller class, which would extend that one, has all of the fields and is used in the poller + API.

e.g. SingularitySlaveUsage extends SingularitySimpleSlaveUsage {}
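As a rough illustration of that shape (field names are assumptions modeled on the constructor arguments above, not the real classes):

    // Minimal usage view consumed by offer scoring (hypothetical field set)
    class SingularitySimpleSlaveUsage {
      private final double cpusUsed;
      private final double memoryMbUsed;
      private final double diskMbUsed;

      SingularitySimpleSlaveUsage(double cpusUsed, double memoryMbUsed, double diskMbUsed) {
        this.cpusUsed = cpusUsed;
        this.memoryMbUsed = memoryMbUsed;
        this.diskMbUsed = diskMbUsed;
      }

      double getCpusUsed() { return cpusUsed; }
      double getMemoryMbUsed() { return memoryMbUsed; }
      double getDiskMbUsed() { return diskMbUsed; }
    }

    // Fuller class used by the poller + API, extending the minimal view
    class SingularitySlaveUsage extends SingularitySimpleSlaveUsage {
      private final long timestamp; // plus the remaining existing fields

      SingularitySlaveUsage(double cpusUsed, double memoryMbUsed, double diskMbUsed, long timestamp) {
        super(cpusUsed, memoryMbUsed, diskMbUsed);
        this.timestamp = timestamp;
      }

      long getTimestamp() { return timestamp; }
    }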

@ssalinas added this to the 0.21.0 milestone on Aug 9, 2018
@baconmania (Contributor) commented:

🚢

@ssalinas merged commit 29c7199 into master on Aug 16, 2018
@ssalinas deleted the new-host-overloading branch on August 16, 2018 at 12:23