Slightly Smarter Task Shuffling #2057
Conversation
Just a few smaller comments. Let's get some unit tests in there to prove that the prioritization is working as intended then we can test in staging 👍
System.currentTimeMillis(),
task.getTaskId(),
message,
Optional.of(UUID.randomUUID().toString()),
Small nit, it would be good for the actionId here and in the SingularityPendingRequest to match. For this particular case it doesn't have any lasting effect on the actions themselves, but makes it easier to trace through the data for debugging if they are equal
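For illustration, a minimal sketch of that idea, assuming the actionId is simply minted once and passed to both places (the class and method names below are placeholders, not Singularity's actual API):

```java
import java.util.Optional;
import java.util.UUID;

// Hypothetical helper: mint the actionId once so the task cleanup and the
// SingularityPendingRequest share the same value and can be correlated later.
class ActionIdExample {
  static Optional<String> newSharedActionId() {
    String actionId = UUID.randomUUID().toString();
    // In the real change, this same actionId would replace the inline
    // UUID.randomUUID() above and also be passed to the SingularityPendingRequest.
    return Optional.of(actionId);
  }
}
```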
double memoryUtilization = task.getUsage().getMemoryTotalBytes() / task.getRequestedResources().getMemoryMb();
double cpuUtilization = task.getUsage().getCpusUsed() / task.getRequestedResources().getCpus();

return (memoryUtilization + cpuUtilization) / 2;
Still thinking more on the metric for this. You were correct to point out that, by shuffling more/smaller tasks, we may very well hit the shuffle limits more often. I'm not certain it will happen often enough to be of consequence given the size of our cluster, but there are two things I could see us doing to compensate for it:
- Possibly make the shuffle limits a percentage of the overall task count (i.e. so we don't have to worry about them becoming more restrictive as the cluster grows larger)
- Possibly make a slightly more complex scoring mechanism. I could see a case where there is a cpu threshold (something simple like a 50-60% cutoff) for a memory shuffle. If tasks are below that, they are given a higher priority for high memory usage (the inverse of now); if they are above it, they are scored as they are now. This could serve to put a few higher-memory, but likely less busy, tasks at the top of the queue, resulting in fewer overall shuffles; see the sketch below.
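As a rough illustration of that second idea (the 50% cutoff and the class/method names below are assumptions for the sketch, not anything in this PR):

```java
// Sketch of a threshold-based score for memory shuffles, as described above.
class MemoryShuffleScoreSketch {
  private static final double CPU_UTILIZATION_CUTOFF = 0.5; // assumed cutoff

  static double score(double memoryUtilization, double cpuUtilization) {
    if (cpuUtilization < CPU_UTILIZATION_CUTOFF) {
      // Likely a less busy task: rank primarily by memory usage so a few
      // large-but-idle tasks rise to the top of the shuffle queue.
      return memoryUtilization;
    }
    // Busy task: fall back to the current combined score.
    return (memoryUtilization + cpuUtilization) / 2;
  }
}
```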
Unit tests are on the right track. I left some comments pointing at shortcuts we have already written for some of these pieces so they don't have to be duplicated here.
}
}

protected void scheduleTask(String rqid, double requiredCpus, double requiredMemoryMb) {
We have a few shortcuts available already that should make the stuff in this method easier/shorter, though I'm impressed at all of the pieces you uncovered here :)
- saveRequest will take care of the requestManager.activate step you have below.
- initAndFinishDeployWithResources will do most of the first section here.
- launchTask can take the request/deploy, instance, and a state you want to get it to, as well as possible hostname args, and could be a possible shortcut for getting these tasks to an active state. A normal flow for a task is similar to what you had: a pending request is created, drainPendingQueue turns that into a pending task, and the pending task is then launched with resource offers.
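A rough sketch of how a helper-based version of scheduleTask might read; the parameter lists below are guessed from this comment, not taken from the actual test base class:

```java
// Hypothetical rewrite using the existing test helpers mentioned above.
// All signatures here are assumptions based on the review comment.
protected void scheduleTask(String rqid, double requiredCpus, double requiredMemoryMb) {
  SingularityRequest request = new SingularityRequestBuilder(rqid, RequestType.WORKER).build();
  saveRequest(request);                                             // covers the requestManager.activate step
  initAndFinishDeployWithResources(requiredCpus, requiredMemoryMb); // covers most of the first section
  // launchTask(request, deploy, instanceNo, TaskState.TASK_RUNNING) could then bring a task
  // for this request to an active state without replaying the full pending-request flow.
}
```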
I really should've checked for review comments before I kept going; will change to use these methods.
The initAndFinishDeployWithResources method seems to be bound to a single class-scoped request id, so I don't think I'll be able to use it directly to support multiple requests per test.
Since launchTask bypasses the normal task flow, it breaks/complicates testing of relocation of bounced tasks. I'm sure there's a way around this that also makes these tests less horribly messy, but it might not be worth the trouble.
scheduler.drainPendingQueue();
}

protected Map<String, Map<String, SingularityTaskId>> getTaskIdMapByHostByRequest() {
I don't know that we really need to build a map of this here. In each case it seems you are just using getActiveTaskIds and then looping through all of the tasks to get them into an active/starting state that the usage poller can then see. We can probably just do a for loop over taskManager.getActiveTaskIds(), after maybe validating the size of the list.
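Roughly, that flat loop might look like the following (only getActiveTaskIds comes from this comment; the size check and the per-task state transition are illustrative placeholders):

```java
// Sketch of iterating active task ids directly instead of building a nested map.
List<SingularityTaskId> activeTaskIds = taskManager.getActiveTaskIds();
Assert.assertEquals(expectedTaskCount, activeTaskIds.size()); // optionally validate the list size first

for (SingularityTaskId taskId : activeTaskIds) {
  // move each task into a starting/running state here so the usage poller can see it
  // (e.g. via whatever status-update helper the test base class provides)
}
```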
I feel the map is more intuitive than looping through the task ids and matching host/request names, especially when working with multiple hosts, but it's definitely not something that needs to be built. I'd prefer to leave it, but will remove it if you prefer a simpler structure.
🚢
The current task shuffling/eviction logic used by Singularity prioritizes shuffling tasks with the highest resource utilization (memory or CPU, depending on which is most overused), on each overcommitted slave. Though this is a good way to ensure the shuffle succeeds at reducing load to acceptable levels, it is also much more likely to shuffle tasks performing large amounts of work or holding large amounts of in-memory state. This makes sense for high CPU tasks, but is undesirable for high memory ones - they'll likely return to their previous memory consumption and possibly trigger another shuffle on their new host.
To address these issues, this PR attempts to:
- Move the shuffle logic out of SingularityUsagePoller into a new SingularityTaskShuffler class, while preserving existing functionality/signatures.

Because of these changes, it is very likely that tasks that were not shuffled before will now be shuffled, which in turn has a fair chance of causing issues with those tasks in QA/prod. I don't think there's a good way to avoid this while still actually changing the shuffling logic, though a self-service shuffle opt-out may help mitigate things.
TODO: