
Slightly Smarter Task Shuffling #2057

Merged: 20 commits into master, Jan 28, 2020

Conversation

@WH77 (Contributor) commented Jan 13, 2020

The current task shuffling/eviction logic used by Singularity prioritizes shuffling tasks with the highest resource utilization (memory or CPU, depending on which is most overused), on each overcommitted slave. Though this is a good way to ensure the shuffle succeeds at reducing load to acceptable levels, it is also much more likely to shuffle tasks performing large amounts of work or holding large amounts of in-memory state. This makes sense for high CPU tasks, but is undesirable for high memory ones - they'll likely return to their previous memory consumption and possibly trigger another shuffle on their new host.

To address these issues, this PR attempts to:

  • Alter the shuffling logic for memory overuse to favor shuffling "low utilization" tasks.
  • Create a simple scoring metric to identify "low utilization" tasks. Currently, this is just the average of a task's memory and CPU utilization (see the sketch after this list). Suggestions on whether to use max, min, or other methods would be welcome.
  • Refactor shuffling logic from SingularityUsagePoller to a new SingularityTaskShuffler class, while preserving existing functionality/signatures.
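
A rough sketch of that metric (illustrative only, not the exact code in the diff; it assumes both utilizations are fractions of the task's requested resources, and that a lower score means a task is shuffled earlier during a memory shuffle):

class UtilizationScoreSketch {
  // Sketch only: average of memory and CPU utilization, each relative to what the task requested.
  static double utilizationScore(double usedMemoryMb, double requestedMemoryMb,
                                 double usedCpus, double requestedCpus) {
    double memoryUtilization = usedMemoryMb / requestedMemoryMb;
    double cpuUtilization = usedCpus / requestedCpus;
    return (memoryUtilization + cpuUtilization) / 2;
  }
}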

Because of these changes, it is very likely that tasks that were not shuffled before will now be shuffled, which in turn has a fair chance of causing issues with these tasks in QA/prod. I don't think there's a good way to avoid this and also actually change the shuffling logic, though self-service shuffle opt-out may help mitigate things.

TODO:

  • Add more complex shuffling unit tests.
  • Evaluate in a staging cluster.
  • Ensure that enough resources are freed up while respecting shuffle limits in configuration.

@ssalinas (Member) left a comment

Just a few smaller comments. Let's get some unit tests in there to prove that the prioritization is working as intended then we can test in staging 👍

System.currentTimeMillis(),
task.getTaskId(),
message,
Optional.of(UUID.randomUUID().toString()),

@ssalinas (Member) commented:

Small nit: it would be good for the actionId here and in the SingularityPendingRequest to match. For this particular case it doesn't have any lasting effect on the actions themselves, but it makes it easier to trace through the data when debugging if they are equal.
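
For illustration, something along these lines (the ShuffleContext interface is a hypothetical stand-in for the code that builds the real SingularityTaskCleanup and SingularityPendingRequest, not an actual Singularity API):

import java.util.Optional;
import java.util.UUID;

class SharedActionIdSketch {
  // Hypothetical stand-in for persisting the task cleanup and the pending request.
  interface ShuffleContext {
    void createTaskCleanup(Optional<String> actionId);
    void createPendingRequest(Optional<String> actionId);
  }

  static void shuffle(ShuffleContext ctx) {
    // Mint the actionId once and pass the same value to both records (instead of calling
    // UUID.randomUUID() separately for each) so they can be correlated when debugging.
    Optional<String> actionId = Optional.of(UUID.randomUUID().toString());
    ctx.createTaskCleanup(actionId);
    ctx.createPendingRequest(actionId);
  }
}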

double memoryUtilization = task.getUsage().getMemoryTotalBytes() / task.getRequestedResources().getMemoryMb();
double cpuUtilization = task.getUsage().getCpusUsed() / task.getRequestedResources().getCpus();

return (memoryUtilization + cpuUtilization) / 2;

@ssalinas (Member) commented:

Still thinking more on the metric for this. You were correct to point out that, by shuffling more (and smaller) tasks, we may very well hit the shuffling limits more often. I'm not certain it will happen often enough to be of consequence given the size of our cluster, but there are two things I could see us doing to compensate for it:

  • Possibly make the shuffle limits a percentage of the overall task count (so the limits don't become more restrictive as the cluster grows larger).
  • Possibly use a slightly more complex scoring mechanism. I could see a CPU threshold (something simple like a 50-60% cutoff) for the memory shuffle: tasks below it are given higher priority the more memory they use (the inverse of now), while tasks above it are scored as they are now. This could put a few higher-memory but likely less busy tasks at the top of the queue, resulting in fewer overall shuffles (see the sketch after this list).
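
A rough sketch of that threshold idea (illustrative only; it assumes utilizations are fractions of the requested resources and that lower scores are shuffled first during a memory shuffle):

class ThresholdScoreSketch {
  // The "50-60%" cutoff mentioned above; the exact value here is just an example.
  static final double CPU_CUTOFF = 0.5;

  static double memoryShuffleScore(double memoryUtilization, double cpuUtilization) {
    if (cpuUtilization < CPU_CUTOFF) {
      // Likely-idle task: shuffle it earlier the MORE memory it uses (inverse of the
      // low-utilization-first ordering), so fewer tasks have to move overall.
      return 1.0 - memoryUtilization;
    }
    // Busier task: keep the current ordering, where lower average utilization shuffles first.
    return (memoryUtilization + cpuUtilization) / 2;
  }
}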

@ssalinas (Member) left a comment

Unit tests are on the right track. Left some comments about shortcuts for some of these things that we have already written, so they don't have to be duplicated here.

}
}

protected void scheduleTask(String rqid, double requiredCpus, double requiredMemoryMb) {

@ssalinas (Member) commented:

We have a few shortcuts available already that should make the stuff in this method easier/shorter, though I'm impressed at all of the pieces you uncovered here :)

  • saveRequest will take care of the requestManager.activate step you have below
  • initAndFinishDeployWithResources will do most of the first section here
  • launchTask can take the request/deploy, instance, and a state you want to get it to, as well as possible hostname args, and could be a shortcut for getting these tasks to an active state. A normal flow for a task is similar to what you had: a pending request is created, drainPendingQueue turns that into a pending task, and the pending task is then launched with resource offers.

@WH77 (Contributor, Author) commented Jan 14, 2020

I really should've checked for review comments before I kept going, will change to use these methods.

The initAndFinishDeployWithResources method seems to be bound to a single class-scoped request id, so I don't think I'll be able to use it directly to support multiple requests per test.

@WH77 (Contributor, Author) commented Jan 14, 2020

Since launchTask bypasses the normal task flow, it breaks/complicates testing of the relocation of bounced tasks. I'm sure there's a way around this that would also make these tests less horribly messy, but it might not be worth the trouble.

scheduler.drainPendingQueue();
}

protected Map<String, Map<String, SingularityTaskId>> getTaskIdMapByHostByRequest() {

@ssalinas (Member) commented:

I don't know that we really need to build a map of this here. It seems that in each case you are just using getActiveTaskIds to loop through all of the tasks and get them into an active/starting state, which can then be seen by the usage poller. Can probably just do a for loop on taskManager.getActiveTaskIds() after maybe validating the size of the list.
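
Roughly something like this (a sketch only; everything other than getActiveTaskIds is a placeholder for helpers the test base already has, and task ids are simplified to Strings here):

import java.util.List;

class ActiveTaskLoopSketch {
  // Placeholder for the real taskManager; only getActiveTaskIds() is taken from the comment above.
  interface TaskManager {
    List<String> getActiveTaskIds();
  }

  static void makeAllTasksVisibleToUsagePoller(TaskManager taskManager, int expectedTaskCount) {
    List<String> activeTaskIds = taskManager.getActiveTaskIds();
    if (activeTaskIds.size() != expectedTaskCount) {
      throw new AssertionError("unexpected active task count: " + activeTaskIds.size());
    }
    for (String taskId : activeTaskIds) {
      moveToRunningState(taskId); // placeholder for the existing status-update helper
    }
  }

  static void moveToRunningState(String taskId) {
    // placeholder: move the task to a starting/active state so the usage poller can see it
  }
}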

@WH77 (Contributor, Author) commented:

I feel the map is more intuitive than looping through the task ids and matching host/request names, especially when working with multiple hosts, but it's definitely not something that needs to be built. I'd prefer to leave it, but will remove it if you prefer a simpler structure.

@WH77 WH77 added the hs_qa label Jan 15, 2020
@ssalinas (Member) commented: 🚢

@WH77 WH77 added the hs_stable label Jan 24, 2020
@ssalinas ssalinas merged commit e97d30d into master Jan 28, 2020
@ssalinas ssalinas deleted the smarter-shuffle branch January 28, 2020 02:01
@ssalinas ssalinas added this to the 1.2.0 milestone Jan 28, 2020