RFC: Metafora v2 - Leader based task assignment; task dependency metadata #134

Open
schmichael opened this issue Jul 23, 2015 · 3 comments

@schmichael
Contributor

Impetus

Metafora's current peer-based work stealing approach has limitations:

  • In a cluster with nodes A and B, each running 10 tasks, if you start node C, any tasks released by A or B (due to the fair balancer) are just as likely to be reclaimed by A or B as they are to be claimed by C! The CanClaim method alone offers very few tools to mitigate this.
  • Starting a cluster is extremely resource intensive as each node often tries to claim every task.

New Scheduling

  • Use something similar to the current work-stealing scheduling to elect a scheduling leader. Have that leader assign or offer tasks to followers.
  • Task metadata used for scheduling decisions (affinities, anti-affinities, dependencies, resource utilization) could be declarative definitions, or tasks could implement a function that, given an Offer, determines whether or not it's sufficient for the task to run. (assignment vs. ask/offer)
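
To make the assignment vs. ask/offer distinction concrete, here is a minimal sketch of what a leader's assignment pass might look like. Everything in it is hypothetical: `Offer`, `Task`, `Accepts`, and `Assign` are illustrative names, not part of Metafora's API, and leader election itself is assumed to have already happened. The leader prefers the node with the most free slots, so a freshly started node C picks up released tasks instead of racing A and B for them.

```go
package main

import (
	"fmt"
	"sort"
)

// Offer describes what a follower node can provide.
// Hypothetical type, not part of Metafora's API.
type Offer struct {
	Node      string
	FreeSlots int
	Labels    map[string]string
}

// Task is a hypothetical schedulable unit. Accepts is the ask/offer hook:
// the task decides whether an Offer is sufficient. A nil Accepts means
// the task runs anywhere (plain assignment).
type Task struct {
	ID      string
	Accepts func(Offer) bool
}

// Assign is meant to run on the elected leader only. For each task it picks
// the acceptable node with the most free slots and returns task ID -> node.
// Tasks that no node can take are left out of the result.
func Assign(tasks []Task, offers []Offer) map[string]string {
	// Work on a copy so the caller's offers are not mutated.
	free := make([]Offer, len(offers))
	copy(free, offers)

	assigned := make(map[string]string)
	for _, t := range tasks {
		// Most free slots first; ties broken by node name for determinism.
		sort.SliceStable(free, func(i, j int) bool {
			if free[i].FreeSlots != free[j].FreeSlots {
				return free[i].FreeSlots > free[j].FreeSlots
			}
			return free[i].Node < free[j].Node
		})
		for i := range free {
			if free[i].FreeSlots > 0 && (t.Accepts == nil || t.Accepts(free[i])) {
				assigned[t.ID] = free[i].Node
				free[i].FreeSlots--
				break
			}
		}
	}
	return assigned
}

func main() {
	// A and B are nearly full; C has just started and is empty.
	offers := []Offer{
		{Node: "A", FreeSlots: 2, Labels: map[string]string{"disk": "ssd"}},
		{Node: "B", FreeSlots: 2},
		{Node: "C", FreeSlots: 12},
	}
	tasks := []Task{
		{ID: "t1"},
		{ID: "t2"},
		// This task only accepts offers from nodes labeled disk=ssd.
		{ID: "t3", Accepts: func(o Offer) bool { return o.Labels["disk"] == "ssd" }},
	}
	fmt.Println(Assign(tasks, offers))
}
```

Because only the leader runs `Assign`, followers never race each other for the same task, which is the property that would also avoid every node trying to claim every task at cluster start.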

Open Questions

  • How simple and extensible can we keep metafora and still address scheduler scaling issues?
  • What metadata and constraints are valuable for scheduling decisions?

Topologies

  • Metafora doesn't have them currently. Period.
  • Super complicated stuff, especially if you want to generically handle inter-task communication.

Open Questions

  • Leave topologies up to another library/layer?
  • Treat topologies as scheduling metadata and leave inter-task communication up to users?

These are my personal preferences as I'd rather not try to compete with existing one-size-fits-all topology frameworks like Storm. -- @schmichael
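
As a rough illustration of the second option (topologies as scheduling metadata only), dependencies could be declared as plain task IDs and the scheduler could derive a start order from them. The sketch below is an assumption about how that might look, not an existing Metafora feature; `StartOrder` and the task names are hypothetical. It uses Kahn's algorithm for the topological sort and leaves inter-task communication entirely to the user.

```go
package main

import (
	"fmt"
	"sort"
)

// StartOrder returns an order in which tasks can be started so that every
// task comes after all of its dependencies. deps maps a task ID to the IDs
// it depends on. It returns an error for unknown dependencies or cycles.
// Hypothetical helper, not part of Metafora's API.
func StartOrder(deps map[string][]string) ([]string, error) {
	indegree := make(map[string]int, len(deps))
	dependents := make(map[string][]string)
	for id, ds := range deps {
		indegree[id] = len(ds)
		for _, d := range ds {
			if _, ok := deps[d]; !ok {
				return nil, fmt.Errorf("task %q depends on unknown task %q", id, d)
			}
			dependents[d] = append(dependents[d], id)
		}
	}

	// Start with tasks that have no dependencies; sort for determinism.
	var ready []string
	for id, n := range indegree {
		if n == 0 {
			ready = append(ready, id)
		}
	}
	sort.Strings(ready)

	var order []string
	for len(ready) > 0 {
		id := ready[0]
		ready = ready[1:]
		order = append(order, id)
		// Starting id unblocks every task whose last pending dependency it was.
		for _, dep := range dependents[id] {
			indegree[dep]--
			if indegree[dep] == 0 {
				ready = append(ready, dep)
			}
		}
		sort.Strings(ready)
	}

	// Any task never reached is part of (or behind) a cycle.
	if len(order) != len(deps) {
		return nil, fmt.Errorf("dependency cycle detected")
	}
	return order, nil
}

func main() {
	order, err := StartOrder(map[string][]string{
		"ingest": nil,
		"parse":  {"ingest"},
		"index":  {"parse"},
		"alert":  {"parse"},
	})
	fmt.Println(order, err)
}
```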

@schmichael schmichael added the RFC label Jul 23, 2015
@mdmarek
Contributor

mdmarek commented Jul 23, 2015

My 2¢ on the topology topic is that there are already some very nice Go libraries for building those, both with brokers and without, like NATS or mangos. I don't think Metafora proper needs to be in that business.

I think it would be tremendous to have this idea of a leader scheduler that makes the schedule. Its first implementation could be quite simple, and then it could add new features as needed. I think just having this scheduler, so that these "large land grabs" could be totally avoided, would already be tremendous and more usable by a larger user base.

@araddon
Contributor

araddon commented Jul 23, 2015

couple thoughts:

  • I like the flexibility and how kubernetes uses labels for scheduling metadata
  • is this eventually going to be dependent on kubernetes? i.e., can we assume there is a process scheduler underneath it that can pass through additional labels? what differentiates their responsibilities?
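
For reference, label-based scheduling metadata can be as small as the sketch below, which is loosely modeled on Kubernetes-style equality selectors (every key/value in the selector must be present on the node). The `Labels`, `Node`, `Matches`, and `Eligible` names are hypothetical and exist only for illustration; nothing here depends on Kubernetes or is part of Metafora.

```go
package main

import "fmt"

// Labels is free-form key/value metadata attached to nodes and used by
// tasks as a selector. Hypothetical type for illustration.
type Labels map[string]string

// Node is a hypothetical cluster member advertising its labels.
type Node struct {
	Name   string
	Labels Labels
}

// Matches reports whether every key/value pair in selector is present in
// node. An empty selector matches every node.
func Matches(selector, node Labels) bool {
	for k, want := range selector {
		if got, ok := node[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// Eligible returns the names of nodes that satisfy selector, in input order.
func Eligible(selector Labels, nodes []Node) []string {
	var names []string
	for _, n := range nodes {
		if Matches(selector, n.Labels) {
			names = append(names, n.Name)
		}
	}
	return names
}

func main() {
	nodes := []Node{
		{Name: "A", Labels: Labels{"zone": "us-east", "disk": "ssd"}},
		{Name: "B", Labels: Labels{"zone": "us-east"}},
		{Name: "C", Labels: Labels{"zone": "us-west", "disk": "ssd"}},
	}
	// A task asking for SSD-backed nodes in us-east.
	fmt.Println(Eligible(Labels{"zone": "us-east", "disk": "ssd"}, nodes))
}
```

Where the labels come from (node configuration, or an underlying process scheduler passing them through) is left open here, which is the responsibility question raised above.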

@schmichael
Contributor Author

@araddon at least at first I'd like to avoid a hard dependency on k8s as that would be a pretty huge dependency. I should definitely get familiar with their scheduler behavior first though to make sure.

Now I think k8s might make a great source for the scheduler to learn about cluster resources. Maybe someday the scheduler could gain some sort of k8s plugin/sidecar/something that it could use for autoscaling when resources aren't available, but that seems like an easy thing to build as an optional component down the road.

3 participants