[Scheduling] Expand the ability of resource evaluator #2997

Merged
merged 5 commits into from
May 6, 2022


zhongchun
Contributor

What do these changes do?

The resource evaluator estimates and sets the resources required by subtasks. It can be an internal service or an external service. With an internal service, we can set default values for the adjustable resources of subtasks. With an external service, we should report the running result of each task to that service so that it can accurately predict the resources subtasks require from historical running information; we call this HBO.
However, implementing and configuring a new resource evaluator is currently not easy. This PR introduces an extension point for the resource evaluator, with which a new evaluator can be added as follows:

  • Inherit from ResourceEvaluator and implement the create, evaluate and report methods. The create method creates a new resource evaluator instance. The evaluate method estimates and sets the required resources for the subtasks of a task stage; it must be implemented. The report method reports the running information and result of the task; implementing it is optional.

  • Add the default configs that the new evaluator needs to base_config.xml or its descendant files.

  • Set resource_evaluator in base_config.xml to choose which resource evaluator is used when running a Mars job.
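
To make the steps above concrete, here is a minimal, self-contained sketch of what a custom evaluator might look like. It does not import Mars: `ResourceEvaluator` below is a stand-in for the base class named in this PR, and `Subtask`, `FixedResourceEvaluator`, the `EVALUATORS` lookup table, and the `num_cpus` / `mem_bytes` keys are all hypothetical names chosen for illustration. The real method signatures (for example, whether they are coroutines, or what `create` receives) may differ.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, List, Type


@dataclass
class Subtask:
    """Hypothetical stand-in for a Mars subtask; not the real class."""

    subtask_id: str
    required_resource: Dict[str, float] = field(default_factory=dict)


class ResourceEvaluator(ABC):
    """Stand-in base class; the real signatures in Mars may differ."""

    name: str = ""

    @classmethod
    @abstractmethod
    def create(cls, subtasks: List[Subtask], **config) -> "ResourceEvaluator":
        """Build an evaluator instance for one task stage."""

    @abstractmethod
    def evaluate(self) -> None:
        """Estimate and set required resources on each subtask (mandatory)."""

    def report(self) -> None:
        """Report running information and results (optional, no-op here)."""


class FixedResourceEvaluator(ResourceEvaluator):
    """Hypothetical evaluator that gives every subtask the same resources."""

    name = "fixed"

    def __init__(self, subtasks: List[Subtask], num_cpus: float, mem_bytes: float):
        self._subtasks = subtasks
        self._num_cpus = num_cpus
        self._mem_bytes = mem_bytes

    @classmethod
    def create(cls, subtasks: List[Subtask], **config) -> "FixedResourceEvaluator":
        # Fall back to assumed defaults when the config omits a value.
        return cls(
            subtasks,
            float(config.get("num_cpus", 1.0)),
            float(config.get("mem_bytes", 0.0)),
        )

    def evaluate(self) -> None:
        for subtask in self._subtasks:
            subtask.required_resource = {
                "num_cpus": self._num_cpus,
                "mem_bytes": self._mem_bytes,
            }


# Hypothetical lookup keyed by the value of the `resource_evaluator` setting.
EVALUATORS: Dict[str, Type[ResourceEvaluator]] = {
    FixedResourceEvaluator.name: FixedResourceEvaluator,
}

subtasks = [Subtask("s1"), Subtask("s2")]
evaluator = EVALUATORS["fixed"].create(subtasks, num_cpus=2)
evaluator.evaluate()
evaluator.report()  # inherited no-op, since report is optional
```

An evaluator backed by an external service would override `report` to send the running information and result of the task to that service, which is what makes history-based prediction possible.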

Related issue number

Fixes #xxxx

Check code requirements

  • tests added / passed (if needed)
  • Ensure all linting tests pass, see here for how to run them

@zhongchun zhongchun changed the title Expand the ability to resource evaluator [Scheduling] Expand the ability to resource evaluator May 5, 2022
Collaborator

@qinxuye qinxuye left a comment

About resource evaluator, is there a plan about it? Apart from the plan, I left some comments.

mars/services/task/execution/mars/executor.py (outdated, resolved)
mars/services/task/execution/mars/resource.py (outdated, resolved)
@zhongchun
Contributor Author

About resource evaluator, is there a plan about it? Apart from the plan, I left some comments.

To fully utilize the resource evaluator, we need to build an external resource recommendation service that can take advantage of historical job information. Maybe we could add an extension library for Mars later.

Collaborator

@qinxuye qinxuye left a comment

LGTM

@qinxuye qinxuye changed the title [Scheduling] Expand the ability to resource evaluator [Scheduling] Expand the ability of resource evaluator May 6, 2022
Contributor

@hekaisheng hekaisheng left a comment

LGTM

@qinxuye qinxuye merged commit 261eaaf into mars-project:master May 6, 2022