
[SparkLoad] Use the yarn command to get status and kill the application #4383

Merged
merged 14 commits into from
Aug 27, 2020

Conversation

xy720
Member

@xy720 xy720 commented Aug 18, 2020

Proposed changes

#4346 #4203
This CL uses the yarn command as follows to kill or get the status of an application running on YARN.

yarn --config confdir application <-kill | -status> <Application ID>

To do
1. Make the yarn command executable in spark load.
2. Write the spark resource into config files and update it before running the command.
3. Parse the result of executing the command line.
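The three steps above could be sketched roughly like this; the class and method names are illustrative, not the PR's actual code:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.Arrays;
import java.util.List;

public class YarnAppCommand {
    // Build the argv for `yarn --config <confdir> application <-kill|-status> <appId>`.
    public static List<String> buildCommand(String yarnClientPath, String configDir,
                                            String action, String appId) {
        return Arrays.asList(yarnClientPath, "--config", configDir, "application", action, appId);
    }

    // Run the command and collect its combined stdout/stderr for later parsing.
    public static String run(List<String> command) throws Exception {
        Process process = new ProcessBuilder(command).redirectErrorStream(true).start();
        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        process.waitFor();
        return output.toString();
    }
}
```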

Types of changes

What types of changes does your code introduce to Doris?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have created an issue (Fix #ISSUE) and have described the bug/feature there in detail
  • Compiling and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

Contributor

@morningman morningman left a comment


If the user changes the properties in the resource, the config file may need to be rewritten.
We can simply read and check its content every time.

@morningman morningman self-assigned this Aug 18, 2020
@morningman morningman added the area/spark-load Issues or PRs related to the spark load label Aug 18, 2020
// prepare yarn config
String configDir = resource.prepareYarnConfig();
// yarn client path
String yarnClient = Config.yarn_client_path;
Contributor


Better to create a function called getYarnClientPath() and check whether the binary file exists in that function.

Member Author


ok
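A minimal sketch of the suggested helper, assuming the configured path comes from `Config.yarn_client_path`; the helper name follows the reviewer's suggestion and the error handling is illustrative:

```java
import java.io.File;

public class YarnClientUtil {
    // Return the configured yarn client path after verifying that
    // the binary actually exists on disk.
    public static String getYarnClientPath(String configuredPath) throws Exception {
        File yarnClient = new File(configuredPath);
        if (!yarnClient.exists() || !yarnClient.isFile()) {
            throw new Exception("yarn client binary does not exist: " + configuredPath);
        }
        return configuredPath;
    }
}
```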

@kangkaisen
Contributor

@xy720 Hi, do we have any user doc for this PR?

String line = null;
long startTime = System.currentTimeMillis();
try {
Preconditions.checkState(process.isAlive());
Contributor


How to make sure the process is still alive here?

Member Author


No need to make sure the process is alive. We can get output even if the process is not alive.
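The author's point can be demonstrated with any short-lived command: a process's buffered stdout remains readable after it exits (`echo` is used here as a stand-in for the yarn client):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

public class DeadProcessRead {
    // Start a process that exits immediately, wait for it to die,
    // then read its buffered output afterwards.
    public static List<String> readAfterExit() throws Exception {
        Process process = new ProcessBuilder("echo", "hello").start();
        process.waitFor();                 // the process is no longer alive here
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);           // output is still readable
            }
        }
        return lines;
    }
}
```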

while (!isStop && (line = outReader.readLine()) != null) {
LOG.info("Monitor Log: " + line);
// parse state and appId
if (line.contains(STATE)) {
Contributor


You can add an example output line here, so that the reviewer can know what the line looks like.

Member Author


done
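For illustration, a sketch of parsing such a line; the sample `Application report` line in the comment below is typical Spark-launcher output, not copied from this PR:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StateLineParser {
    // Matches lines such as:
    //   "... Application report for application_1598000000000_0001 (state: RUNNING)"
    private static final Pattern STATE_PATTERN =
            Pattern.compile("Application report for (application_\\d+_\\d+) \\(state: (\\w+)\\)");

    // Returns {appId, state} if the line matches, null otherwise.
    public static String[] parse(String line) {
        Matcher m = STATE_PATTERN.matcher(line);
        if (m.find()) {
            return new String[]{m.group(1), m.group(2)};
        }
        return null;
    }
}
```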

}
}
// parse other values
else if (line.contains(QUEUE) || line.contains(START_TIME) || line.contains(FINAL_STATUS) ||
Contributor


If the line contains "STATE", the while loop may be broken. So how do we guarantee that this else-if block can be run?

Member Author

@xy720 xy720 Aug 26, 2020


The state transitions follow these rules:

  1. submitted > running > finished / failed
  2. submitted > killed
  3. submitted > running > killed

Normally, the launcher periodically prints the queue, start time, final status, tracking url, and user logs while in state submitted/running. So in case 2, the else-if block may still not be run when the while loop is broken.
But it is not a big problem if this else-if block is not run, because the only values we really need are appId and state. In that case, the queue, start time, final status, tracking url, and user logs are just missing.
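The transition rules described in this reply can be encoded as a small sketch (illustrative only, not code from the PR):

```java
import java.util.EnumSet;

public class AppStateRules {
    public enum State { SUBMITTED, RUNNING, FINISHED, FAILED, KILLED }

    // Encodes the transitions described above:
    //   SUBMITTED -> RUNNING or KILLED
    //   RUNNING   -> FINISHED, FAILED, or KILLED
    //   FINISHED / FAILED / KILLED are terminal.
    public static boolean canTransition(State from, State to) {
        switch (from) {
            case SUBMITTED:
                return EnumSet.of(State.RUNNING, State.KILLED).contains(to);
            case RUNNING:
                return EnumSet.of(State.FINISHED, State.FAILED, State.KILLED).contains(to);
            default:
                return false;
        }
    }
}
```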

@xy720
Member Author

xy720 commented Aug 26, 2020

I will add the relevant user doc later.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SparkLauncherMonitors {
Member


I do not like the xxxxs

Member Author


How about SparkLauncherMonitor?

/**
* Default yarn client path
*/
public static String yarn_client_path = PaloFe.DORIS_HOME_DIR + "/lib/yarn-client/hadoop/bin/yarn";
Contributor


@ConfField

private static final String YARN_STATUS_CMD = "%s --config %s application -status %s";
private static final String YARN_KILL_CMD = "%s --config %s application -kill %s";
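These templates are presumably filled in with `String.format`; for example:

```java
public class YarnCmdFormat {
    private static final String YARN_STATUS_CMD = "%s --config %s application -status %s";
    private static final String YARN_KILL_CMD = "%s --config %s application -kill %s";

    // Substitute the yarn client path, config dir, and application id
    // into the status/kill command templates.
    public static String statusCmd(String yarnClient, String configDir, String appId) {
        return String.format(YARN_STATUS_CMD, yarnClient, configDir, appId);
    }

    public static String killCmd(String yarnClient, String configDir, String appId) {
        return String.format(YARN_KILL_CMD, yarnClient, configDir, appId);
    }
}
```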

class SparkAppListener implements SparkLoadAppHandle.Listener {
Contributor


Not used?

@@ -243,6 +264,16 @@ protected void setProperties(Map<String, String> properties) throws DdlException
return sparkConfigs;
Member


Suggested change
return sparkConfigs;
return sparkConfig;

Does the method name also need to be changed?

}

try {
report.setApplicationId(ConverterUtils.toApplicationId(reportMap.get(APPLICATION_ID)));
Member


The format seems to be wrong?

Contributor

@morningman morningman left a comment


LGTM

Please add doc for config and usage in next PR.

@morningman morningman added the approved Indicates a PR has been approved by one committer. label Aug 26, 2020
@morningman morningman merged commit 8c38c79 into apache:master Aug 27, 2020
@yangzhg yangzhg mentioned this pull request Feb 9, 2021