Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Improvement][Task] Improved way to collect yarn job's appIds #12197

Merged
merged 40 commits into from
Oct 31, 2022
Merged
Show file tree
Hide file tree
Changes from 30 commits
Commits
Show all changes
40 commits
Select commit Hold shift + click to select a range
5026d79
support aop fetch appIds
Radeity Sep 28, 2022
e8ef5a6
[Improvement][Task] Improved way to collect yarn job's appIds
Radeity Sep 28, 2022
7b67676
Merge branch 'apache:dev' into Improvement-11262
Radeity Sep 28, 2022
4d2355f
Update common.properties
Radeity Sep 28, 2022
21bee4b
Update common.properties
Radeity Sep 28, 2022
059a414
Update configuration docs
Radeity Sep 28, 2022
212a9e8
Merge branch 'apache:dev' into Improvement-11262
Radeity Sep 29, 2022
f5525c7
Merge branch 'dev' of https://github.com/apache/dolphinscheduler into…
Radeity Sep 29, 2022
ea2ddfb
Merge branch 'Improvement-11262' of https://github.com/Radeity/dolphi…
Radeity Sep 29, 2022
43824ee
add license
Radeity Sep 29, 2022
caf5451
update user properties
Radeity Oct 10, 2022
b263c7c
remove redundant dependencies
Radeity Oct 10, 2022
088afb3
update architecture doc
Radeity Oct 10, 2022
88095dd
clean the code
Radeity Oct 16, 2022
db65756
update log statement
Radeity Oct 16, 2022
a26883d
fix some bugs
Radeity Oct 17, 2022
094e6e0
Update pom.xml
Radeity Oct 19, 2022
45695ae
Merge branch 'dev' into Improvement-11262
Radeity Oct 19, 2022
62087d8
Update pom.xml
Radeity Oct 19, 2022
458573d
add ut & licsence
Radeity Oct 20, 2022
cec9f60
update aop code
Radeity Oct 20, 2022
da447a2
add ut's license header
Radeity Oct 21, 2022
5f282b4
add aspectjrt license
Radeity Oct 21, 2022
f59dc7e
alter aop dependency version
Radeity Oct 21, 2022
355d94f
fix some bugs & handle conflicts
Radeity Oct 24, 2022
7ff47f7
remove unused file
Radeity Oct 24, 2022
94dfe46
rename import package
Radeity Oct 24, 2022
224adcb
exclude redundant dependencies
Radeity Oct 26, 2022
f45d527
use logger to output debug info
Radeity Oct 26, 2022
0ee40c8
add logger configuration
Radeity Oct 27, 2022
3355270
Update dolphinscheduler-aop/src/main/java/org/apache/dolphinscheduler…
Radeity Oct 27, 2022
d4ad037
Update dolphinscheduler-aop/src/main/java/org/apache/dolphinscheduler…
Radeity Oct 27, 2022
a5865a7
Update dolphinscheduler-aop/src/test/java/org/apache/dolphinscheduler…
Radeity Oct 27, 2022
3ebe4e9
Update dolphinscheduler-aop/src/test/java/org/apache/dolphinscheduler…
Radeity Oct 27, 2022
6781870
update log format & delete buildExecPathRelatedInfo
Radeity Oct 27, 2022
9b70033
delete dependency scope label
Radeity Oct 27, 2022
f2282e4
improve code quality
Radeity Oct 28, 2022
5c6f72b
remove useless code
Radeity Oct 28, 2022
7e0e226
replace switch statement
Radeity Oct 28, 2022
9ca85c3
add docs
Radeity Oct 31, 2022
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -44,3 +44,10 @@ export FLINK_HOME=${FLINK_HOME:-/opt/soft/flink}
export DATAX_HOME=${DATAX_HOME:-/opt/soft/datax}

export PATH=$HADOOP_HOME/bin:$SPARK_HOME/bin:$PYTHON_HOME/bin:$JAVA_HOME/bin:$HIVE_HOME/bin:$FLINK_HOME/bin:$DATAX_HOME/bin:$PATH

# applicationId auto collection related configuration
export HADOOP_CLASSPATH=`hadoop classpath`:${DOLPHINSCHEDULER_HOME}/tools/libs/*
export SPARK_DIST_CLASSPATH=$HADOOP_CLASSPATH:$SPARK_DIST_CLASS_PATH
export HADOOP_CLIENT_OPTS="-javaagent:${DOLPHINSCHEDULER_HOME}/tools/libs/aspectjweaver-1.9.7.jar":$HADOOP_CLIENT_OPTS
export SPARK_SUBMIT_OPTS="-javaagent:${DOLPHINSCHEDULER_HOME}/tools/libs/aspectjweaver-1.9.7.jar":$SPARK_SUBMIT_OPTS
export FLINK_ENV_JAVA_OPTS="-javaagent:${DOLPHINSCHEDULER_HOME}/tools/libs/aspectjweaver-1.9.7.jar":$FLINK_ENV_JAVA_OPTS
Original file line number Diff line number Diff line change
Expand Up @@ -44,3 +44,10 @@ export FLINK_HOME=${FLINK_HOME:-/opt/soft/flink}
export DATAX_HOME=${DATAX_HOME:-/opt/soft/datax}

export PATH=$HADOOP_HOME/bin:$SPARK_HOME/bin:$PYTHON_HOME/bin:$JAVA_HOME/bin:$HIVE_HOME/bin:$FLINK_HOME/bin:$DATAX_HOME/bin:$PATH

# applicationId auto collection related configuration
export HADOOP_CLASSPATH=`hadoop classpath`:${DOLPHINSCHEDULER_HOME}/tools/libs/*
export SPARK_DIST_CLASSPATH=$HADOOP_CLASSPATH:$SPARK_DIST_CLASS_PATH
export HADOOP_CLIENT_OPTS="-javaagent:${DOLPHINSCHEDULER_HOME}/tools/libs/aspectjweaver-1.9.7.jar":$HADOOP_CLIENT_OPTS
export SPARK_SUBMIT_OPTS="-javaagent:${DOLPHINSCHEDULER_HOME}/tools/libs/aspectjweaver-1.9.7.jar":$SPARK_SUBMIT_OPTS
export FLINK_ENV_JAVA_OPTS="-javaagent:${DOLPHINSCHEDULER_HOME}/tools/libs/aspectjweaver-1.9.7.jar":$FLINK_ENV_JAVA_OPTS
3 changes: 3 additions & 0 deletions deploy/kubernetes/dolphinscheduler/values.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -173,6 +173,9 @@ conf:
# mlflow task plugin preset repository version
ml.mlflow.preset_repository_version: "main"

# way to collect applicationId: log, aop
appId.collect: log

common:
## Configmap
configmap:
Expand Down
8 changes: 8 additions & 0 deletions docs/docs/en/architecture/configuration.md
Original file line number Diff line number Diff line change
Expand Up @@ -224,6 +224,7 @@ The default configuration is as follows:
|sudo.enable | true | whether to enable sudo|
|alert.rpc.port | 50052 | the RPC port of Alert Server|
|zeppelin.rest.url | http://localhost:8080 | the RESTful API url of zeppelin|
|appId.collect | log | way to collect applicationId, if use aop, alter the configuration from log to aop. Note: Aop way doesn't support submitting yarn job on remote host by client mode like Beeline, and will failure if override applicationId collection-related environment configuration in dolphinscheduler_env.sh, and .|

### Api-server related configuration

Expand Down Expand Up @@ -353,6 +354,13 @@ export FLINK_HOME=${FLINK_HOME:-/opt/soft/flink}
export DATAX_HOME=${DATAX_HOME:-/opt/soft/datax}

export PATH=$HADOOP_HOME/bin:$SPARK_HOME/bin:$PYTHON_HOME/bin:$JAVA_HOME/bin:$HIVE_HOME/bin:$FLINK_HOME/bin:$DATAX_HOME/bin:$PATH

# applicationId auto collection related configuration
export HADOOP_CLASSPATH=`hadoop classpath`:${DOLPHINSCHEDULER_HOME}/tools/libs/*
export SPARK_DIST_CLASSPATH=$HADOOP_CLASSPATH:$SPARK_DIST_CLASS_PATH
export HADOOP_CLIENT_OPTS="-javaagent:${DOLPHINSCHEDULER_HOME}/tools/libs/aspectjweaver-1.9.7.jar":$HADOOP_CLIENT_OPTS
export SPARK_SUBMIT_OPTS="-javaagent:${DOLPHINSCHEDULER_HOME}/tools/libs/aspectjweaver-1.9.7.jar":$SPARK_SUBMIT_OPTS
export FLINK_ENV_JAVA_OPTS="-javaagent:${DOLPHINSCHEDULER_HOME}/tools/libs/aspectjweaver-1.9.7.jar":$FLINK_ENV_JAVA_OPTS
```

### Log related configuration
Expand Down
3 changes: 3 additions & 0 deletions docs/docs/en/guide/resource/configuration.md
Original file line number Diff line number Diff line change
Expand Up @@ -141,6 +141,9 @@ conda.path=/opt/anaconda3/etc/profile.d/conda.sh

# Task resource limit state
task.resource.limit.state=false

# way to collect applicationId: log(original regex match), aop
appId.collect: log
```

> **Note:**
Expand Down
10 changes: 9 additions & 1 deletion docs/docs/zh/architecture/configuration.md
Original file line number Diff line number Diff line change
Expand Up @@ -181,7 +181,7 @@ DolphinScheduler同样可以通过`bin/env/dolphinscheduler_env.sh`进行Zookeep

## common.properties [hadoop、s3、yarn配置]

common.properties配置文件目前主要是配置hadoop/s3/yarn相关的配置,配置文件位置:
common.properties配置文件目前主要是配置hadoop/s3/yarn/applicationId收集相关的配置,配置文件位置:
|服务名称| 配置文件 |
|--|--|
|Master Server | `master-server/conf/common.properties`|
Expand Down Expand Up @@ -221,6 +221,7 @@ common.properties配置文件目前主要是配置hadoop/s3/yarn相关的配置
|sudo.enable | true | 是否开启sudo|
|alert.rpc.port | 50052 | Alert Server的RPC端口|
|zeppelin.rest.url | http://localhost:8080 | zeppelin RESTful API 接口地址|
|appId.collect | log | 收集applicationId方式, 如果用aop方法,将配置log替换为aop,注意:aop不支持远程主机提交yarn作业的方式比如Beeline客户端提交,且如果用户环境覆盖了dolphinscheduler_env.sh收集applicationId相关环境变量配置,aop方法会失效|

## Api-server相关配置

Expand Down Expand Up @@ -345,6 +346,13 @@ export FLINK_HOME=${FLINK_HOME:-/opt/soft/flink}
export DATAX_HOME=${DATAX_HOME:-/opt/soft/datax}

export PATH=$HADOOP_HOME/bin:$SPARK_HOME/bin:$PYTHON_HOME/bin:$JAVA_HOME/bin:$HIVE_HOME/bin:$FLINK_HOME/bin:$DATAX_HOME/bin:$PATH

# applicationId auto collection related configuration
export HADOOP_CLASSPATH=`hadoop classpath`:${DOLPHINSCHEDULER_HOME}/tools/libs/*
export SPARK_DIST_CLASSPATH=$HADOOP_CLASSPATH:$SPARK_DIST_CLASS_PATH
export HADOOP_CLIENT_OPTS="-javaagent:${DOLPHINSCHEDULER_HOME}/tools/libs/aspectjweaver-1.9.7.jar":$HADOOP_CLIENT_OPTS
export SPARK_SUBMIT_OPTS="-javaagent:${DOLPHINSCHEDULER_HOME}/tools/libs/aspectjweaver-1.9.7.jar":$SPARK_SUBMIT_OPTS
export FLINK_ENV_JAVA_OPTS="-javaagent:${DOLPHINSCHEDULER_HOME}/tools/libs/aspectjweaver-1.9.7.jar":$FLINK_ENV_JAVA_OPTS
```

## 日志相关配置
Expand Down
3 changes: 3 additions & 0 deletions docs/docs/zh/guide/resource/configuration.md
Original file line number Diff line number Diff line change
Expand Up @@ -142,6 +142,9 @@ development.state=false

# rpc port
alert.rpc.port=50052

# way to collect applicationId: log(original regex match), aop
appId.collect: log
```

> **注意**:
Expand Down
91 changes: 91 additions & 0 deletions dolphinscheduler-aop/pom.xml
Original file line number Diff line number Diff line change
@@ -0,0 +1,91 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
~ Licensed to the Apache Software Foundation (ASF) under one or more
~ contributor license agreements. See the NOTICE file distributed with
~ this work for additional information regarding copyright ownership.
~ The ASF licenses this file to You under the Apache License, Version 2.0
~ (the "License"); you may not use this file except in compliance with
~ the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing, software
~ distributed under the License is distributed on an "AS IS" BASIS,
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~ See the License for the specific language governing permissions and
~ limitations under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.apache.dolphinscheduler</groupId>
<artifactId>dolphinscheduler</artifactId>
<version>dev-SNAPSHOT</version>
</parent>
<artifactId>dolphinscheduler-aop</artifactId>
<packaging>jar</packaging>
<name>${project.artifactId}</name>
<description>aop 4 YarnClient to get application id when submitting jars using 'yarn jar mainClass args'</description>

<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
<aspectj.version>1.9.7</aspectj.version>
<hadoop.version>3.2.4</hadoop.version>
Radeity marked this conversation as resolved.
Show resolved Hide resolved
</properties>

<dependencies>
<dependency>
<groupId>org.aspectj</groupId>
<artifactId>aspectjweaver</artifactId>
<version>${aspectj.version}</version>
<scope>runtime</scope>
</dependency>

<dependency>
<groupId>org.aspectj</groupId>
<artifactId>aspectjrt</artifactId>
<version>${aspectj.version}</version>
</dependency>

<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-client</artifactId>
<version>${hadoop.version}</version>
</dependency>

<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>${hadoop.version}</version>
</dependency>
</dependencies>

<build>
<plugins>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>aspectj-maven-plugin</artifactId>
<configuration>
<complianceLevel>1.8</complianceLevel>
<source>1.8</source>
<target>1.8</target>
<showWeaveInfo>true</showWeaveInfo>
<verbose>true</verbose>
<Xlint>ignore</Xlint>
<encoding>UTF-8</encoding>
</configuration>
<executions>
<execution>
<goals>
<goal>compile</goal>
<goal>test-compile</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
Original file line number Diff line number Diff line change
@@ -0,0 +1,97 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.dolphinscheduler.aop;

import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Collections;

import org.aspectj.lang.annotation.AfterReturning;
import org.aspectj.lang.annotation.Aspect;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@Aspect
public class YarnClientAspect {

/**
* The current application report when application submitted successfully
*/
private ApplicationReport currentApplicationReport = null;

private final String appInfoFilePath;
private boolean debug;

protected final Logger logger = LoggerFactory.getLogger(getClass());

public YarnClientAspect() {
appInfoFilePath = String.format("%s/%s", System.getProperty("user.dir"), "appInfo.log");
debug = true;
Radeity marked this conversation as resolved.
Show resolved Hide resolved
}

/**
* Trigger submitApplication when invoking YarnClientImpl.submitApplication
*
* @param appContext application context when invoking YarnClientImpl.submitApplication
* @param submittedAppId the submitted application id returned by YarnClientImpl.submitApplication
* @throws Throwable exceptions
*/
@AfterReturning(pointcut = "execution(ApplicationId org.apache.hadoop.yarn.client.api.impl.YarnClientImpl." +
"submitApplication(ApplicationSubmissionContext)) && args(appContext)", returning = "submittedAppId", argNames = "appContext,submittedAppId")
public void registerApplicationInfo(ApplicationSubmissionContext appContext, ApplicationId submittedAppId) {
if (appInfoFilePath != null) {
try {
Files.write(Paths.get(appInfoFilePath),
Collections.singletonList(submittedAppId.toString()),
StandardOpenOption.CREATE,
StandardOpenOption.WRITE,
StandardOpenOption.APPEND);
} catch (IOException ioException) {
logger.error("YarnClientAspect[registerAppInfo]: can't output current application information, because "
+ ioException.getMessage());
}
}
if (debug) {
logger.info("YarnClientAspect[submitApplication]: current application context " + appContext);
logger.info("YarnClientAspect[submitApplication]: submitted application id " + submittedAppId);
logger.info(
"YarnClientAspect[submitApplication]: current application report " + currentApplicationReport);
}
Radeity marked this conversation as resolved.
Show resolved Hide resolved
}

/**
* Trigger getAppReport only when invoking getApplicationReport within submitApplication
* This method will invoke many times, however, the last ApplicationReport instance assigned to currentApplicationReport
*
* @param appReport current application report when invoking getApplicationReport within submitApplication
* @param appId current application id, which is the parameter of getApplicationReport
* @throws Throwable exceptions
*/
@AfterReturning(pointcut = "cflow(execution(ApplicationId org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(ApplicationSubmissionContext))) "
+
"&& !within(CfowAspect) && execution(ApplicationReport org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(ApplicationId)) && args(appId)", returning = "appReport", argNames = "appReport,appId")
public void registerApplicationReport(ApplicationReport appReport, ApplicationId appId) {

Check notice

Code scanning / CodeQL

Useless parameter

The parameter appId is unused.
currentApplicationReport = appReport;
}
}
26 changes: 26 additions & 0 deletions dolphinscheduler-aop/src/main/resources/META-INF/aop.xml
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
<!--
~ Licensed to the Apache Software Foundation (ASF) under one or more
~ contributor license agreements. See the NOTICE file distributed with
~ this work for additional information regarding copyright ownership.
~ The ASF licenses this file to You under the Apache License, Version 2.0
~ (the "License"); you may not use this file except in compliance with
~ the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing, software
~ distributed under the License is distributed on an "AS IS" BASIS,
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~ See the License for the specific language governing permissions and
~ limitations under the License.
-->

<!DOCTYPE aspectj PUBLIC "-//AspectJ//DTD//EN" "http://www.eclipse.org/aspectj/dtd/aspectj.dtd">
<aspectj>
<aspects>
<aspect name="org.apache.dolphinscheduler.aop.YarnClientAspect"/>
<weaver options="-verbose -showWeaveInfo">
<include within="org.apache.hadoop.yarn.client.api.impl..*"/>
</weaver>
</aspects>
</aspectj>
22 changes: 22 additions & 0 deletions dolphinscheduler-aop/src/main/resources/log4j.properties
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss.SSS Z} %-5p [%c] - %m%n
Loading