[bug][plugin]Fix: Correct the way to determine the yarn queue in Flink CommandLine and SQL mode #14237

Merged
merged 7 commits into from
Jun 12, 2023

Conversation

ORuteMa
Contributor

@ORuteMa ORuteMa commented May 30, 2023

Purpose of the pull request

Correct the way to determine the yarn queue in Flink CommandLine.

closed #14236

Brief change log

In the Flink command line with the -t option, the YARN queue should be set with -Dyarn.application.queue=%s rather than -yqu.

Verify this pull request

This pull request is a code cleanup without any test coverage. I tested it manually.
image

@zhongjiajie
Member

run ci

@codecov-commenter

codecov-commenter commented May 30, 2023

Codecov Report

Merging #14237 (065ed18) into dev (b8748e2) will increase coverage by 0.04%.
The diff coverage is 55.84%.

❗ Current head 065ed18 differs from pull request most recent head 406ce08. Consider uploading reports for the commit 406ce08 to get more accurate results

@@             Coverage Diff              @@
##                dev   #14237      +/-   ##
============================================
+ Coverage     38.39%   38.43%   +0.04%     
- Complexity     4478     4502      +24     
============================================
  Files          1229     1235       +6     
  Lines         42936    43001      +65     
  Branches       4763     4767       +4     
============================================
+ Hits          16485    16528      +43     
- Misses        24625    24646      +21     
- Partials       1826     1827       +1     
Impacted Files Coverage Δ
...che/dolphinscheduler/api/python/PythonGateway.java 16.94% <0.00%> (ø)
...er/api/service/impl/MetricsCleanUpServiceImpl.java 12.50% <0.00%> (-1.79%) ⬇️
...api/service/impl/ProcessDefinitionServiceImpl.java 35.35% <0.00%> (ø)
...cheduler/common/constants/DataSourceConstants.java 0.00% <ø> (ø)
...in/datasource/vertica/VerticaDataSourceClient.java 0.00% <0.00%> (ø)
.../server/master/runner/StateWheelExecuteThread.java 0.50% <ø> (+0.10%) ⬆️
.../server/master/runner/WorkflowExecuteRunnable.java 10.33% <ø> (+0.01%) ⬆️
.../org/apache/dolphinscheduler/spi/enums/DbType.java 0.00% <0.00%> (ø)
...hinscheduler/plugin/task/flink/FlinkConstants.java 0.00% <ø> (ø)
...ler/server/worker/metrics/WorkerServerMetrics.java 0.00% <0.00%> (ø)
... and 15 more

... and 4 files with indirect coverage changes


@ORuteMa ORuteMa changed the title Fix: Correct the way to determine the yarn queue in Flink CommandLine [Bug][Plugin]Fix: Correct the way to determine the yarn queue in Flink CommandLine May 30, 2023
Comment on lines -257 to -263
-        if (StringUtils.isEmpty(others) || !others.contains(FlinkConstants.FLINK_QUEUE)) {
-            String queue = flinkParameters.getQueue();
-            if (StringUtils.isNotEmpty(queue)) { // -yqu
-                args.add(FlinkConstants.FLINK_QUEUE);
-                args.add(queue);
-            }
-        }
Member

I think you should add an if-else statement here to set a different arg name for the yarn queue according to the Flink version. Please remove the check from L195-L244, which should just build the run command.

Contributor Author

ok

Contributor Author

I think you should add an if-else statement here to set a different arg name for the yarn queue according to the Flink version. Please remove the check from L195-L244, which should just build the run command.

Done, and I fixed the same problem in SQL mode. Please have a look.

@ORuteMa ORuteMa changed the title [Bug][Plugin]Fix: Correct the way to determine the yarn queue in Flink CommandLine [Bug][Plugin]Fix: Correct the way to determine the yarn queue in Flink CommandLine and SQL mode May 30, 2023
@@ -197,16 +196,19 @@ private static List<String> buildRunCommandLineForOthers(TaskExecutionContext ta
        args.add(FlinkConstants.FLINK_RUN); // run
        args.add(FlinkConstants.FLINK_EXECUTION_TARGET); // -t
        args.add(FlinkConstants.FLINK_YARN_PER_JOB); // yarn-per-job

Member

Please remove redundant blank line.

    } else {
        args.add(FlinkConstants.FLINK_RUN); // run
        args.add(FlinkConstants.FLINK_RUN_MODE); // -m
        args.add(FlinkConstants.FLINK_YARN_CLUSTER); // yarn-cluster

Member

Ditto.

    }
    break;
case APPLICATION:
    args.add(FlinkConstants.FLINK_RUN_APPLICATION); // run-application
    args.add(FlinkConstants.FLINK_EXECUTION_TARGET); // -t
    args.add(FlinkConstants.FLINK_YARN_APPLICATION); // yarn-application

Member

Ditto.

Comment on lines -167 to +168
-        String others = flinkParameters.getOthers();
-        if (StringUtils.isEmpty(others) || !others.contains(FlinkConstants.FLINK_QUEUE)) {
-            String queue = flinkParameters.getQueue();
-            if (StringUtils.isNotEmpty(queue)) {
-                initOptions.add(String.format(FlinkConstants.FLINK_FORMAT_YARN_APPLICATION_QUEUE, queue));
-            }
+        String queue = flinkParameters.getQueue();
+        if (StringUtils.isNotEmpty(queue)) {
Member

@Radeity Radeity May 30, 2023

We check whether the user defines this arg themselves in others, so you just have to modify the condition in the original way: !others.contains(FlinkConstants.FLINK_QUEUE_FOR_TARGETS). By the way, may I ask why you named them FLINK_QUEUE_FOR_MODE and FLINK_QUEUE_FOR_TARGETS?

Contributor Author

The YARN queue should be assigned through the yarn.application.queue property rather than the -yqu option, because -yqu is not available in sql-client.sh. If we want to specify the YARN queue used by a specific Flink SQL task, it would be more appropriate to have an explicit queue option in the task submission form rather than relying on parameters in the 'others' section. As for the naming: in flink run, the -yqu option only takes effect together with the -m option (for mode). When using the -t option (for target), the queue has to be specified with -Dyarn.application.queue=%s.
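
To make the distinction concrete, here is a minimal sketch of how the queue argument could be chosen depending on whether the command uses -m or -t. It is not the code from this PR: the class name YarnQueueArgSketch, the method queueArgs, and its boolean parameter are hypothetical, and only the -yqu option and the yarn.application.queue key come from the discussion above. The sketch checks the user's 'others' string against the bare key rather than a "%s" format string, which is one possible way to detect a hand-written override.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; names are hypothetical, not DolphinScheduler's API. */
final class YarnQueueArgSketch {

    // Shortcut option that is honored together with "-m" (mode).
    static final String QUEUE_OPTION_FOR_MODE = "-yqu";
    // Generic property form, required together with "-t" (target).
    static final String QUEUE_KEY_FOR_TARGET = "-Dyarn.application.queue";

    /**
     * Builds the queue arguments that match the submission style.
     *
     * @param usesTargetOption true when the command uses "-t", false when it uses "-m"
     * @param queue            queue taken from the task form; may be null or empty
     * @param others           free-form extra arguments typed by the user; may be null
     */
    static List<String> queueArgs(boolean usesTargetOption, String queue, String others) {
        List<String> args = new ArrayList<>();
        if (queue == null || queue.isEmpty()) {
            return args; // no queue configured, nothing to add
        }
        String marker = usesTargetOption ? QUEUE_KEY_FOR_TARGET : QUEUE_OPTION_FOR_MODE;
        if (others != null && others.contains(marker)) {
            return args; // the user already set the queue by hand in 'others'
        }
        if (usesTargetOption) {
            args.add(QUEUE_KEY_FOR_TARGET + "=" + queue);
        } else {
            args.add(QUEUE_OPTION_FOR_MODE);
            args.add(queue);
        }
        return args;
    }
}
```

Under these assumptions, queueArgs(true, "etl", null) yields a single argument, -Dyarn.application.queue=etl, while queueArgs(false, "etl", null) yields the two arguments -yqu and etl.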

Contributor Author

The execution of a workflow is tied to a specific tenant, and this tenant holds a queue attribute. This attribute is assigned to the processInstance being executed by the tenant. The queue attribute of the executionContext for any taskInstance belonging to this processInstance will also be consistent. Therefore, the queue of a task is ultimately determined by the queue attribute of the runtime tenant if not explicitly specified.
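
The fallback described in this comment can be sketched as follows. This only illustrates the behavior as stated here (an explicit task queue wins, otherwise the tenant's queue applies); it is not taken from the DolphinScheduler code base, and the class QueueFallbackSketch and method effectiveQueue are hypothetical names.

```java
import java.util.Optional;

/** Illustrative sketch only; not the scheduler's actual queue resolution code. */
final class QueueFallbackSketch {

    /**
     * Returns the queue a task would run in under the rule described above:
     * the explicitly configured task queue if present, else the tenant queue.
     */
    static Optional<String> effectiveQueue(String taskQueue, String tenantQueue) {
        if (taskQueue != null && !taskQueue.trim().isEmpty()) {
            return Optional.of(taskQueue); // explicit setting takes precedence
        }
        if (tenantQueue != null && !tenantQueue.trim().isEmpty()) {
            return Optional.of(tenantQueue); // inherited from the runtime tenant
        }
        return Optional.empty(); // neither is set
    }
}
```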

Member

@Radeity Radeity May 30, 2023

It seems that this logic is missing. I've debugged it and found that the queue of processInstance is null. I also can't find where in the code it is assigned from the tenant's queue. Would you mind checking in your local env?

Contributor Author

Sure. In my experience the queue is assigned, so this logic should already be implemented correctly in the current version.

Member

That's weird. Did you test on the dev branch?

Contributor Author

My DolphinScheduler environment is 3.1.7.

Member

Have you tested the modification in this PR on branch dev?

Contributor Author

I will test it tomorrow. In my env I copied this PR onto my 3.1.7 installation.

Comment on lines 330 to 335
        if (StringUtils.isEmpty(others) || !others.contains(FlinkConstants.FLINK_QUEUE_FOR_TARGETS)) {
            String queue = flinkParameters.getQueue();
            if (StringUtils.isNotEmpty(queue)) { // -Dyarn.application.queue=%s
                args.add(String.format(FlinkConstants.FLINK_QUEUE_FOR_TARGETS, queue));
            }
        }
Member

There is some duplicated code. Can we clean up the logic and try to avoid introducing this new method?

Contributor Author

Sure

@ORuteMa ORuteMa changed the title [Bug][Plugin]Fix: Correct the way to determine the yarn queue in Flink CommandLine and SQL mode [bug][plugin]Fix: Correct the way to determine the yarn queue in Flink CommandLine and SQL mode May 30, 2023
@Radeity
Member

Radeity commented May 31, 2023

Hi, @ORuteMa , any feedback?

@ORuteMa
Contributor Author

ORuteMa commented Jun 1, 2023

Hi, @ORuteMa , any feedback?

Sorry, I have been busy these days. I will test it later.

@caishunfeng caishunfeng added the bug (Something isn't working) and 3.1.x (for 3.1.x version) labels Jun 1, 2023
@ORuteMa
Contributor Author

ORuteMa commented Jun 5, 2023

@caishunfeng @zhongjiajie pls approve run ci, thanks.

@SbloodyS
Member

SbloodyS commented Jun 5, 2023

@caishunfeng @zhongjiajie pls approve run ci, thanks.

Done.

@ORuteMa
Contributor Author

ORuteMa commented Jun 5, 2023

@caishunfeng @zhongjiajie pls approve run ci, thanks.

Done.

I see some failures in this run, and they may be unrelated to my PR. Is it a network problem or something else? Please have a look.


private static void determinedYarnQueue(List<String> args, FlinkParameters flinkParameters,
                                        FlinkDeployMode deployMode, String flinkVersion) {
    switch (deployMode) {

Check warning

Code scanning / CodeQL

Missing enum case in switch

Switch statement does not have a case for STANDALONE. Switch statement does not have a case for LOCAL.
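
For context on this warning, the sketch below shows one common way to make such a switch exhaustive: name every enum constant and add a default branch that fails loudly. The nested DeployMode enum is a hypothetical stand-in for FlinkDeployMode; STANDALONE, LOCAL and APPLICATION appear in this thread, while CLUSTER and the treatment of LOCAL and STANDALONE as needing no YARN queue are assumptions made for illustration.

```java
/** Illustrative sketch only; DeployMode is a stand-in, not the real FlinkDeployMode. */
final class SwitchCoverageSketch {

    // Hypothetical enum; CLUSTER is an assumed constant.
    enum DeployMode { LOCAL, STANDALONE, CLUSTER, APPLICATION }

    /** Returns true when a YARN queue argument is relevant for the given mode. */
    static boolean usesYarnQueue(DeployMode mode) {
        switch (mode) {
            case CLUSTER:
            case APPLICATION:
                return true;
            case LOCAL:
            case STANDALONE:
                return false; // assumed: these modes do not submit to YARN
            default:
                // Guards against constants added to the enum later.
                throw new IllegalStateException("Unhandled deploy mode: " + mode);
        }
    }
}
```

Listing the remaining constants explicitly documents that they were considered, and the default branch surfaces any constant added later instead of silently skipping it.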
@zhongjiajie
Member

restarted the failed E2E test

@sonarcloud

sonarcloud bot commented Jun 8, 2023

SonarCloud Quality Gate failed.

Bug: A, 0 Bugs
Vulnerability: A, 0 Vulnerabilities
Security Hotspot: E, 1 Security Hotspot
Code Smell: A, 13 Code Smells

10.4% Coverage
56.0% Duplication

Member

@zhongjiajie zhongjiajie left a comment

LGTM overall. Do you have any additional suggestions, @Radeity?

Member

@Radeity Radeity left a comment

+1

Member

@SbloodyS SbloodyS left a comment

+1

@SbloodyS SbloodyS added this to the 3.1.8 milestone Jun 12, 2023
@SbloodyS SbloodyS merged commit de2cc0e into apache:dev Jun 12, 2023
zhuangchong pushed a commit that referenced this pull request Jun 26, 2023
…k CommandLine and SQL mode (#14237)

* Fix: Correct the way to determine the yarn queue in Flink CommandLine

* fix the yarn queue in sql mode && refine the code

* refine code

* remove unnecessary comment

* fix yarn queue properties

* remove redundant variable
zhongjiajie pushed a commit that referenced this pull request Jul 20, 2023
…k CommandLine and SQL mode (#14237)

* Fix: Correct the way to determine the yarn queue in Flink CommandLine

* fix the yarn queue in sql mode && refine the code

* refine code

* remove unnecessary comment

* fix yarn queue properties

* remove redundant variable

(cherry picked from commit de2cc0e)
Labels
3.1.x (for 3.1.x version), backend, bug (Something isn't working), first time contributor (First-time contributor)
Development

Successfully merging this pull request may close these issues.

[Bug] [Flink] Yarn queue conf doesn't work in flink command line.
7 participants