Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Improvement][Metrics] Apply micrometer naming convention to metrics #10477

Merged
merged 4 commits into from
Jun 21, 2022

Conversation

EricGao888
Copy link
Member

@EricGao888 EricGao888 commented Jun 16, 2022

Purpose of the pull request

Brief change log

  • Apply naming conventions to metrics. Metrics names in Grafana demo dashboards are updated correspondingly.
  • Add explanations about metrics naming convention and mapping in docs.
  • Add Chinese version docs.

Verify this pull request

Verified by manual test.

@EricGao888 EricGao888 marked this pull request as draft June 16, 2022 09:48
@SbloodyS SbloodyS requested a review from ruanwenjun June 16, 2022 09:50
@EricGao888
Copy link
Member Author

Could u plz take a look when you have time? @ruanwenjun Thx~

.register(Metrics.globalRegistry);

private static final Counter WORKER_SUBMIT_QUEUE_IS_FULL_COUNTER =
Counter.builder("dolphinscheduler_worker_submit_queue_is_full_count")
.description("worker task submit queue is full count")
Counter.builder("ds.system.worker.full.submit.queue.count")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about remove the system?

Suggested change
Counter.builder("ds.system.worker.full.submit.queue.count")
Counter.builder("ds.worker.full.submit.queue.count")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will remove system, since it might cause some ambiguities with jvm / network level stuff.

@SbloodyS SbloodyS added improvement make more easy to user or prompt friendly backend metrics labels Jun 19, 2022
@codecov-commenter
Copy link

Codecov Report

Merging #10477 (a45b9e6) into dev (2c227ab) will decrease coverage by 0.00%.
The diff coverage is 43.33%.

@@             Coverage Diff              @@
##                dev   #10477      +/-   ##
============================================
- Coverage     40.86%   40.85%   -0.01%     
+ Complexity     4849     4848       -1     
============================================
  Files           886      886              
  Lines         36009    36032      +23     
  Branches       3994     3998       +4     
============================================
+ Hits          14716    14722       +6     
- Misses        19835    19850      +15     
- Partials       1458     1460       +2     
Impacted Files Coverage Δ
...ler/server/master/metrics/MasterServerMetrics.java 0.00% <0.00%> (ø)
...nscheduler/service/process/ProcessServiceImpl.java 30.76% <ø> (ø)
...inscheduler/server/worker/metrics/TaskMetrics.java 0.00% <0.00%> (ø)
...ler/server/worker/metrics/WorkerServerMetrics.java 0.00% <0.00%> (ø)
.../server/worker/processor/TaskExecuteProcessor.java 0.00% <ø> (ø)
...inscheduler/plugin/task/zeppelin/ZeppelinTask.java 48.68% <34.37%> (-7.92%) ⬇️
.../server/master/metrics/ProcessInstanceMetrics.java 62.50% <85.71%> (ø)
...inscheduler/server/master/metrics/TaskMetrics.java 69.44% <100.00%> (ø)
...duler/plugin/task/zeppelin/ZeppelinParameters.java 84.61% <100.00%> (+7.69%) ⬆️
...er/master/dispatch/host/assign/RandomSelector.java 77.77% <0.00%> (-5.56%) ⬇️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 704e633...a45b9e6. Read the comment docs.

Copy link
Member

@kezhenxu94 kezhenxu94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a way to unify the formats of customize metrics as well as default metrics (like jvm_gc_pause_seconds_max, etc.)? Otherwise the defaults still don't follow the micrometer naming convention.

@EricGao888
Copy link
Member Author

EricGao888 commented Jun 20, 2022

Do we have a way to unify the formats of customize metrics as well as default metrics (like jvm_gc_pause_seconds_max, etc.)? Otherwise the defaults still don't follow the micrometer naming convention.

@kezhenxu94 Thanks for pointing this out. Actually those default metrics DO follow the naming convention. If we check http://ip:port/actuator/metrics, we could get the list of metrics' names before mapping. We will see those default metrics as
image. The reason why in this PR it looks different is that this PR is still a draft and WIP. I haven't updated the docs yet. Sorry for the confusion.

@kezhenxu94
Copy link
Member

Do we have a way to unify the formats of customize metrics as well as default metrics (like jvm_gc_pause_seconds_max, etc.)? Otherwise the defaults still don't follow the micrometer naming convention.

@kezhenxu94 Thanks for pointing this out. Actually those default metrics DO follow the naming convention. If we check http://ip:port/actuator/metrics, we could get the list of metrics' names before mapping. We will see those default metrics as image. The reason why in this PR it looks different is that this PR is still a draft and WIP. I haven't updated the docs yet. Sorry for the confusion.

Nice if that's the case! LGTM

@EricGao888
Copy link
Member Author

Do we have a way to unify the formats of customize metrics as well as default metrics (like jvm_gc_pause_seconds_max, etc.)? Otherwise the defaults still don't follow the micrometer naming convention.

@kezhenxu94 Thanks for pointing this out. Actually those default metrics DO follow the naming convention. If we check http://ip:port/actuator/metrics, we could get the list of metrics' names before mapping. We will see those default metrics as image. The reason why in this PR it looks different is that this PR is still a draft and WIP. I haven't updated the docs yet. Sorry for the confusion.

Nice if that's the case! LGTM

@kezhenxu94 Somehow I just couldn't fall asleep last night. So I got up very early this morning, wrote some code and docs, committed and pushed them to find some peace and went to bed again. Haven't finished it yet, lol. I will complete this PR later today. 🤣

@EricGao888 EricGao888 marked this pull request as ready for review June 20, 2022 13:23
at `dolphinscheduler-meter/resources/grafana`, you can directly import these dashboards to grafana.

If you want to try at docker, you can use the following command to start the prometheus with grafana:
- We enable Apache DolphinScheduler export metrics in `standalone` mode to help users get hands dirty easily.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- We enable Apache DolphinScheduler export metrics in `standalone` mode to help users get hands dirty easily.
- We enable Apache DolphinScheduler to export metrics in `standalone` mode to help users get hands dirty easily.

@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

61.8% 61.8% Coverage
0.0% 0.0% Duplication

Copy link
Member

@kezhenxu94 kezhenxu94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@EricGao888 EricGao888 requested a review from ruanwenjun June 21, 2022 06:07
Copy link
Member

@ruanwenjun ruanwenjun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@ruanwenjun ruanwenjun merged commit cc06eaa into apache:dev Jun 21, 2022
hstdream pushed a commit to hstdream/dolphinscheduler that referenced this pull request Jun 23, 2022
…pache#10477)

* Apply micrometer naming convention to worker metrics
* Apply micrometer naming convention all current metrics
* Fix remaining metrics names, update English docs and add Chinese docs
* Fix metrics names in grafana-demo dashboards
@zhongjiajie zhongjiajie added this to the 3.1.0-alpha milestone Jun 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend improvement make more easy to user or prompt friendly metrics
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Improvement][Metrics] Apply micrometer naming convention to metrics
6 participants