feat(core): Add Prometheus metrics for n8n events and API invocations (experimental) #5177

Merged (3 commits) on Jan 19, 2023

@flipswitchingmonkey (Contributor) commented Jan 17, 2023

This PR adds more configurable metrics, in Prometheus exposition format, to the /metrics endpoint.

The metric groups and labels to expose can be configured, to some extent, via environment variables.

The example output of the /metrics endpoint below was generated with the following environment variables set:

Environment variables:

N8N_METRICS=true
N8N_METRICS_PREFIX=n8n_
N8N_METRICS_INCLUDE_DEFAULT_METRICS=true
N8N_METRICS_INCLUDE_WORKFLOW_ID_LABEL=true
N8N_METRICS_INCLUDE_NODE_TYPE_LABEL=true
N8N_METRICS_INCLUDE_CREDENTIAL_TYPE_LABEL=true
N8N_METRICS_INCLUDE_API_ENDPOINTS=true
N8N_METRICS_INCLUDE_API_PATH_LABEL=true
N8N_METRICS_INCLUDE_API_METHOD_LABEL=true
N8N_METRICS_INCLUDE_API_STATUS_CODE_LABEL=true
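
A minimal sketch of how boolean flags like the ones above could be read into a typed options object. The `MetricsOptions` shape, the `readFlag` and `readMetricsOptions` helpers, and the fallback values are all hypothetical and only illustrate the idea; they are not taken from n8n's actual config schema.

```typescript
// Hypothetical option shape for illustration; not n8n's actual config schema.
interface MetricsOptions {
  enabled: boolean;
  prefix: string;
  includeDefaultMetrics: boolean;
  includeApiEndpoints: boolean;
  labels: {
    workflowId: boolean;
    nodeType: boolean;
    credentialType: boolean;
    apiPath: boolean;
    apiMethod: boolean;
    apiStatusCode: boolean;
  };
}

type Env = { [name: string]: string | undefined };

// Only the string "true" (in any case) enables a flag.
// Assumption: a flag that is not set counts as false.
function readFlag(env: Env, name: string): boolean {
  const raw = env[name];
  return raw !== undefined && raw.trim().toLowerCase() === 'true';
}

function readMetricsOptions(env: Env): MetricsOptions {
  const prefix = env['N8N_METRICS_PREFIX'];
  return {
    enabled: readFlag(env, 'N8N_METRICS'),
    // Assumption: fall back to "n8n_" when no prefix is set.
    prefix: prefix !== undefined ? prefix : 'n8n_',
    includeDefaultMetrics: readFlag(env, 'N8N_METRICS_INCLUDE_DEFAULT_METRICS'),
    includeApiEndpoints: readFlag(env, 'N8N_METRICS_INCLUDE_API_ENDPOINTS'),
    labels: {
      workflowId: readFlag(env, 'N8N_METRICS_INCLUDE_WORKFLOW_ID_LABEL'),
      nodeType: readFlag(env, 'N8N_METRICS_INCLUDE_NODE_TYPE_LABEL'),
      credentialType: readFlag(env, 'N8N_METRICS_INCLUDE_CREDENTIAL_TYPE_LABEL'),
      apiPath: readFlag(env, 'N8N_METRICS_INCLUDE_API_PATH_LABEL'),
      apiMethod: readFlag(env, 'N8N_METRICS_INCLUDE_API_METHOD_LABEL'),
      apiStatusCode: readFlag(env, 'N8N_METRICS_INCLUDE_API_STATUS_CODE_LABEL'),
    },
  };
}

// Example with a subset of the variables listed above.
const exampleOptions = readMetricsOptions({
  N8N_METRICS: 'true',
  N8N_METRICS_PREFIX: 'n8n_',
  N8N_METRICS_INCLUDE_WORKFLOW_ID_LABEL: 'true',
});
```

In a Node.js process the same function could be called with `process.env`. Keeping the environment as a parameter makes the sketch easy to test without touching the real environment.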

Example output of GET /metrics

# HELP process_cpu_user_seconds_total Total user CPU time spent in seconds.
# TYPE process_cpu_user_seconds_total counter
process_cpu_user_seconds_total 1.611351

# HELP process_cpu_system_seconds_total Total system CPU time spent in seconds.
# TYPE process_cpu_system_seconds_total counter
process_cpu_system_seconds_total 0.261071

# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 1.872422

# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1674121593

# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 257355776

# HELP nodejs_eventloop_lag_seconds Lag of event loop in seconds.
# TYPE nodejs_eventloop_lag_seconds gauge
nodejs_eventloop_lag_seconds 0.005210362

# HELP nodejs_eventloop_lag_min_seconds The minimum recorded event loop delay.
# TYPE nodejs_eventloop_lag_min_seconds gauge
nodejs_eventloop_lag_min_seconds 0.00868352

# HELP nodejs_eventloop_lag_max_seconds The maximum recorded event loop delay.
# TYPE nodejs_eventloop_lag_max_seconds gauge
nodejs_eventloop_lag_max_seconds 0.460849151

# HELP nodejs_eventloop_lag_mean_seconds The mean of the recorded event loop delays.
# TYPE nodejs_eventloop_lag_mean_seconds gauge
nodejs_eventloop_lag_mean_seconds 0.011563190651685392

# HELP nodejs_eventloop_lag_stddev_seconds The standard deviation of the recorded event loop delays.
# TYPE nodejs_eventloop_lag_stddev_seconds gauge
nodejs_eventloop_lag_stddev_seconds 0.006139661652063662

# HELP nodejs_eventloop_lag_p50_seconds The 50th percentile of the recorded event loop delays.
# TYPE nodejs_eventloop_lag_p50_seconds gauge
nodejs_eventloop_lag_p50_seconds 0.012001279

# HELP nodejs_eventloop_lag_p90_seconds The 90th percentile of the recorded event loop delays.
# TYPE nodejs_eventloop_lag_p90_seconds gauge
nodejs_eventloop_lag_p90_seconds 0.012148735

# HELP nodejs_eventloop_lag_p99_seconds The 99th percentile of the recorded event loop delays.
# TYPE nodejs_eventloop_lag_p99_seconds gauge
nodejs_eventloop_lag_p99_seconds 0.012197887

# HELP nodejs_active_handles Number of active libuv handles grouped by handle type. Every handle type is C++ class name.
# TYPE nodejs_active_handles gauge
nodejs_active_handles{type="WriteStream"} 2
nodejs_active_handles{type="ReadStream"} 1
nodejs_active_handles{type="Server"} 1
nodejs_active_handles{type="Socket"} 2

# HELP nodejs_active_handles_total Total number of active handles.
# TYPE nodejs_active_handles_total gauge
nodejs_active_handles_total 6

# HELP nodejs_active_requests Number of active libuv requests grouped by request type. Every request type is C++ class name.
# TYPE nodejs_active_requests gauge

# HELP nodejs_active_requests_total Total number of active requests.
# TYPE nodejs_active_requests_total gauge
nodejs_active_requests_total 0

# HELP nodejs_heap_size_total_bytes Process heap size from Node.js in bytes.
# TYPE nodejs_heap_size_total_bytes gauge
nodejs_heap_size_total_bytes 137408512

# HELP nodejs_heap_size_used_bytes Process heap size used from Node.js in bytes.
# TYPE nodejs_heap_size_used_bytes gauge
nodejs_heap_size_used_bytes 129931048

# HELP nodejs_external_memory_bytes Node.js external memory size in bytes.
# TYPE nodejs_external_memory_bytes gauge
nodejs_external_memory_bytes 2207464

# HELP nodejs_heap_space_size_total_bytes Process heap space size total from Node.js in bytes.
# TYPE nodejs_heap_space_size_total_bytes gauge
nodejs_heap_space_size_total_bytes{space="read_only"} 176128
nodejs_heap_space_size_total_bytes{space="old"} 106844160
nodejs_heap_space_size_total_bytes{space="code"} 3776512
nodejs_heap_space_size_total_bytes{space="map"} 4202496
nodejs_heap_space_size_total_bytes{space="large_object"} 20779008
nodejs_heap_space_size_total_bytes{space="code_large_object"} 581632
nodejs_heap_space_size_total_bytes{space="new_large_object"} 0
nodejs_heap_space_size_total_bytes{space="new"} 1048576

# HELP nodejs_heap_space_size_used_bytes Process heap space size used from Node.js in bytes.
# TYPE nodejs_heap_space_size_used_bytes gauge
nodejs_heap_space_size_used_bytes{space="read_only"} 170944
nodejs_heap_space_size_used_bytes{space="old"} 101076768
nodejs_heap_space_size_used_bytes{space="code"} 3421824
nodejs_heap_space_size_used_bytes{space="map"} 3306456
nodejs_heap_space_size_used_bytes{space="large_object"} 20534376
nodejs_heap_space_size_used_bytes{space="code_large_object"} 542048
nodejs_heap_space_size_used_bytes{space="new_large_object"} 0
nodejs_heap_space_size_used_bytes{space="new"} 887696

# HELP nodejs_heap_space_size_available_bytes Process heap space size available from Node.js in bytes.
# TYPE nodejs_heap_space_size_available_bytes gauge
nodejs_heap_space_size_available_bytes{space="read_only"} 0
nodejs_heap_space_size_available_bytes{space="old"} 3857736
nodejs_heap_space_size_available_bytes{space="code"} 108928
nodejs_heap_space_size_available_bytes{space="map"} 821552
nodejs_heap_space_size_available_bytes{space="large_object"} 0
nodejs_heap_space_size_available_bytes{space="code_large_object"} 0
nodejs_heap_space_size_available_bytes{space="new_large_object"} 1031072
nodejs_heap_space_size_available_bytes{space="new"} 143376

# HELP nodejs_version_info Node.js version info.
# TYPE nodejs_version_info gauge
nodejs_version_info{version="v16.15.0",major="16",minor="15",patch="0"} 1

# HELP nodejs_gc_duration_seconds Garbage collection duration by kind, one of major, minor, incremental or weakcb.
# TYPE nodejs_gc_duration_seconds histogram
nodejs_gc_duration_seconds_bucket{le="0.001",kind="minor"} 8
nodejs_gc_duration_seconds_bucket{le="0.01",kind="minor"} 10
nodejs_gc_duration_seconds_bucket{le="0.1",kind="minor"} 10
nodejs_gc_duration_seconds_bucket{le="1",kind="minor"} 10
nodejs_gc_duration_seconds_bucket{le="2",kind="minor"} 10
nodejs_gc_duration_seconds_bucket{le="5",kind="minor"} 10
nodejs_gc_duration_seconds_bucket{le="+Inf",kind="minor"} 10
nodejs_gc_duration_seconds_sum{kind="minor"} 0.011158959000371397
nodejs_gc_duration_seconds_count{kind="minor"} 10
nodejs_gc_duration_seconds_bucket{le="0.001",kind="incremental"} 5
nodejs_gc_duration_seconds_bucket{le="0.01",kind="incremental"} 6
nodejs_gc_duration_seconds_bucket{le="0.1",kind="incremental"} 6
nodejs_gc_duration_seconds_bucket{le="1",kind="incremental"} 6
nodejs_gc_duration_seconds_bucket{le="2",kind="incremental"} 6
nodejs_gc_duration_seconds_bucket{le="5",kind="incremental"} 6
nodejs_gc_duration_seconds_bucket{le="+Inf",kind="incremental"} 6
nodejs_gc_duration_seconds_sum{kind="incremental"} 0.002568940999917686
nodejs_gc_duration_seconds_count{kind="incremental"} 6
nodejs_gc_duration_seconds_bucket{le="0.001",kind="major"} 0
nodejs_gc_duration_seconds_bucket{le="0.01",kind="major"} 1
nodejs_gc_duration_seconds_bucket{le="0.1",kind="major"} 3
nodejs_gc_duration_seconds_bucket{le="1",kind="major"} 3
nodejs_gc_duration_seconds_bucket{le="2",kind="major"} 3
nodejs_gc_duration_seconds_bucket{le="5",kind="major"} 3
nodejs_gc_duration_seconds_bucket{le="+Inf",kind="major"} 3
nodejs_gc_duration_seconds_sum{kind="major"} 0.03339293400011956
nodejs_gc_duration_seconds_count{kind="major"} 3
nodejs_gc_duration_seconds_bucket{le="0.001",kind="weakcb"} 4
nodejs_gc_duration_seconds_bucket{le="0.01",kind="weakcb"} 4
nodejs_gc_duration_seconds_bucket{le="0.1",kind="weakcb"} 4
nodejs_gc_duration_seconds_bucket{le="1",kind="weakcb"} 4
nodejs_gc_duration_seconds_bucket{le="2",kind="weakcb"} 4
nodejs_gc_duration_seconds_bucket{le="5",kind="weakcb"} 4
nodejs_gc_duration_seconds_bucket{le="+Inf",kind="weakcb"} 4
nodejs_gc_duration_seconds_sum{kind="weakcb"} 0.000053920999867841604
nodejs_gc_duration_seconds_count{kind="weakcb"} 4

# HELP n8n_version_info n8n version info.
# TYPE n8n_version_info gauge
n8n_version_info{version="v0.211.1",major="0",minor="211",patch="1"} 1

# HELP http_request_duration_seconds duration histogram of http responses labeled with: status_code, method, path
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.003",status_code="401",method="POST",path="/api/v1/audit"} 0
http_request_duration_seconds_bucket{le="0.03",status_code="401",method="POST",path="/api/v1/audit"} 1
http_request_duration_seconds_bucket{le="0.1",status_code="401",method="POST",path="/api/v1/audit"} 1
http_request_duration_seconds_bucket{le="0.3",status_code="401",method="POST",path="/api/v1/audit"} 1
http_request_duration_seconds_bucket{le="1.5",status_code="401",method="POST",path="/api/v1/audit"} 1
http_request_duration_seconds_bucket{le="10",status_code="401",method="POST",path="/api/v1/audit"} 1
http_request_duration_seconds_bucket{le="+Inf",status_code="401",method="POST",path="/api/v1/audit"} 1
http_request_duration_seconds_sum{status_code="401",method="POST",path="/api/v1/audit"} 0.009538856
http_request_duration_seconds_count{status_code="401",method="POST",path="/api/v1/audit"} 1
http_request_duration_seconds_bucket{le="0.003",status_code="401",method="POST",path="/api/v1/workflows"} 1
http_request_duration_seconds_bucket{le="0.03",status_code="401",method="POST",path="/api/v1/workflows"} 1
http_request_duration_seconds_bucket{le="0.1",status_code="401",method="POST",path="/api/v1/workflows"} 1
http_request_duration_seconds_bucket{le="0.3",status_code="401",method="POST",path="/api/v1/workflows"} 1
http_request_duration_seconds_bucket{le="1.5",status_code="401",method="POST",path="/api/v1/workflows"} 1
http_request_duration_seconds_bucket{le="10",status_code="401",method="POST",path="/api/v1/workflows"} 1
http_request_duration_seconds_bucket{le="+Inf",status_code="401",method="POST",path="/api/v1/workflows"} 1
http_request_duration_seconds_sum{status_code="401",method="POST",path="/api/v1/workflows"} 0.001645051
http_request_duration_seconds_count{status_code="401",method="POST",path="/api/v1/workflows"} 1
http_request_duration_seconds_bucket{le="0.003",status_code="200",method="POST",path="/rest/workflows/run"} 0
http_request_duration_seconds_bucket{le="0.03",status_code="200",method="POST",path="/rest/workflows/run"} 1
http_request_duration_seconds_bucket{le="0.1",status_code="200",method="POST",path="/rest/workflows/run"} 2
http_request_duration_seconds_bucket{le="0.3",status_code="200",method="POST",path="/rest/workflows/run"} 2
http_request_duration_seconds_bucket{le="1.5",status_code="200",method="POST",path="/rest/workflows/run"} 2
http_request_duration_seconds_bucket{le="10",status_code="200",method="POST",path="/rest/workflows/run"} 2
http_request_duration_seconds_bucket{le="+Inf",status_code="200",method="POST",path="/rest/workflows/run"} 2
http_request_duration_seconds_sum{status_code="200",method="POST",path="/rest/workflows/run"} 0.071720752
http_request_duration_seconds_count{status_code="200",method="POST",path="/rest/workflows/run"} 2

# HELP n8n_workflow_started_total Total number of n8n.workflow.started events.
# TYPE n8n_workflow_started_total counter
n8n_workflow_started_total{workflow_id="1"} 2

# HELP n8n_node_started_total Total number of n8n.node.started events.
# TYPE n8n_node_started_total counter
n8n_node_started_total{node_type="base_start"} 2
n8n_node_started_total{node_type="base_set"} 2
n8n_node_started_total{node_type="base_code"} 2

# HELP n8n_node_finished_total Total number of n8n.node.finished events.
# TYPE n8n_node_finished_total counter
n8n_node_finished_total{node_type="base_start"} 2
n8n_node_finished_total{node_type="base_set"} 2
n8n_node_finished_total{node_type="base_code"} 2

# HELP n8n_workflow_success_total Total number of n8n.workflow.success events.
# TYPE n8n_workflow_success_total counter
n8n_workflow_success_total{workflow_id="1"} 2
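
For readers who want to consume this output programmatically, here is a rough sketch of a parser for the subset of the exposition format shown above, plus a helper that derives the mean request duration from a histogram's `_sum` and `_count` series. The names `parseSamples`, `findValue`, and `meanObservation` are made up for this example. The parser ignores timestamps, escaped quotes in label values, and special values such as `+Inf`, so it is not a complete implementation of the format.

```typescript
interface Sample {
  name: string;
  labels: { [key: string]: string };
  value: number;
}

// Parses lines of the form `name{label="value",...} number`.
// HELP/TYPE comment lines and blank lines are skipped.
function parseSamples(text: string): Sample[] {
  const samples: Sample[] = [];
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    if (line === '' || line.charAt(0) === '#') continue;
    const match = /^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$/.exec(line);
    if (match === null) continue; // not a sample line this sketch understands
    const labels: { [key: string]: string } = {};
    const labelPattern = /([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"/g;
    let pair: RegExpExecArray | null = labelPattern.exec(match[2] || '');
    while (pair !== null) {
      labels[pair[1]] = pair[2];
      pair = labelPattern.exec(match[2] || '');
    }
    samples.push({ name: match[1], labels: labels, value: Number(match[3]) });
  }
  return samples;
}

// Returns the value of the first sample with this name whose labels
// include every key/value pair in `labels`.
function findValue(
  samples: Sample[],
  name: string,
  labels: { [key: string]: string },
): number | undefined {
  for (const sample of samples) {
    if (sample.name !== name) continue;
    let allMatch = true;
    for (const key of Object.keys(labels)) {
      if (sample.labels[key] !== labels[key]) allMatch = false;
    }
    if (allMatch) return sample.value;
  }
  return undefined;
}

// Mean observation of a histogram series: `_sum` divided by `_count`.
function meanObservation(
  samples: Sample[],
  histogram: string,
  labels: { [key: string]: string },
): number | undefined {
  const sum = findValue(samples, histogram + '_sum', labels);
  const count = findValue(samples, histogram + '_count', labels);
  if (sum === undefined || count === undefined || count === 0) return undefined;
  return sum / count;
}

// Two lines copied from the example output above.
const exampleSamples = parseSamples(
  [
    'http_request_duration_seconds_sum{status_code="200",method="POST",path="/rest/workflows/run"} 0.071720752',
    'http_request_duration_seconds_count{status_code="200",method="POST",path="/rest/workflows/run"} 2',
  ].join('\n'),
);
const meanRunSeconds = meanObservation(exampleSamples, 'http_request_duration_seconds', {
  path: '/rest/workflows/run',
});
```

With the two sample lines above, `meanRunSeconds` comes out at roughly 0.036 seconds per request. Note that the `_bucket` series are cumulative: each `le` bucket counts all observations at or below that bound, which is why the counts never decrease as `le` grows.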

@n8n-assistant (bot) added the labels "core" (Enhancement outside /nodes-base and /editor-ui) and "n8n team" (Authored by the n8n team) on Jan 17, 2023
@csuermann changed the title from "feat(core): Creates Prometheus metric counters from events" to "feat(core): Create Prometheus metric counters from events" on Jan 17, 2023
* refactor(core): Add Prometheus labels to relevant metrics

* feat(core): Add more Prometheus metrics (experimental)
@csuermann previously approved these changes on Jan 19, 2023
packages/cli/src/config/schema.ts: 3 review threads (outdated, resolved)
@csuermann changed the title from "feat(core): Create Prometheus metric counters from events" to "feat(core): Add Prometheus metrics for n8n events and api invocations (experimental)" on Jan 19, 2023
@csuermann merged commit 9b032d6 into master on Jan 19, 2023
@csuermann deleted the ENG-25-adds-Prometheus-event-counter branch on January 19, 2023 11:11
@n8n-assistant (bot) added the label "Upcoming Release" (Will be part of the upcoming release) on Jan 19, 2023
@csuermann changed the title from "feat(core): Add Prometheus metrics for n8n events and api invocations (experimental)" to "feat(core): Add Prometheus metrics for n8n events and API invocations (experimental)" on Jan 19, 2023
@janober (Member) commented Jan 19, 2023

Got released with n8n@0.212.0

@janober removed the label "Upcoming Release" (Will be part of the upcoming release) on Jan 19, 2023
@romubaronvp

Hey 👋
Small typo: the following variable is set twice in "Environment variables":
N8N_METRICS_INCLUDE_API_METHOD_LABEL=true

MiloradFilipovic added a commit that referenced this pull request Jan 23, 2023
* master:
  fix(editor): Making parameter input components label configurable (#5195)
  feat(Google Analytics Node): Overhaul for google analytics node
  fix(Linear Node): Fix issue with single item not being returned (#5193)
  refactor: Update Notion nodes to remove beta from name (#4838)
  refactor(editor): Decouple REST calls from views (no-changelog) (#5202)
  test: Skip some syslog tests (no-changelog) (#5206)
  fix(Notion (Beta) Node): Fix create database page fails if relation param is empty/undefined (#5182)
  fix(core): Fix url in error handelling for the error Trigger (#5201)
  📚 Update CHANGELOG.md and main package.json to 0.212.0
  🔖 Release n8n@0.212.0
  ⬆️ Set n8n-editor-ui@0.178.0 and n8n-nodes-base@0.210.0 on n8n
  🔖 Release n8n-editor-ui@0.178.0
  🔖 Release n8n-nodes-base@0.210.0
  fix(core): Revert rule @typescript-eslint/prefer-nullish-coalescing
  feat(core): Add Prometheus metrics for n8n events and api invocations (experimental) (#5177)