Add Enrich Statistics to Elasticsearch Input Plugin #15685

Closed
wp-perc opened this issue Jul 30, 2024 · 5 comments · Fixed by #15688
Assignees
Labels
feature request Requests for new plugin and for new features to existing plugins

Comments

wp-perc commented Jul 30, 2024

Use Case

I need to get statistics about how Elasticsearch Enrichment is performing. This can be done through the Enrich Stats API.
These statistics can be gathered at the cluster level and are reported per coordinator node.

Expected behavior

Here is an example of the API call (which is simple, since it takes no arguments) and an example of the response.

GET /_enrich/_stats

{
  "executing_policies": [],
  "coordinator_stats": [
    {
      "node_id": "RWkDKDRu_aV1fISRA7PIkg",
      "queue_size": 0,
      "remote_requests_current": 0,
      "remote_requests_total": 101636700,
      "executed_searches_total": 102230925
    },
    {
      "node_id": "2BOvel8nrXRjmSMAMBSUp3",
      "queue_size": 0,
      "remote_requests_current": 0,
      "remote_requests_total": 242051423,
      "executed_searches_total": 242752071
    },
    {
      "node_id": "smkOUPQOK1pymt8MCoglZJ",
      "queue_size": 0,
      "remote_requests_current": 0,
      "remote_requests_total": 248009084,
      "executed_searches_total": 248735550
    },
    {
      "node_id": "g5EUAaS-6-z5w27OtGQeTI",
      "queue_size": 0,
      "remote_requests_current": 0,
      "remote_requests_total": 233693129,
      "executed_searches_total": 234476004
    }
  ],
  "cache_stats": [
    {
      "node_id": "RWkDKDRu_aV1fISRA7PIkg",
      "count": 2500,
      "hits": 6044497858,
      "misses": 102230925,
      "evictions": 92663663
    },
    {
      "node_id": "2BOvel8nrXRjmSMAMBSUp3",
      "count": 2500,
      "hits": 14640821136,
      "misses": 242752071,
      "evictions": 226826313
    },
    {
      "node_id": "smkOUPQOK1pymt8MCoglZJ",
      "count": 2500,
      "hits": 14145580115,
      "misses": 248735550,
      "evictions": 233860968
    },
    {
      "node_id": "g5EUAaS-6-z5w27OtGQeTI",
      "count": 2500,
      "hits": 11016000946,
      "misses": 234476004,
      "evictions": 217698127
    }
  ]
}

I expect the statistics from coordinator_stats and cache_stats to be written to two dedicated measurements in InfluxDB. Ideally, each node_id would also be mapped to the corresponding node name.
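The mapping requested above could be sketched in Go roughly as follows. This is only an illustration, not the plugin's actual code: the `point` type, the `parseEnrichStats` function, and both measurement names are hypothetical. The struct fields follow the sample response, and `executing_policies` is left unmapped so that it is ignored when decoding.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// enrichStats mirrors the sample response of GET /_enrich/_stats.
// executing_policies has no field here, so the decoder skips it.
type enrichStats struct {
	CoordinatorStats []struct {
		NodeID                string `json:"node_id"`
		QueueSize             int64  `json:"queue_size"`
		RemoteRequestsCurrent int64  `json:"remote_requests_current"`
		RemoteRequestsTotal   int64  `json:"remote_requests_total"`
		ExecutedSearchesTotal int64  `json:"executed_searches_total"`
	} `json:"coordinator_stats"`
	CacheStats []struct {
		NodeID    string `json:"node_id"`
		Count     int64  `json:"count"`
		Hits      int64  `json:"hits"`
		Misses    int64  `json:"misses"`
		Evictions int64  `json:"evictions"`
	} `json:"cache_stats"`
}

// point is a hypothetical stand-in for one metric: a measurement name,
// tags, and integer fields.
type point struct {
	Measurement string
	Tags        map[string]string
	Fields      map[string]int64
}

// parseEnrichStats turns a response body into one point per node and section.
// The measurement names are assumptions chosen for this sketch.
func parseEnrichStats(body []byte) ([]point, error) {
	var stats enrichStats
	if err := json.Unmarshal(body, &stats); err != nil {
		return nil, fmt.Errorf("decoding enrich stats: %w", err)
	}
	points := make([]point, 0, len(stats.CoordinatorStats)+len(stats.CacheStats))
	for _, c := range stats.CoordinatorStats {
		points = append(points, point{
			Measurement: "enrich_stats_coordinator",
			Tags:        map[string]string{"node_id": c.NodeID},
			Fields: map[string]int64{
				"queue_size":              c.QueueSize,
				"remote_requests_current": c.RemoteRequestsCurrent,
				"remote_requests_total":   c.RemoteRequestsTotal,
				"executed_searches_total": c.ExecutedSearchesTotal,
			},
		})
	}
	for _, c := range stats.CacheStats {
		points = append(points, point{
			Measurement: "enrich_stats_cache",
			Tags:        map[string]string{"node_id": c.NodeID},
			Fields: map[string]int64{
				"count":     c.Count,
				"hits":      c.Hits,
				"misses":    c.Misses,
				"evictions": c.Evictions,
			},
		})
	}
	return points, nil
}

func main() {
	// Trimmed copy of the sample response above (first node only).
	body := []byte(`{
	  "executing_policies": [],
	  "coordinator_stats": [{"node_id": "RWkDKDRu_aV1fISRA7PIkg", "queue_size": 0,
	    "remote_requests_current": 0, "remote_requests_total": 101636700,
	    "executed_searches_total": 102230925}],
	  "cache_stats": [{"node_id": "RWkDKDRu_aV1fISRA7PIkg", "count": 2500,
	    "hits": 6044497858, "misses": 102230925, "evictions": 92663663}]
	}`)
	points, err := parseEnrichStats(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range points {
		fmt.Println(p.Measurement, p.Tags["node_id"], p.Fields)
	}
}
```

The counters are decoded as `int64` because values such as `hits` in the sample exceed the 32-bit range.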

Actual behavior

None of these statistics are currently collected.

Additional info

The response from the Enrich stats API is documented at https://www.elastic.co/guide/en/elasticsearch/reference/current/enrich-stats-api.html#enrich-stats-api-response-body. The contents of executing_policies should be ignored.

@wp-perc wp-perc added the feature request Requests for new plugin and for new features to existing plugins label Jul 30, 2024
powersj added a commit to powersj/telegraf that referenced this issue Jul 30, 2024
Contributor

powersj commented Jul 30, 2024

@wp-perc,

Should this be captured only on the master node or all nodes?

I have put up #15688, which creates two new measurements to capture the above data, currently from every node. Let me know if this is what you expect. About 20-30 minutes after this message, please download the artifacts and try them out.

Thanks!

@powersj powersj added the waiting for response waiting for response from contributor label Jul 30, 2024
@powersj powersj self-assigned this Jul 30, 2024
Author

wp-perc commented Jul 31, 2024

I think it can be captured on Master only: this API returns statistics for all coordinator nodes in the cluster, so getting them from the master is enough.
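A master-only check along these lines could be sketched in Go as follows. It assumes the elected master is looked up through `GET /_cat/master`, whose plain-text response starts with the master's node ID, and that the local node ID is already known. The helper names, the host, the IP address, and the node name in the sample line are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// masterNodeID extracts the node ID from the plain-text body of
// GET /_cat/master, assuming the default column order: id host ip node.
func masterNodeID(catMaster string) (string, error) {
	fields := strings.Fields(catMaster)
	if len(fields) == 0 {
		return "", errors.New("empty _cat/master response")
	}
	return fields[0], nil
}

// shouldGatherEnrichStats is a hypothetical helper: it reports whether the
// local node is the elected master and should therefore query the
// cluster-wide enrich statistics.
func shouldGatherEnrichStats(localNodeID, catMaster string) (bool, error) {
	id, err := masterNodeID(catMaster)
	if err != nil {
		return false, err
	}
	return id == localNodeID, nil
}

func main() {
	// Made-up _cat/master line that reuses a node ID from the sample response.
	cat := "RWkDKDRu_aV1fISRA7PIkg 10.0.0.1 10.0.0.1 es-node-1\n"
	ok, err := shouldGatherEnrichStats("RWkDKDRu_aV1fISRA7PIkg", cat)
	fmt.Println(ok, err)
}
```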

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Jul 31, 2024
Author

wp-perc commented Jul 31, 2024

I tested the build, but I get an error whenever I set enrich_stats to true:

Jul 31 11:59:05 perc-nepdevel.test telegraf[3822404]: 2024-07-31T09:59:05Z E! [inputs.elasticsearch] Error in plugin: elasticsearch: API responded with status-code 405, expected 200

Unfortunately, I am not able to see the URL that Telegraf is trying to query, so I looked into the code.
If I understand it correctly, there is a mismatch in the URL:

if err := e.gatherEnrichStats(s+"/_enrich/stats", acc); err != nil {

Here, telegraf calls /_enrich/stats, but the right endpoint is /_enrich/_stats.
Can you fix it?
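The effect of the missing underscore can be reproduced without a cluster. The Go sketch below uses a local stub server, not Elasticsearch: it is set up, purely as an assumption for illustration, to answer 200 on `/_enrich/_stats` and 405 on every other path, mimicking the status code in the log line above. The names `newStubServer` and `statusFor` are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newStubServer returns a test server that answers 200 only on
// /_enrich/_stats and 405 on every other path. This imitates the observed
// behavior; it is not how Elasticsearch routes requests.
func newStubServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/_enrich/_stats", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"executing_policies":[],"coordinator_stats":[],"cache_stats":[]}`)
	})
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusMethodNotAllowed)
	})
	return httptest.NewServer(mux)
}

// statusFor issues a GET against base+path and returns the HTTP status code.
func statusFor(base, path string) (int, error) {
	resp, err := http.Get(base + path)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	srv := newStubServer()
	defer srv.Close()

	// Compare the path used by the build under test with the documented one.
	for _, path := range []string{"/_enrich/stats", "/_enrich/_stats"} {
		code, err := statusFor(srv.URL, path)
		if err != nil {
			fmt.Println("request failed:", err)
			continue
		}
		fmt.Printf("GET %s -> %d\n", path, code)
	}
}
```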

Contributor

powersj commented Jul 31, 2024

> Here, telegraf calls /_enrich/stats, but the right endpoint is /_enrich/_stats.
> Can you fix it?

Thank you very much for trying this out. I have pushed an update, and new artifacts will be available in 20-30 minutes. Could you please give the new set a try?

Thanks again!

Author

wp-perc commented Aug 1, 2024

I just realized that I replied on the pull request instead of here.
For anyone who finds this issue later, it is worth repeating:

It works now. Thanks for the development :)
