Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Prometheus instrumentation for ingest-file workers #550

Merged
merged 9 commits into from
Nov 22, 2023

Conversation

tillprochaska
Copy link
Contributor

@tillprochaska tillprochaska commented Nov 13, 2023

This PR adds instrumentation for ingest-file workers:

In order to test these changes with a local Aleph instance:

  • Build the ingest-file image using docker build -t ghcr.io/alephdata/ingest-file:metrics ..

  • In you Aleph docker-compose.dev.yml, set the ingest-file image tag to metrics.

  • In you Aleph docker-compose.dev.yml, set the PROMETHEUS_ENABLED=true environment variable for the ingest-file service.

  • Run docker compose -f docker-compose.dev.yml up to apply the changes.

  • Run docker compose exec ingest-file curl http://localhost:9090/metrics.

@tillprochaska tillprochaska mentioned this pull request Nov 13, 2023
50 tasks
* I’m not entirely sure, but there might be cases where `file_path` is a directory, i.e. `file_size` would be `None` in that case.
* The code didn’t cover the case where the entity already has a `fileSize` property. In this case, `file_size` would be `None`, even though `file_path` is a file. I’m not entirely sure when this happens, probably when reingesting an existing file.

Both cases were catched by tests :)
@tillprochaska tillprochaska marked this pull request as ready for review November 14, 2023 18:31
@tillprochaska tillprochaska requested review from catileptic and stchris and removed request for catileptic November 14, 2023 18:31
Even though it is considered an anti-pattern to add a prefix with the name of the software or component to metrics (according to the official Prometheus documentation), I have decided to add a prefix. I’ve found that this makes it much easier to find relevant metrics. The main disadvantage of per-component prefixes queries become slightly more complex if you want to query the same metric (e.g. HTTP request duration) across multiple components. This isn’t super important in our case though, so I think the trade-off is acceptable.
ingestors/manager.py Outdated Show resolved Hide resolved
Copy link
Contributor

@stchris stchris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small comment to make it a bit more concise, otherwise it looks great

ingestors/manager.py Outdated Show resolved Hide resolved
Copy link
Contributor

@stchris stchris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks!

@tillprochaska tillprochaska added this pull request to the merge queue Nov 22, 2023
Merged via the queue into main with commit 768978c Nov 22, 2023
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants