add invalid captcha and messageChannel sync status health monitoring #10029

Open · wants to merge 6 commits into main
Conversation

@etiennejouan (Contributor) commented Feb 5, 2025

Context:
We want to implement counters to monitor server health. The first counters will track the messageChannel sync status during job execution and invalid captcha attempts.

How:
Counters are stored in the cache and grouped into one-minute windows. A controller is created for each metric, aggregating its counters over a five-minute window.
The endpoints are public and will be queried by Prometheus.
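
For illustration, here is a minimal sketch of how such cache-backed, time-windowed counters could look, assuming a simple async cache abstraction; the class and method names below are hypothetical, not the PR's actual implementation:

```typescript
// Hypothetical sketch: increment a per-minute counter key and aggregate the
// last five one-minute windows. Names (HealthCacheService, cacheStorage) are
// illustrative, not taken from this PR.
export class HealthCacheService {
  constructor(
    private readonly cacheStorage: {
      get<T>(key: string): Promise<T | undefined>;
      set<T>(key: string, value: T): Promise<void>;
    },
  ) {}

  private minuteKey(metric: string, date = new Date()): string {
    // Bucket by minute, e.g. "invalid-captcha:2025-02-05T10:42"
    return `${metric}:${date.toISOString().slice(0, 16)}`;
  }

  async increment(metric: string): Promise<void> {
    const key = this.minuteKey(metric);
    const current = (await this.cacheStorage.get<number>(key)) ?? 0;
    await this.cacheStorage.set(key, current + 1);
  }

  async sumLastFiveMinutes(metric: string): Promise<number> {
    const now = Date.now();
    const keys = Array.from({ length: 5 }, (_, i) =>
      this.minuteKey(metric, new Date(now - i * 60_000)),
    );
    const values = await Promise.all(
      keys.map((key) => this.cacheStorage.get<number>(key)),
    );
    return values.reduce((sum: number, value) => sum + (value ?? 0), 0);
  }
}
```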

closes twentyhq/core-team-issues#55

@greptile-apps (bot) left a comment

PR Summary

This PR implements health monitoring for message channel sync status and invalid captcha attempts, with metrics exposed for Prometheus integration.

  • Added HealthCacheService with time-windowed counters (5-minute aggregation window) for tracking message channel sync jobs and invalid captcha attempts
  • Added new endpoints /healthz/message-channel-sync-job-by-status and /healthz/invalid-captcha in HealthController for Prometheus metrics (a hedged sketch of one such endpoint follows this list)
  • Integrated health monitoring in MessageChannelSyncStatusService to track sync status transitions (ONGOING, ACTIVE, FAILED_UNKNOWN, FAILED_INSUFFICIENT_PERMISSIONS)
  • Added EngineHealth namespace to CacheStorageNamespace enum with 10-minute TTL for health metrics storage
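
As an illustration of the endpoint shape: the /healthz route below matches the PR description, but the controller body and the injected service interface are assumptions (NestJS-style controller plus the cache service sketched above), not the PR's actual code.

```typescript
import { Controller, Get } from '@nestjs/common';

// Hypothetical sketch of a Prometheus-scrapeable endpoint.
@Controller('healthz')
export class InvalidCaptchaHealthController {
  constructor(
    private readonly healthCacheService: {
      sumLastFiveMinutes(metric: string): Promise<number>;
    },
  ) {}

  @Get('invalid-captcha')
  async getInvalidCaptcha(): Promise<string> {
    const count = await this.healthCacheService.sumLastFiveMinutes(
      'invalid-captcha',
    );

    // Prometheus text exposition format: HELP/TYPE lines followed by the sample.
    return [
      '# HELP invalid_captcha_total Invalid captcha attempts over the last 5 minutes',
      '# TYPE invalid_captcha_total gauge',
      `invalid_captcha_total ${count}`,
    ].join('\n');
  }
}
```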

12 files reviewed, 6 comments

@ijreilly (Collaborator) left a comment

Overall this seems good to me, but I wonder whether a counter is the right strategy: is our traffic stable enough to judge absolute numbers (for instance, the absolute number of invalid captchas)?
What will Prometheus do with these numbers? Compare them?

@ijreilly (Collaborator) left a comment

Great!! Looks cool.

@FelixMalfait (Member) commented

This is great! We can merge it like this, but FYI we probably could have done something more efficient if it were more tightly coupled with Redis (we have an abstraction layer, but since we decided to get rid of the other implementations we might as well couple it now...).

We don't need to optimize for a high-throughput scenario, but at very high volume the numbers here could be wrong: you do a read and then a write, so with multiple Node workers running you could have two concurrent read/write operations on the same counter and lose an increment.

Redis reference for increments:
https://www.tutorialspoint.com/redis/hashes_hincrby.htm
(HINCRBY is atomic, so there's no concurrency issue: https://stackoverflow.com/questions/38954590/is-redis-hincrby-atomic)
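
For reference, a minimal sketch of that atomic increment, assuming ioredis (the key and field names are made up):

```typescript
import Redis from 'ioredis';

// HINCRBY performs the increment inside Redis in a single command, so
// concurrent workers cannot lose an update the way a cache read followed
// by a write can.
const redis = new Redis();

export async function incrementInvalidCaptcha(): Promise<void> {
  // One hash field per one-minute bucket, e.g. "2025-02-05T10:42" (illustrative key).
  const minuteBucket = new Date().toISOString().slice(0, 16);
  await redis.hincrby('health:invalid-captcha', minuteBucket, 1);
}
```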

And with a different implementation we could have used sorted sets with each timestamp as an entry:
https://www.w3resource.com/redis/redis-zrangebyscore-key-min-max.php
(and https://www.w3resource.com/redis/redis-zremrangebyscore-key-min-max.php to clean up automatically)
This would give more flexibility for charting but has other downsides.
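
A hedged sketch of that sorted-set variant (again assuming ioredis; key and member formats are illustrative):

```typescript
import Redis from 'ioredis';

// Each event is stored with its timestamp as the score, so any time range
// can be counted, charted, or trimmed.
const redis = new Redis();
const KEY = 'health:invalid-captcha:events';

export async function recordInvalidCaptcha(): Promise<void> {
  const now = Date.now();
  // Unique member per event; the score is the timestamp used for range queries.
  await redis.zadd(KEY, now, `${now}:${Math.random().toString(36).slice(2)}`);
}

export async function countLastFiveMinutes(): Promise<number> {
  const now = Date.now();
  // ZREMRANGEBYSCORE trims entries older than ten minutes to keep the set bounded.
  await redis.zremrangebyscore(KEY, 0, now - 10 * 60_000);
  // ZCOUNT returns how many events fall inside the five-minute window.
  return redis.zcount(KEY, now - 5 * 60_000, now);
}
```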

In any case, it doesn't really matter, because we'd still get the right order of magnitude even if there were a concurrency issue! Great work!!! We should definitely display all of these metrics in the Server Admin Panel.

Successfully merging this pull request may close this issue: Update monitoring for messaging
4 participants