fix: health-checks for Elasticsearch #13322

ebuildy · 2023-04-23T17:33:24Z

argocd application go to "degraged" state as soon as elasticsearch cluster health go to yellow state. Which is wrong, yellow state should not be considered as a broken state:

"when a new index is created, the cluster will go to yellow state, because replicas are not ready"

This can flood notifications with wrong alerts, because elasticsearch can create / move indices all the time (we got more than 10 wrong alerts per day).

To me, this is a bug on elasticsearch side (I mean cluster health check should wait a little time before changing color), but according discussion with elastic team at https://discuss.elastic.co/t/cluster-health-wrong-yellow-spikes-because-new-index/330124/6, this is normal, and yellow "spike" should not be considered as error.

The current argocd health check implementation is reporting error as soon as elasticsearch cluster health changes to yellow, I edit it to consider yellow state as "progressing" app state.

A better solution will be to add a kind a buffer, in order to give some time before the argocd application degrades.

Relate:

feat: Add health-checks for Elasticsearch #6052 #6052

codecov · 2023-04-23T22:53:12Z

Codecov Report

Patch coverage has no change and project coverage change: -0.01 ⚠️

Comparison is base (edf9916) 49.01% compared to head (d86ae69) 49.01%.

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #13322      +/-   ##
==========================================
- Coverage   49.01%   49.01%   -0.01%     
==========================================
  Files         247      247              
  Lines       42685    42685              
==========================================
- Hits        20924    20921       -3     
- Misses      19645    19647       +2     
- Partials     2116     2117       +1

see 2 files with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

Signed-off-by: Thomas Decaux <ebuildy@gmail.com> Signed-off-by: ebuildy <ebuildy@gmail.com>

Signed-off-by: ebuildy <ebuildy@gmail.com>

drpaneas · 2023-04-29T22:16:00Z

I agree, the yellow state in an Elasticsearch cluster should not be considered an error state.

In Elasticsearch, cluster health is represented by three states: green, yellow, and red. A green state means that all primary and replica shards are active and allocated in the cluster. A yellow state means that all primary shards are active and allocated, but some replica shards are either missing or initializing. A red state means that one or more primary shards are missing or failed to allocate, indicating a serious problem.

While a yellow state indicates that there may be some issues with replica shard allocation, it does not necessarily mean that there is a critical error in the cluster. Depending on your specific use case and requirements, a yellow state may be acceptable and normal.

However, if the yellow state persists for an extended period of time or is accompanied by other issues, it may be worth investigating further to ensure the stability and reliability of the Elasticsearch cluster.

That said I am fine with this PR.

/lgtm

ashutosh16

LGTM

crenshaw-dev

Thanks, @ebuildy!

* fix: health-checks for Elasticsearch Signed-off-by: Thomas Decaux <ebuildy@gmail.com> Signed-off-by: ebuildy <ebuildy@gmail.com> * fix tests Signed-off-by: ebuildy <ebuildy@gmail.com> --------- Signed-off-by: Thomas Decaux <ebuildy@gmail.com> Signed-off-by: ebuildy <ebuildy@gmail.com>

ebuildy added 2 commits April 23, 2023 19:20

fix: health-checks for Elasticsearch

7003fee

Signed-off-by: Thomas Decaux <ebuildy@gmail.com> Signed-off-by: ebuildy <ebuildy@gmail.com>

fix tests

d86ae69

Signed-off-by: ebuildy <ebuildy@gmail.com>

ebuildy force-pushed the patch-4 branch from 7453a94 to d86ae69 Compare April 23, 2023 23:21

ebuildy marked this pull request as ready for review April 24, 2023 00:19

ashutosh16 approved these changes May 3, 2023

View reviewed changes

crenshaw-dev approved these changes May 28, 2023

View reviewed changes

crenshaw-dev merged commit d01d67b into argoproj:master May 28, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: health-checks for Elasticsearch #13322

fix: health-checks for Elasticsearch #13322

ebuildy commented Apr 23, 2023 •

edited

Loading

codecov bot commented Apr 23, 2023 •

edited

Loading

drpaneas commented Apr 29, 2023

ashutosh16 left a comment

crenshaw-dev left a comment

fix: health-checks for Elasticsearch #13322

fix: health-checks for Elasticsearch #13322

Conversation

ebuildy commented Apr 23, 2023 • edited Loading

codecov bot commented Apr 23, 2023 • edited Loading

Codecov Report

drpaneas commented Apr 29, 2023

ashutosh16 left a comment

Choose a reason for hiding this comment

crenshaw-dev left a comment

Choose a reason for hiding this comment

ebuildy commented Apr 23, 2023 •

edited

Loading

codecov bot commented Apr 23, 2023 •

edited

Loading