
♻️ Fix compressor readiness shutdown_duration / Fix cassandra … #376

Merged
9 changes: 6 additions & 3 deletions charts/vald/values.yaml
@@ -1019,7 +1019,10 @@ compressor:
  healths:
    liveness:
      enabled: false
-   readiness: {}
+   readiness:
+     server:
+       http:
+         shutdown_duration: 1m
Collaborator:

I think the readiness server should shut down immediately. This setting increases the time until the pod is disconnected from the Kubernetes Service DNS.

Collaborator:

Could you please explain in more detail why we need a 1-minute duration for shutting down the readiness server?

Contributor Author:

Hmm. Which field keeps compressor pods alive longer without increasing the time until they are disconnected?

Collaborator:

Ah, I see. The readiness server should shut down immediately, so it should be 0s; for liveness, the setting affects how long the pod is kept alive.
But as far as I know, we don't use a liveness probe for the agent and compressor, hmm... 🤔

Contributor Author:

Ah, okay, understood.
Since the compressor now has a backup strategy in its post-stop hook, it may be acceptable for it to be terminated suddenly by an unexpected failure of the liveness server. 🤔

Collaborator:

I think so. Maybe we need another phase for shutting down the process in the internal server.

Collaborator:

If we could set post/pre processes for each server, that would be useful.

Contributor Author:

I see. For now, I'm going to enable liveness and set the readiness shutdown_duration to zero. Thanks.
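The change described here (readiness shutting down immediately, liveness enabled with a longer shutdown duration) might look roughly like the following values.yaml fragment. This is a sketch of the settings under discussion, not the exact merged diff; in particular, the `2m` liveness value is an assumption chosen to satisfy "over 1 minute":

```yaml
healths:
  liveness:
    enabled: true
    server:
      http:
        # assumed value; the discussion only asks for more than 1 minute
        shutdown_duration: 2m
  readiness:
    server:
      http:
        # readiness shuts down immediately so the pod leaves
        # the Kubernetes Service DNS right away
        shutdown_duration: 0s
```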

Collaborator:

Sounds good to me.
Please set the liveness shutdown duration to more than 1 minute.
We also need to think about the pod disruption budget.
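The pod disruption budget mentioned here could be sketched as follows. The name, label selector, and `minAvailable` value are hypothetical illustrations, not part of this PR, and the `policy` API version depends on the cluster's Kubernetes version:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: vald-manager-compressor-pdb   # hypothetical name
spec:
  # keep at least one compressor pod running during voluntary disruptions
  minAvailable: 1
  selector:
    matchLabels:
      # hypothetical label; must match the compressor Deployment's pod labels
      app: vald-manager-compressor
```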

Contributor Author:

Revised. Please check it 😄

  metrics:
    pprof: {}
    prometheus: {}
@@ -1312,7 +1315,7 @@ backupManager:
    # backupManager.cassandra.config.consistency -- consistency type
    consistency: quorum
    # backupManager.cassandra.config.serial_consistency -- read consistency type
-   serial_consistency: local_serial
+   serial_consistency: localserial
    # backupManager.cassandra.config.username -- cassandra username
    username: root
    # backupManager.cassandra.config.password -- cassandra password
@@ -1831,7 +1834,7 @@ meta:
    # meta.cassandra.config.consistency -- consistency type
    consistency: quorum
    # meta.cassandra.config.serial_consistency -- read consistency type
-   serial_consistency: local_serial
+   serial_consistency: localserial
    # meta.cassandra.config.username -- cassandra username
    username: root
    # meta.cassandra.config.password -- cassandra password