-
Notifications
You must be signed in to change notification settings - Fork 202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Configure envoy health check period form director #153
Configure envoy health check period form director #153
Conversation
...-tutorials/interactive_api/Director_Pytorch_Kvasir_UNET/director_folder/director_config.yaml
Show resolved
Hide resolved
|
||
shard_info['is_online']: True | ||
shard_info['is_experiment_running'] = is_experiment_running | ||
shard_info['valid_duration'] = valid_duration | ||
shard_info['last_updated'] = time.time() | ||
hc_period = self.settings.get('envoy_health_check_period', ENVOY_HEALTH_CHECK_PERIOD) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have 2 variables here: hc_period
and valid_duration
and they always have the same relation y = 2*x.
So could we remove one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
valid_duration
- this variable is sent by the envoy and means how long the sent information can be considered valid
health_check_period
- the director sends this variable to the envoy to configure. It's possible to change its value during the life of the service and, because of this, valid_duration
depends on the preciously set value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was refactored
605f78c
to
344f68b
Compare
@@ -16,13 +16,23 @@ | |||
|
|||
logger = logging.getLogger(__name__) | |||
|
|||
ENVOY_HEALTH_CHECK_PERIOD = 60 # in seconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have defaul director.yaml could we get this patam from it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This parameter can be removed from director.yaml by user. What is the default value in this case?
openfl/component/envoy/envoy.py
Outdated
) | ||
time.sleep(DEFAULT_TIMEOUT_IN_SECONDS / 2) | ||
time.sleep(timeout if timeout < new_timeout else new_timeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use min()
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was removed
openfl/component/envoy/envoy.py
Outdated
new_timeout = self.director_client.send_health_check( | ||
collaborator_name=self.name, | ||
is_experiment_running=self.is_experiment_running, | ||
valid_duration=timeout * 2 | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it really required to change health check timeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was refactored.
Configure envoy health check period form director.
Ability to set a envoy health check period (
envoy_health_check_period
) indirector_config.yaml
.