Yet another review on hooks #7645

Closed · jsirex opened this issue Apr 27, 2020 · 6 comments
Labels: Stale, Type: Bug (Issues that describe broken functionality)

jsirex (Contributor) commented Apr 27, 2020

Because of the new supervisor behavior, here is what we have:

  1. If anything under config/ changes, the reconfigure hook is triggered.
  2. If anything under hooks/ changes, an unconditional, uncontrollable restart is triggered.

A possible workaround is to extract all moving parts into config/, detect those changes (#7644) in the reconfigure hook, and trigger a restart from there if necessary (see the sketch below).
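For illustration, here is a minimal sketch of how a reconfigure hook could detect which rendered files actually changed, assuming a checksum state file kept under the service's var path; the restart trigger itself is exactly the part the supervisor does not expose today:

```bash
#!/bin/bash
# Sketch only: detect config changes inside a reconfigure hook.
# {{pkg.svc_config_path}} / {{pkg.svc_var_path}} are Habitat template
# variables; the checksum state file is an assumption of this sketch.
state="{{pkg.svc_var_path}}/config.sha256"

if [ -f "$state" ] && sha256sum --check --quiet "$state" 2>/dev/null; then
    echo "config unchanged, nothing to do"
else
    echo "config changed, a restart is needed"
    # Missing piece: there is no supported way to request a restart
    # from here, which is what this issue is asking for.
fi

sha256sum "{{pkg.svc_config_path}}"/* > "$state"
```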

I tried to gather all hooks and their current state in the following table:

| Hook name    | If failed then | If absent then |
| ------------ | -------------- | -------------- |
| init         | No docs        |                |
| run          | No docs        | No docs        |
| post-run     | Retry post-run |                |
| reconfigure  | No docs        | restart        |
| post-stop    | No docs        |                |
| file-updated | No docs        |                |
| suitability  | No docs        | No docs        |

New hooks we likely need:

| Hook name           | If failed then | If absent then |
| ------------------- | -------------- | -------------- |
| stop (missing)      | term-wait-kill | term-wait-kill |
| upgrade (missing)   | restart        | restart        |
| downgrade (missing) | restart        | restart        |

Suitability

The docs are ambiguous: "leader down" does not mean the leader service is down, but that the leader supervisor is down.

Stop

When a service is asked to stop, this hook would be used instead of a plain kill, and anything printed to stdout would reset the timeout counter.
This is an obvious but missing hook: with it I could control the kill signal, timeouts, retries, etc.
Currently, for some services I have to set a ridiculously huge timeout, because a service may finish its work in 5 seconds or in 5 minutes.
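A minimal sketch of what such a stop hook might look like under the proposed semantics (the hook itself, the stdout keepalive, and the pid file location are all assumptions of this proposal, not existing features):

```bash
#!/bin/bash
# Hypothetical stop hook -- proposed in this issue, not implemented.
pid=$(cat "{{pkg.svc_var_path}}/service.pid")   # assumed pid file

kill -TERM "$pid"
while kill -0 "$pid" 2>/dev/null; do
    # Under the proposed semantics, each line printed here would
    # reset the supervisor's shutdown timeout.
    echo "still draining work..."
    sleep 5
done
echo "service exited cleanly"
```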

Upgrade / Downgrade

A note on upgrade/downgrade hooks, because this topic is really complicated:
a user who wants to handle an upgrade/downgrade should probably think in terms of migrating from the current version to the new/old one, and an upgrade/downgrade may skip versions, too.
That's why when upgrading we should run the new package's files and hooks, but when downgrading the current package's. There should also be a way to get the new/old package info: alongside the existing {{pkg}} template data for the current package, add something like {{update_pkg}} for the package being updated to. For example:

```bash
# {{pkg}} is existing template data; {{update_pkg}} is the proposed
# addition for the package being updated to.
CURRENT_PKG_VERSION="{{pkg.version}}"      # e.g. 2.2.3
NEW_PKG_VERSION="{{update_pkg.version}}"   # e.g. 4.6.2

current_major="${CURRENT_PKG_VERSION%%.*}"
new_major="${NEW_PKG_VERSION%%.*}"

migrate_from_1_to_2() { ... }
migrate_from_2_to_3() { ... }
migrate_from_3_to_4() { ... }

# Apply each major-version migration step in order.
for version in $(seq "$current_major" $((new_major - 1))); do
    "migrate_from_${version}_to_$((version + 1))"
done
```

Issues

The following issues are likely related:

jsirex added the C-bug label Apr 27, 2020
krasnow commented Apr 29, 2020

Thanks for the fantastic feedback. As you noted, there are a ton of open issues related to hooks. We're going to gather these up into an epic so we can take a holistic look at which hooks need to be added and which issues need to be fixed.

jsirex (Contributor, Author) commented May 29, 2020

Actually, I am already blocked by these missing hooks; any implementation would be nice.
I have a really obvious use case: my service (a cluster) has an update strategy (rolling or at-once, it does not matter).

Before an upgrade I want to prepare the cluster for it. For example:

  1. Elasticsearch: pause shard balancing
  2. MongoDB: gracefully step down the primary
  3. Patroni / PostgreSQL: switch over or pause cluster management
  4. etc.

But currently I cannot control any of this, so such an upgrade will simply break everything.
A possible workaround is to override the shutdown behavior and always do a very careful shutdown, but the shutdown hook is missing too.
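To make the first item concrete, a hypothetical pre-upgrade hook for the Elasticsearch case might look like the sketch below; no such hook exists today, the node address is an assumption, and the calls are the standard Elasticsearch cluster settings and flush APIs:

```bash
#!/bin/bash
# Hypothetical pre-upgrade hook -- no such hook exists today.
# Stop the cluster from rebalancing shards while this node restarts.
ES_URL="http://127.0.0.1:9200"   # assumed local node address

curl -fsS -X PUT "${ES_URL}/_cluster/settings" \
  -H 'Content-Type: application/json' \
  -d '{"transient":{"cluster.routing.allocation.enable":"primaries"}}'

# Flush so recovery after the restart is fast.
curl -fsS -X POST "${ES_URL}/_flush"
```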

stevendanna (Contributor) commented:

Also related: #5135

christophermaier added the Type: Bug (Issues that describe broken functionality) label and removed the C-bug label Jul 24, 2020
danielcbright commented:

I ended up writing a Habitat "wrapper" called cluster patcher that allows for some of the behavior you say you're blocked on. I'll clean it up and share it here if anyone is interested.

stale bot commented Sep 13, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. We value your input and contribution. Please leave a comment if this issue still affects you.

stale bot added the Stale label Sep 13, 2021
stale bot commented Oct 20, 2021

This issue has been automatically closed after being stale for 400 days. We still value your input and contribution. Please re-open the issue if desired and leave a comment with details.

stale bot closed this as completed Oct 20, 2021