This is a ticket to replace the current halt counter with a bit flag that lives in nonverifiable storage, set by `signal_halt` and unset by running a migration.
Currently, every call to `signal_halt` must be matched with a manual counter increment in `penumbra_app::TOTAL_HALT_COUNTER`. The purpose of this mechanism is to prevent unwanted node restarts and further progress following a halt signal. Operationally, this setup turned out to be untenable, and @hdevalence suggested that we replace the counter with a single bit flag. This eliminates the failure mode of running a binary whose `TOTAL_HALT_COUNTER` is well ahead of the internal signal halt counter.
To do this, we need to:
- Replace the `halt_count` state key with `halt_bit`, with path: `"governance/persistent_flags/halt_bit"`
- Replace the `signal_halt` implementation to set the `halt_bit` on (in nonverifiable storage)
- Add a `ready_to_start` implementation to set the `halt_bit` off
- Make `is_chain_halted` return the `halt_bit`
- Make `pd migrate` check `is_chain_halted` and return early if it is. This should happen before any specific migration variant is called; later we can add an `APP_VERSION` check for each special case.
- Add a `--force` flag to `pd migrate` so that we can optionally ignore the halt bit flag
The method names are suggestions, but they should all be documented so that their effect on application behavior can be understood.
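
As a rough illustration of the first four items, here is a minimal sketch of the flag semantics. It assumes an in-memory `HashMap` as a stand-in for nonverifiable storage; the `NonverifiableStore` type and the one-byte encoding are hypothetical and are not the actual `penumbra_app` or storage API. Only the key path and the method names come from this ticket. The `pd migrate` check and the `--force` flag are left out.

```rust
use std::collections::HashMap;

/// State key proposed in this ticket.
const HALT_BIT_KEY: &str = "governance/persistent_flags/halt_bit";

/// Hypothetical in-memory stand-in for nonverifiable storage.
/// The real storage layer is not modeled here.
#[derive(Default)]
struct NonverifiableStore {
    entries: HashMap<String, Vec<u8>>,
}

impl NonverifiableStore {
    /// Sets the halt bit on. Calling this repeatedly is idempotent,
    /// so there is no counter to keep in sync with the binary.
    fn signal_halt(&mut self) {
        // Assumed encoding: a single byte, 1 = halted.
        self.entries.insert(HALT_BIT_KEY.to_string(), vec![1]);
    }

    /// Sets the halt bit off, allowing the node to start again.
    fn ready_to_start(&mut self) {
        // Assumed encoding: a single byte, 0 = not halted.
        self.entries.insert(HALT_BIT_KEY.to_string(), vec![0]);
    }

    /// Returns the halt bit. A missing key is read as "off".
    fn is_chain_halted(&self) -> bool {
        self.entries
            .get(HALT_BIT_KEY)
            .map_or(false, |value| value.first() == Some(&1))
    }
}
```

Because the flag is a single bit, two consecutive `signal_halt` calls leave the store in the same state as one, which is the property that removes the counter-mismatch failure mode described above.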
## Describe your changes
This PR implements #4373
## Checklist before requesting a review
- [x] If this code contains consensus-breaking changes, I have added the
"consensus-breaking" label. Otherwise, I declare my belief that there
are not consensus-breaking changes, for the following reason:
> This is consensus breaking because it deprecates the application halt
counter stored within the verifiable chain state. However, it does not
require a state migration because the switch starts in the "off" (i.e.
`false`) position. As a result, everything else being equal, using a new
upgraded binary should be sufficient.
---------
Signed-off-by: Erwan Or <erwan.ounn.84@gmail.com>