Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updates User Redemption Records dynamically before unbonding #1049

Closed
wants to merge 7 commits into from

Conversation

shellvish
Copy link
Contributor

@shellvish shellvish commented Jan 9, 2024

WIP - Still need to add unit tests.

High level, this change makes it so redemption rates get set at the time that unbondings are initiated, not at the time of redemption.

I would focus on a few main things:

  • Does it ever break any assumptions if the NativeTokenAmount on a HostZone is set to 0 before unbondings finish?
  • Do we handle failure cases (e.g. callback errors or times out) correctly?
  • Do we migrate our current data structures over properly?

Copy link
Contributor

@riley-stride riley-stride left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, thanks for taking point on this one! Added some questions and suggestions.

@@ -74,6 +74,14 @@ func (k Keeper) CreateEpochUnbondingRecord(ctx sdk.Context, epochNumber uint64)
return true
}

// Helper function to evaluate if a host zone unbonding record still needs to be initiated
func (k Keeper) ShouldInitiateHostZoneUnbondingRecord(ctx sdk.Context, hostZoneRecord *recordstypes.HostZoneUnbonding) bool {
if hostZoneRecord.Status == recordstypes.HostZoneUnbonding_UNBONDING_QUEUE && hostZoneRecord.NativeTokenAmount.GT(sdkmath.ZeroInt()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will NativeTokenAmount always be 0 now, causing this check to always fail?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦

Yes, you're 100% right. Great catch, fixing!

Copy link
Contributor

@riley-stride riley-stride Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your fix here works, since NativeTokenAmount will (once set) equal stTokenAmount times a non-zero RR. So checking stTokenAmount for equality to 0 has the same effect as checking whether the value NativeTokenAmount will eventually be set to equals 0.

x/stakeibc/keeper/msg_server_redeem_stake.go Show resolved Hide resolved
x/stakeibc/keeper/msg_server_redeem_stake.go Show resolved Hide resolved
hostZoneUnbondingRecord.NativeTokenAmount = totalNativeAmount
updatedEpochUnbondingRecord, success := k.RecordsKeeper.AddHostZoneToEpochUnbondingRecord(ctx, epochNumber, chainId, hostZoneUnbondingRecord)
if !success {
// TODO QUESTION: what should we do on failure? do we revert the userRedemptionRecord changes above?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Low confidence) I think we should either make all the changes in this function or none of them, so would revert (error) rather than returning false.

I see in one call to UpdateNativeTokensForHostZoneUnbondingRecord you throw an error upstream here. However in another call to it here you don't throw an error upstreadm (afaict).

I think we want to revert changes always so would maybe suggest throwing the error inside UpdateNativeTokensForHostZoneUnbondingRecord so that future engineers don't have to think about throwing errors on UpdateNativeTokensForHostZoneUnbondingRecord failure.

What might cause this function to fail? Only case I can see is if GetEpochUnbondingRecord fails to find the record, which seems impossible given it's passed in from an EpochUnbondingRecord as epochUnbonding.EpochNumber

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that makes sense! The reason I threw an error in the first location is because we need that function call to succeed (otherwise we don't know how many tokens to unbond), but we don't strictly need it in the other function call (it's only used for housekeeping niceness).

That being said, I lean towards throwing an error regardless. I put the tradeoffs in here, my recommendation is we get one-notch more confident that the function won't fail, and then we choose to always revert.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comments below at the link. (Low confidence) I think the records could get out of sync if we don't revert. Agree we should get one-notch more confident that the function won't fail, and then we choose to always revert.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if we throw an error, state doesn't get reverted since this is in the begin blocker. We should use a cache context wrapper around this function and throw an error

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not necessarily in the begin blocker - re-opening ICAs is just a tx

return errorsmod.Wrapf(types.ErrHostZoneNotFound, errMsg)
}
success := k.UpdateNativeTokensForHostZoneUnbondingRecord(ctx, epochUnbondingRecordId, hostZoneUnbonding, true)
if !success {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we throw an error here to revert state if UpdateNativeTokensForHostZoneUnbondingRecord fails, or maybe throw an error within UpdateNativeTokensForHostZoneUnbondingRecord if it fails?

Copy link
Contributor Author

@shellvish shellvish Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering this as well, I think the tradeoff is

  1. Currently, we don't actually need the output of UpdateNativeTokensForHostZoneUnbondingRecord to succeed, it's mainly housekeeping to keep the records neat. If this failed for whatever reason, it would be nice to not halt the chain / break critical functionality
  2. But... in the future it's very possible that we will rely on this state, and then we absolutely do need to revert state if this failed. It's very possible that by not reverting it now, we won't catch an error, and then the impact will be much more severe in the future.

I don't feel strongly on which path we take, could easily be convinced to revert state or just log an error. Curious what you, @asalzmann , and @sampocs think!

Also, realistically I don't think this function should fail at all. imo if we all looked hard at this function for a few minutes and were confident it shouldn't error, I think option (2) is a no-brainer!

Copy link
Contributor

@riley-stride riley-stride Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Currently, we don't actually need the output of UpdateNativeTokensForHostZoneUnbondingRecord to succeed, it's mainly housekeeping to keep the records neat. If this failed for whatever reason, it would be nice to not halt the chain / break critical functionality

If the highligted line failed (the call to AddHostZoneToEpochUnbondingRecord) but we didn't revert, would we end up in a bad state?

Afaict the totalNativeAmount would have been incremented and written to the URR a few lines prior, but the corresponding HZUR would not have been updated, so the two would be out of sync.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mentioned in the other thread as well - state does not get reverted in the begin/end blocker even if we return an error. We should wrap this in a temporary context so we can discard state changes if an error is thrown. I believe this is already implemented in an old PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah that's a great point Riley! Agreed, I think these will get out-of-sync the way it's currently written

Sam, do you have an example of the temporary context? I can implement it here, I think that's likely the best path forward

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above - this is called outside of the begin blocker (acks, tx to re-open ICAs)

@@ -66,6 +66,11 @@ func (k Keeper) UndelegateCallback(ctx sdk.Context, packet channeltypes.Packet,
k.Logger(ctx).Error(utils.LogICACallbackStatusWithHostZone(chainId, ICACallbackID_Undelegate,
icacallbackstypes.AckResponseStatus_FAILURE, packet))

// Set NativeTokenAmounts on these HZUs to 0
if err := k.SetNativeTokensToZeroInUnbondingRecords(ctx, chainId, undelegateCallback.EpochUnbondingRecordIds); err != nil {
return err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we throw an error here to revert state if downstream UpdateNativeTokensForHostZoneUnbondingRecord fails, or maybe throw an error within UpdateNativeTokensForHostZoneUnbondingRecord if it fails?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if an error is thrown here, I don't think any state will be committed

app/upgrades/v17/upgrades.go Show resolved Hide resolved
// store if the unbonding has not been initiated
unbondingNotInitiated := hostZoneUnbonding.Status == recordtypes.HostZoneUnbonding_UNBONDING_QUEUE

// Loop through User Redemption Records and insert an estimated stTokenAmount
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want update records that have the following statuses all in the same way?

  1. HostZoneUnbonding_UNBONDING_QUEUE
  2. HostZoneUnbonding_UNBONDING_IN_PROGRESS <= is there a risk we are going to catch any of these in flight?
  3. HostZoneUnbonding_EXIT_TRANSFER_QUEUE
  4. HostZoneUnbonding_EXIT_TRANSFER_IN_PROGRESS

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a great question! Thinking through each of these cases:

  1. HostZoneUnbonding_UNBONDING_QUEUE, i.e. tokens have been redeemed by the user but NOT unbonded. I think we do want to convert these (e.g. move all nativeTokenAmounts to stTokenAmounts)
  2. HostZoneUnbonding_UNBONDING_IN_PROGRESS, i.e. there is currently an unbonding in flight. I don't think anything bad will happen here, as all we're doing is adding a variable (e.g. stTokenAmount), but not modifying any existing variables.
  3. HostZoneUnbonding_EXIT_TRANSFER_QUEUE, i.e. unbonding has been initiated, waiting for unbonding to finish. Feels similar to (2) in that we're adding fields (stTokenAmount) but not removing any. I don't think stTokenAmount will be used after the unbonding has been initiated, but please double check me on that!
  4. HostZoneUnbonding_EXIT_TRANSFER_IN_PROGRESS, i.e. unbonding has finished, tokens are being swept. Same as (2) and (3), I think the extra field won't modify anything.

Please double check me though that the extra fields won't mess with anything!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Yep, makes sense.
  2. Thinking about this a bit more, not only will nothing bad happen here, in fact we need to migrate these or their undelegate callbacks will fail because in UndelegateCallback we invoke SetNativeTokensToZeroInUnbondingRecords which references their stTokenAmount attribute. If they have no stTokenAmount that callback will return err.
  3. Agreed, we should migrate it. And feels safe: I can't find a reference to stTokenAmount in any code post-UndelegateCallback. Imo this is the same as 4 but different than 2 (2 needs the migration to avoid erroring).
  4. Yeah I think we should migrate it. Imo this is the same as 3 but different than 2 (2 needs the migration to avoid erroring).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah great call re:(2)! Let's keep this as is then, seems like the migration is good for all of the cases

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed let's keep it as is

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving this open so @sampocs and @asalzmann can think through each case with this context as well. Feel free to resolve after.

app/upgrades/v17/upgrades.go Show resolved Hide resolved
Copy link
Collaborator

@sampocs sampocs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry kinda running out of steam rn, but just dropping this partial review for now and will finish the review first thing in the morning!

Comment on lines +79 to +82
if hostZoneRecord.Status == recordstypes.HostZoneUnbonding_UNBONDING_QUEUE && hostZoneRecord.StTokenAmount.GT(sdkmath.ZeroInt()) {
return true
}
return false
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if hostZoneRecord.Status == recordstypes.HostZoneUnbonding_UNBONDING_QUEUE && hostZoneRecord.StTokenAmount.GT(sdkmath.ZeroInt()) {
return true
}
return false
return hostZoneRecord.Status == recordstypes.HostZoneUnbonding_UNBONDING_QUEUE && hostZoneRecord.StTokenAmount.GT(sdkmath.ZeroInt())

@@ -96,6 +111,74 @@ func (k Keeper) GetTotalUnbondAmountAndRecordsIds(ctx sdk.Context, chainId strin
return totalUnbonded, unbondingRecordIds
}

// Update Native Amounts in the Host Zone Unbonding Record (and associated User Redemption Records) to reflect the current redemption rate
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple nit comments in this PR to clean things up a bit and keep things standardized:

  • Use .Attribute instead of .GetAttribute()
  • Add some vertical whitespace between logical groups of logic (makes the review much easier! - similar to new line breaks in a paragraph)
  • We don't have to do that errMsg variable format anymore, and no need to log each error cause their logged upstream, we can just return the wrapped error

hostZoneUnbondingRecord.NativeTokenAmount = totalNativeAmount
updatedEpochUnbondingRecord, success := k.RecordsKeeper.AddHostZoneToEpochUnbondingRecord(ctx, epochNumber, chainId, hostZoneUnbondingRecord)
if !success {
// TODO QUESTION: what should we do on failure? do we revert the userRedemptionRecord changes above?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if we throw an error, state doesn't get reverted since this is in the begin blocker. We should use a cache context wrapper around this function and throw an error

return errorsmod.Wrapf(types.ErrHostZoneNotFound, errMsg)
}
success := k.UpdateNativeTokensForHostZoneUnbondingRecord(ctx, epochUnbondingRecordId, hostZoneUnbonding, true)
if !success {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mentioned in the other thread as well - state does not get reverted in the begin/end blocker even if we return an error. We should wrap this in a temporary context so we can discard state changes if an error is thrown. I believe this is already implemented in an old PR

@@ -74,6 +74,14 @@ func (k Keeper) CreateEpochUnbondingRecord(ctx sdk.Context, epochNumber uint64)
return true
}

// Helper function to evaluate if a host zone unbonding record still needs to be initiated
func (k Keeper) ShouldInitiateHostZoneUnbondingRecord(ctx sdk.Context, hostZoneRecord *recordstypes.HostZoneUnbonding) bool {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thoughts on renaming to ShouldHostZoneRecordUnbond?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imo the proposed change is more clear.

k.Logger(ctx).Error(utils.LogWithHostZone(chainId, "Failed to update native tokens for host zone unbonding record in epoch %d", epochUnbonding.EpochNumber))
continue
}

totalUnbonded = totalUnbonded.Add(hostZoneRecord.NativeTokenAmount)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you sure this summation will work? I think the native amount might always be 0 here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good eye. Agreed I don't think this will work. @shellvish @sampocs should we run this PR through the dockernet integration tests to make sure the high level flow works? Would be nice to have that guarantee as we review.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I take that back. hostZoneRecord is passed into UpdateNativeTokensForHostZoneUnbondingRecord and updated here. I think this does work.

Copy link
Collaborator

@sampocs sampocs Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should try and clean this up a bit. It's a bit dangerous to update pointers inside functions and then make assumptions in the calling function

In general, I think we should either:

  • Return the struct from the updating function, or
  • Write to state in the updating function, and read that value back in the calling function

I'll take a peak at what makes the most sense here. Unfortunately, the EUR/HZU schema makes each approach a bit messy

@sampocs
Copy link
Collaborator

sampocs commented Jan 10, 2024

Considering our unbonding flow is already pretty tricky, I think we should optimize for readability over performance.

One paradigm that I think helps remove implicit assumptions is having a "main" like function (UnbondFromHost) and then having the sub functions be small/focused and explicit about whether they are just doing calculations or are updating state.

That said, I think we should disentangle this SetNativeAmount functions from GetTotalUnbondAmountAndRecordsIds. So in the main we have something like:

// set the native amount for each EUR
// sum the total native amount 

This requires we do another loop, but then we don't have to modify the existing GetTotalUnbondAmountAndRecordsIds function at all, we can keep it focused on the one task, and we don't have to rely on pointers/global side effects.

If you're all in agreement with this, I can open a PR to this one

Copy link
Contributor

@asalzmann asalzmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

didn't have time to review the migration logic yet (but planning on doing that last, since it's more straightforward than implementing the functions themselves)

  • Do we handle failure cases (e.g. callback errors or times out) correctly?

I think so

  • Do we migrate our current data structures over properly?

Haven't checked yet

  • Does it ever break any assumptions if the NativeTokenAmount on a HostZone is set to 0 before unbondings finish

Can you clarify this question? Do you mean HostZoneUnbonding?

If so, I think we clear out EURs when all HZUs have native_token_amount == 0

@@ -18,6 +18,10 @@ message UserRedemptionRecord {
uint64 epoch_number = 7;
bool claim_is_pending = 8;
reserved 2;
string st_token_amount = 9 [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add comment explaining what values to expect for st_token_amount and amount

roughly

  • st_token_amount is set upon unbonding
  • amount is set to 0 when the URR is created, and set to an amount to be unbonded when the unbonding is triggered. if it's in status UNBONDING_QUEUE, it should be 0, if it's in any other status - UNBONDING_IN_PROGRESS or later - it should be set to an amount

@@ -66,6 +66,11 @@ func (k Keeper) UndelegateCallback(ctx sdk.Context, packet channeltypes.Packet,
k.Logger(ctx).Error(utils.LogICACallbackStatusWithHostZone(chainId, ICACallbackID_Undelegate,
icacallbackstypes.AckResponseStatus_FAILURE, packet))

// Set NativeTokenAmounts on these HZUs to 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add a comment above in line 56 ("// No need to reset the unbonding record status...") that explains we also don't need to reset the amount to 0 on timeout, because it's set when the ICA channel is re-opened

@@ -66,6 +66,11 @@ func (k Keeper) UndelegateCallback(ctx sdk.Context, packet channeltypes.Packet,
k.Logger(ctx).Error(utils.LogICACallbackStatusWithHostZone(chainId, ICACallbackID_Undelegate,
icacallbackstypes.AckResponseStatus_FAILURE, packet))

// Set NativeTokenAmounts on these HZUs to 0
if err := k.SetNativeTokensToZeroInUnbondingRecords(ctx, chainId, undelegateCallback.EpochUnbondingRecordIds); err != nil {
return err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if an error is thrown here, I don't think any state will be committed

// Revert UNBONDING_IN_PROGRESS records to UNBONDING_QUEUE
err := k.RecordsKeeper.SetHostZoneUnbondings(ctx, hostZone.ChainId, epochNumberForPendingUnbondingRecords, recordtypes.HostZoneUnbonding_UNBONDING_QUEUE)
err = k.RecordsKeeper.SetHostZoneUnbondings(ctx, hostZone.ChainId, epochNumberForPendingUnbondingRecords, recordtypes.HostZoneUnbonding_UNBONDING_QUEUE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we rename SetHostZoneUnbondings -> SetHostZoneUnbondingsStatus for clarity?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, would it make sense to consolidate SetNativeTokensToZeroInUnbondingRecords and SetHostZoneUnbondings?

if err != nil {
errMsg := fmt.Sprintf("unable to set native token amounts to 0 for chainId: %s and epochUnbondingRecordIds: %v, err: %s")
k.Logger(ctx).Error(errMsg)
// TODO Question: should we return an error here, or just proceed?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo we should return an error - we don't want to re-open the ICA channel if we're in a partially recovered state.

}
// update the amount, based on the current redemption rate (or set to zero if desired)
nativeAmount := sdkmath.ZeroInt()
if !setNativeAmountsToZero {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo we shouldn't mix this into the function - we should have

UpdateNativeTokensForHostZoneUnbondingRecord
ResetNativeTokensForHostZoneUnbondingRecord

and these could optionally call a sub function

the current setup of SetNativeTokensToZeroInUnbondingRecords calling UpdateNativeTokensForHostZoneUnbondingRecord is confusing

hostZoneUnbondingRecord.NativeTokenAmount = totalNativeAmount
updatedEpochUnbondingRecord, success := k.RecordsKeeper.AddHostZoneToEpochUnbondingRecord(ctx, epochNumber, chainId, hostZoneUnbondingRecord)
if !success {
// TODO QUESTION: what should we do on failure? do we revert the userRedemptionRecord changes above?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not necessarily in the begin blocker - re-opening ICAs is just a tx

return errorsmod.Wrapf(types.ErrHostZoneNotFound, errMsg)
}
success := k.UpdateNativeTokensForHostZoneUnbondingRecord(ctx, epochUnbondingRecordId, hostZoneUnbonding, true)
if !success {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above - this is called outside of the begin blocker (acks, tx to re-open ICAs)

@sampocs
Copy link
Collaborator

sampocs commented Jan 11, 2024

closing in favor of #1053

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants