
sort unbonding prioritization by validator capacity #1018

Merged
13 commits merged into main from sam/reorder-unbonding-priority on Dec 27, 2023

Conversation

sampocs
Collaborator

@sampocs sampocs commented Dec 6, 2023

Context

Until unbonding from 32+ validators in a single batch is supported, this PR sorts validators by capacity first to reduce the number of messages.
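For readers skimming the diff, here is a minimal sketch of the sorting idea, assuming a simple per-validator capacity struct. The type and function names (and the cosmossdk.io/math dependency) are illustrative, not the exact types in x/stakeibc.

```go
package unbonding

import (
	"sort"

	sdkmath "cosmossdk.io/math"
)

// validatorCapacity is an illustrative stand-in for the per-validator
// unbond capacity used when deciding which validators to unbond from first.
type validatorCapacity struct {
	Address  string
	Capacity sdkmath.Int
}

// sortByCapacityDescending orders validators so the highest-capacity ones
// come first; unbonding from them first lets each ICA message cover as much
// of the total unbond amount as possible, reducing the message count.
func sortByCapacityDescending(capacities []validatorCapacity) {
	sort.SliceStable(capacities, func(i, j int) bool {
		return capacities[i].Capacity.GT(capacities[j].Capacity)
	})
}
```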

Changelog

  • Removed condition to sort unbondings by balance ratio
  • Added ConsolidateUnbondingMessages for the case where the number of messages is still greater than 32. This function redistributes the remainder proportionally across the validators in the batch (see the sketch after this list).
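The following is a rough sketch of the consolidation idea, not the actual ConsolidateUnbondingMessages implementation: the types, the proportional split, and the omitted per-validator capacity check are all assumptions made for illustration.

```go
package unbonding

import sdkmath "cosmossdk.io/math"

// unbondingMsg pairs a validator with the amount to unbond from it.
type unbondingMsg struct {
	Validator string
	Amount    sdkmath.Int
}

// consolidate sketches the consolidation step: if more than batchSize
// messages would be needed, drop the overflow messages and spread their
// combined amount across the validators that stay in the batch,
// proportionally to what each is already unbonding. The last validator
// absorbs any rounding remainder so the total is preserved. Assumes batch
// amounts are nonzero (they are sorted descending); the real implementation
// also has to respect each validator's remaining capacity, which is omitted
// here for brevity.
func consolidate(unbondings []unbondingMsg, batchSize int) []unbondingMsg {
	if len(unbondings) <= batchSize {
		return unbondings
	}

	batch := unbondings[:batchSize]
	overflow := unbondings[batchSize:]

	// Total amount that no longer has a message of its own.
	excess := sdkmath.ZeroInt()
	for _, u := range overflow {
		excess = excess.Add(u.Amount)
	}

	// Total already covered by the batch, used as the proportional base.
	batchTotal := sdkmath.ZeroInt()
	for _, u := range batch {
		batchTotal = batchTotal.Add(u.Amount)
	}

	// Redistribute the excess proportionally; the last validator takes the
	// remainder left over from integer division.
	distributed := sdkmath.ZeroInt()
	for i := range batch {
		share := excess.Mul(batch[i].Amount).Quo(batchTotal)
		if i == len(batch)-1 {
			share = excess.Sub(distributed)
		}
		batch[i].Amount = batch[i].Amount.Add(share)
		distributed = distributed.Add(share)
	}
	return batch
}
```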

Testing

  • Tested with the unbonding simulator and a large unbond amount (3x the current mainnet amount):
    • Using mainnet code: 92 messages
    • Using just the sorting change: 34 messages
    • Using this branch: 32 messages

@sampocs sampocs changed the title sort unbonding prioiritzation by validator capacity sort unbonding prioritization by validator capacity Dec 7, 2023
Contributor

@ethan-stride ethan-stride left a comment

Let's discuss assumptions around the incoming variables

Contributor

@ethan-stride ethan-stride left a comment

After discussion on Friday, I think the algorithm is sound and makes sense.
I'd suggest a few renames to make it clearer what the summed variables represent, and maybe a few more safety checks that the unbonding is possible at the per-validator level, rather than relying on the macro assumption that unbondings are always small compared to the total amount delegated in the capped set.

In practice I think these are safe assumptions, given the typically unequal delegations across validators and the assumption that the initialUnbondings come in sorted. But there are some bad edge cases where this type of undelegation is not possible even in theory, for example equal delegation across 33 validators with a batch size of 32, where someone tries to unbond everything.
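To make that edge case concrete, here is a hedged sketch of the kind of per-batch feasibility check being suggested. The function name, signature, and error handling are hypothetical, not the exact checks in x/stakeibc.

```go
package unbonding

import (
	"fmt"

	sdkmath "cosmossdk.io/math"
)

// canCoverUnbond sketches a per-batch feasibility check: a total unbond
// amount can only be split into batchSize messages if the batchSize
// highest-capacity validators together hold enough capacity. With 33 equally
// delegated validators, a batch size of 32, and a full unbond, this reports
// failure, since 1/33 of the stake sits on a validator that cannot fit in
// the batch. Capacities are assumed to be sorted in descending order.
func canCoverUnbond(sortedCapacities []sdkmath.Int, totalUnbond sdkmath.Int, batchSize int) error {
	coverable := sdkmath.ZeroInt()
	for i, capacity := range sortedCapacities {
		if i == batchSize {
			break
		}
		coverable = coverable.Add(capacity)
	}
	if coverable.LT(totalUnbond) {
		return fmt.Errorf("cannot unbond %s within %d messages (top-%d capacity is only %s)",
			totalUnbond, batchSize, batchSize, coverable)
	}
	return nil
}
```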

@sampocs sampocs force-pushed the sam/reorder-unbonding-priority branch from 19df72e to 2c39148 Compare December 13, 2023 23:21
@sampocs sampocs force-pushed the sam/reorder-unbonding-priority branch from 2c39148 to 031afe1 Compare December 13, 2023 23:23
@ethan-stride
Contributor

LGTM
Looks good in that this is a temporary fix, and the cases where we would fail with an error (i.e. respond that we can't unbond the incoming request) are very unlikely in practice. We're really scraping at weird edge cases; most of the time this will work without any issue. I think everything is sound, and your modifications made the algorithm conceptually clearer.

Contributor

@riley-stride riley-stride left a comment

ConsolidateUnbondingMessages and GetUnbondingICAMessages are looking good, and thoroughly tested!

Contributor

@shellvish shellvish left a comment

Thanks a lot for making this! I think this will make the unbonding algorithm significantly more resilient going forward.

The one edge case I'm thinking through is if the validators with the highest capacities are NOT the validators with the highest delegated amounts. In that case, the algorithm might return a failure because we're unable to unbond from the highest-capacity validators, when we would instead have been able to unbond from the highest-delegated validators.

That being said, I think this is a completely fine tradeoff. In general, these two sets should overlap heavily (given that capacity is calculated AFTER the unbonding processes, the validators with the highest delegation should have the highest capacity as well), and in cases where they differ, capacity is the "conceptually cleaner" way to undelegate. It should be extraordinarily rare for this algorithm to fail.

Also, I just want to shout out the detail in the comments; it makes reading and understanding this algorithm so much easier, and the code quality is very high (especially given the complexity of the code). Tests also look excellent.
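For context on the capacity discussion above, here is a minimal sketch assuming "capacity" means the delegation a validator holds above its weight-balanced target. The helper name and the exact definition are assumptions for illustration; the x/stakeibc implementation may differ.

```go
package unbonding

import sdkmath "cosmossdk.io/math"

// unbondCapacity sketches one plausible reading of "capacity": the amount a
// validator currently holds above its weight-balanced target delegation.
// Validators at or below target have no capacity. This is an assumption made
// for illustration, not a copy of the x/stakeibc implementation.
func unbondCapacity(currentDelegation, balancedDelegation sdkmath.Int) sdkmath.Int {
	capacity := currentDelegation.Sub(balancedDelegation)
	if capacity.IsNegative() {
		return sdkmath.ZeroInt()
	}
	return capacity
}
```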

@sampocs sampocs force-pushed the sam/reorder-unbonding-priority branch from fb122c5 to 50ddeb7 Compare December 18, 2023 18:52
@sampocs
Collaborator Author

sampocs commented Dec 18, 2023

The one edge case I'm thinking through is if the validators with the highest capacities are NOT the validators with the highest delegated amounts. In that case, the algorithm might return a failure because we're unable to unbond from the highest-capacity validators, when we would instead have been able to unbond from the highest-delegated validators.

Yeah, very good callout! My mental model for this is that for non-LSM chains, this algorithm should behave no differently than splitting messages purely based on weights. And for LSM chains, we target the validators with the most LSM stake first.

So this failure case only occurs if the validators with the most LSM stake also all happen to have 0 weight, which in practice should be the opposite!

Also, each time we rebalance, this gets "reset".

@shellvish

@sampocs sampocs added the v17 label Dec 19, 2023
@sampocs sampocs merged commit 8f37be3 into main Dec 27, 2023
10 checks passed