Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide stake pool desirability scores and pledge status #2781

Open
wants to merge 20 commits into
base: master
Choose a base branch
from

Conversation

HeinrichApfelmus
Copy link
Contributor

@HeinrichApfelmus HeinrichApfelmus commented Jul 29, 2021

Issue number

ADP-996, ADP-507

Overview

In this PR, we add additional metrics to the stake pool listing in order to be able to display fairer pool rankings. Specifically, we add a stake pool desirability score and the pledge status to the GET /v2/stake-pools endpoint. This information is intended to supplement the ranking of pools whose non-myopic member rewards ranking is very low.

The desirability score reflects how cost-effective and reliable the pool operator runs their pool, and how much of their own stake they are willing to pledge, but does not take the current stake delegated to the pool into account. In the long run, only the first k pools with highest desirability scores are expected to attract delegators, where k is a protocol parameter (k = 500 as of 13th Jan 2022). However, in the short term, rewards for delegators can be much lower than the desirability score, as the pool has not attracted enough stake to reach saturation; also, good estimates of the pool's block production performance may not be available yet.

The pledge status indicates whether a pool owner has delegated enough stake to their own pool in order to meet the pledge that they made when registering their pool. A pool where the owner_stake_lower_than_pledge flag is set will earn no rewards, which is a good reason to delegate elsewhere.

See also the REST API documentation for more information (available once this PR is merged).

Details

  • Example endpoint response with new fields:
{
  …,
  "metrics": {
    …,
    "desirability_score": 0,
    "owner_stake": {
      "quantity": 0,
      "unit": "lovelace"
    },
  },
  "flags": [
    "owner_stake_lower_than_pledge"
  ],
}
  • Computing the pool metrics takes a bit of time. In order to speed up subsequent queries, the wallet process caches the computation. This behavior can be turned off with the --no-cache-listpools command line flag. The cache time-to-live (TTL) can be modified with the --cache-listpools-ttl DURATION flag (default: 1 hour). See cardano-wallet --help for more information.

Comments

  • At the moment, the non-myopic rewards are used to rank stake pools in Daedalus, and the desirability score is intended to supplement this information. Research at IOG is currently looking into pool metrics that are easier to grasp and better reflect rewards in both the short and the long term.

@HeinrichApfelmus HeinrichApfelmus self-assigned this Jul 29, 2021
@HeinrichApfelmus HeinrichApfelmus marked this pull request as draft July 29, 2021 10:52
@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-996/local-state-query branch from e24ae7f to 2278b8f Compare July 29, 2021 11:06
@HeinrichApfelmus
Copy link
Contributor Author

Hrm. I am not entirely happy with the invariantApiStakePool function.

With the addition of the OwnerStakeLowerThanPledge flag, the information contained in the API response is slightly redundant, but I think that the clarity of the flag to wallet frontends is worth it.

Ideally, one would now use a smart constructor in order to construct ApiStakePool values. However, the data type contains many fields, which we cannot refer to by name if we use a smart constructor. This is one of those moments where anonymous records would be very useful.

@HeinrichApfelmus HeinrichApfelmus changed the title Heinrich apfelmus/adp 996/local state query Provide stake pool desirability scores and pledge status Jul 29, 2021
@darko-mijic
Copy link

@HeinrichApfelmus, can we already get results from the API in this draft PR? Uncached obviously.

If so, we could start testing this to see if we have everything we need @daniloprates

@HeinrichApfelmus
Copy link
Contributor Author

HeinrichApfelmus commented Jul 30, 2021

@HeinrichApfelmus, can we already get results from the API in this draft PR? Uncached obviously.

If so, we could start testing this to see if we have everything we need @daniloprates

@darko-mijic Yes, you can get real, uncached results, please feel free to start testing. I created a branch HeinrichApfelmus/ADP-996/daedalus-testing which you can reference. (The pull request here will see couple of force pushs before it is merged into master). Let me know if there is any trouble.

@piotr-iohk
Copy link
Contributor

piotr-iohk commented Jul 30, 2021

@HeinrichApfelmus some ad-hoc tests I made against mainnet on this branch + node (bc225ae3085ba6f4f4007c50c4877bc4cebcd7de) vs cardano-wallet v2021-06-11 + node 1.27.0

This branch:

[cardano-wallet.network:Notice:7145] [2021-07-30 11:57:42.56 UTC] Query stakePoolsSummary took 359.389211548s (too slow)
[cardano-wallet.network:Info:7145] [2021-07-30 11:57:42.56 UTC] Fetched pool data from node tip using LSQ. Got 2860 pools in the stake distribution, and 2907 pools in the non-myopic member reward map.
[cardano-wallet.api-server:Error:7145] [2021-07-30 11:57:44.39 UTC] [RequestId 9] GET /v2/stake-pools 200 OK in 391.886787718s

v2021-06-11:

[cardano-wallet.network:Notice:69] [2021-07-30 12:12:45.79 UTC] Query stakePoolsSummary took 85.979565801s (too slow)
[cardano-wallet.network:Info:69] [2021-07-30 12:12:45.79 UTC] Fetched pool data from node tip using LSQ. Got 2860 pools in the stake distribution, and 2907 pools in the non-myopic member reward map.
[cardano-wallet.api-server:Error:69] [2021-07-30 12:12:48.18 UTC] [RequestId 7] GET /v2/stake-pools 200 OK in 113.490084737s

@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-996/local-state-query branch from 7950f5b to 2bb8130 Compare August 3, 2021 14:28
@HeinrichApfelmus
Copy link
Contributor Author

I'm trying to test commit 2bb8130 on a local cluster, but it turns out that the totalStake provided by GetRewardProvenance equals 0 there, and computations of relative stakes become divisions by zero. Ouch.

Should work on mainnet, though.

@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-996/local-state-query branch from 5dab331 to a17b8bc Compare August 6, 2021 14:57
@rvl rvl force-pushed the HeinrichApfelmus/ADP-996/local-state-query branch from 1949cc7 to 9a1171e Compare August 11, 2021 06:23
Copy link
Contributor

@rvl rvl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!!

Before PR approval, we will just need some testing that the non-myopic rankings calculated by the wallet are the same as what would be calculated by the ledger. Currently I'm not sure what the best way to do this might be. Using the local cluster integration tests may be the easiest way.

I notice that the TVar caching is present but disabled.
Without caching, it would still be just as fast as before, right?

I tried testing it out locally and got some PastHorizonExceptions - not sure yet whether this is due to your changes, or because of the Alonzo update in master branch.

newStakePoolLayer gcStatus nl db@DBLayer {..} restartSyncThread = do
pure $ StakePoolLayer
(worker, _stakeDistribution) <- nooooCacheWorker hour (stakeDistribution nl)
pure $ (worker, StakePoolLayer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HLint will probably whinge about this line.

Comment on lines 282 to 283
mnlData <- stakeDistribution nl
case mnlData of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
mnlData <- stakeDistribution nl
case mnlData of
stakeDistribution nl >>= \case


-- | Create a new cache and a worker to fill it.
newCacheWorker
:: Int -- ^ time period in seconds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
:: Int -- ^ time period in seconds
:: NominalDiffTime -- ^ cache TTL

STM.atomically $ writeCache cache a
threadDelay (max 0 (seconds-3) * 1_000_000)
return (CacheWorker $ worker, STM.atomically $ readCache cache)
-- TODO: make cache worker more intelligent
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it need to be much more intelligent?
Perhaps the cache period could be defined in terms of slots/epochs.
This looks like it pre-fills the cache shortly after startup. I think that's good.
It should probably recover from exceptions in the action though (perhaps by retrying a few times).
And some debug-level tracing would be good in case anything goes wrong with the queries.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should probably recover from exceptions in the action though (perhaps by retrying a few times).
And some debug-level tracing would be good in case anything goes wrong with the queries.

This is the intelligence I was looking for but couldn't think of myself. Thanks! 😊

@HeinrichApfelmus
Copy link
Contributor Author

Thanks Rodney!

Before PR approval, we will just need some testing that the non-myopic rankings calculated by the wallet are the same as what would be calculated by the ledger. Currently I'm not sure what the best way to do this might be. Using the local cluster integration tests may be the easiest way.

Well, I am very confident that I copied Eq.(2) of Section 5.5.3 in SL-D1 correctly, so if there is a discrepancy, then it is the ledger that is wrong. 😉 At least, I don't think that it's worth testing this particular equality. However, I do agree that some testing should be done. In fact, I tried the GET /v2/stake-pools endpoint on the local cluster, and I get a division by zero error. 😳 It turns out that the GetRewardProvenance query returns totalStake = 0; this is extraordinarily weird. It seems that the query returns bad data, I made a bug report in Jira about it. I'll make inquiries with the ledger team, but this appears to be a blocker for now. 🙈

I notice that the TVar caching is present but disabled.
Without caching, it would still be just as fast as before, right?

Yes. I kept it disabled it for the purpose of testing.

@rvl
Copy link
Contributor

rvl commented Aug 12, 2021

Could we conditionally disable the GRP LSQ caching by environment variable?
We already have a few of these for testing - see the wiki.

@nikolaglumac
Copy link

@HeinrichApfelmus @rvl can you please merge in the latest master branch into this PR?
AFAIK this PR doesn't use the latest available node version (1.29.0) and we need it to continue testing 🙏

@rvl rvl force-pushed the HeinrichApfelmus/ADP-996/local-state-query branch from 2667eb6 to 682439b Compare September 7, 2021 11:33
@rvl
Copy link
Contributor

rvl commented Sep 7, 2021

OK I have rebased, however I'm not sure whether cardano-node 1.29.0 contains a full fix for GetRewardProvenance total stake.

@nikolaglumac
Copy link

Many thanks @rvl!

OK I have rebased, however I'm not sure whether cardano-node 1.29.0 contains a full fix for GetRewardProvenance total stake.

Who could check this?

@HeinrichApfelmus
Copy link
Contributor Author

Thanks Rodney! It appears that there as been some work on the ledger side to fix the GetRewardProvenance issue, I tracked down IntersectMBO/cardano-ledger#2433 which has been merged. Not sure whether this has propagated to cardano-node yet.

@HeinrichApfelmus
Copy link
Contributor Author

@nikolaglumac I just learned that the fix is going to be in cardano-node 1.30.0.

@nikolaglumac
Copy link

Thanks for letting us know @HeinrichApfelmus 🙏
Please inform us once the new version of the node lands into this PR!

@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-996/local-state-query branch 3 times, most recently from 10bd668 to 2b6339c Compare September 21, 2021 15:53
@HeinrichApfelmus HeinrichApfelmus marked this pull request as ready for review September 21, 2021 15:55
@HeinrichApfelmus HeinrichApfelmus marked this pull request as draft September 21, 2021 15:56
* The `serveWallet` function gets a new argument of type `Maybe DebugConfig`
* The above environment variable is *disabled* outside of integration tests and the local cluster
* Actually cache stuff.
* The above environment variable is *disabled* outside of integration tests and the local cluster. In a production environment, the default cache TTL is 60 minutes.
* We cannot say for certain yet that the owner has allocated the stake to back their pledge
* We cannot say for certain yet that the pool has produced blocks

These assumptions will affect the desirability scores.
@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-996/local-state-query branch from b4eeb0f to 543e9d4 Compare January 14, 2022 13:22
@iohk-bors
Copy link
Contributor

iohk-bors bot commented Jan 14, 2022

try

Build failed:

HeinrichApfelmus and others added 7 commits January 14, 2022 15:26
* Add test for stake pool desirability.

* The non-myopic rewards are usually non-zero even if the user-submitted stake is zero. I have removed the integration test.

* It turns out that the total stake of the test cluster used for integration is so large that the given user stakes only make a minuscule difference. Our new non-myopic reward calculation uses `Double` internally for reasons of speed, which causes the "non-myopic rewards are based on stake" to fail due to limited precision. However, by changing the optimum number of pools from 3 to 2, we can ensure that one pool will lose out on desirability and its non-myopic rewards will change markedly with increasing user stake.

The notion of non-myopic rewards will be phased out, hence we do not need too pedantic about these integration tests.
It turns out that we cannot rely on current pool certificates that fetched from the chain — we have to use a local state query to fetch the "current" pool parameters, which contain only changes made in the *previous* epoch. Detailed explanation in the new note [RewardEpoch].
Use `import Cardano.Protocol.TPraos`
* Also add `--no-cache-listpools`
* Rename the environment variables to `CACHE_LISTPOOLS_TTL` and `NO_CACHE_LISTPOOLS`.
Not exported in the REST API yet.
… and drop the other queries related to rewards.

This depends on cardano-node` version 1.33
First `nonMyopicMemberRewards`, then `desirabilityScore`, then randomness from seed
@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-996/local-state-query branch from 543e9d4 to c53f48d Compare January 14, 2022 14:26
@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-996/local-state-query branch from e7cca3c to f2ce73d Compare January 15, 2022 07:41
@HeinrichApfelmus
Copy link
Contributor Author

bors try

iohk-bors bot added a commit that referenced this pull request Jan 17, 2022
@iohk-bors
Copy link
Contributor

iohk-bors bot commented Jan 17, 2022

try

Build succeeded:

@HeinrichApfelmus
Copy link
Contributor Author

Status update for @nikolaglumac : The pledge status and desirability status have been implemented and tested, but unfortunately, it unexpectedly turns out that the query is too slow to be useable on mainnet in node 1.33 (but this is likely to improve in newer node versions). If it's really urgent, we could reverse-engineer at least the pledge status with better performance; if so, please chat about this with Teddy, our product owner.

@nikolaglumac
Copy link

I leave this up to @daniloprates.

@HeinrichApfelmus HeinrichApfelmus force-pushed the HeinrichApfelmus/ADP-996/local-state-query branch from 4307659 to dc5b466 Compare January 19, 2022 15:18
@thedanheller
Copy link

thedanheller commented Jan 19, 2022

If it's really urgent, we could reverse-engineer at least the pledge status with better performance; if so, please chat about this with Teddy, our product owner.

@HeinrichApfelmus it would be great to have at least pledge status, if possible, although I wouldn't say it's more urgent than releasing support for Cardano Node 1.33.

iohk-bors bot added a commit that referenced this pull request Mar 25, 2022
3190: Implement new Stake Pool Ranking formulas r=HeinrichApfelmus a=HeinrichApfelmus

### Issue number

ADP-1510, ADP-1491

### Overview

In this pull request, we implement the new formulas for ranking stake pools. These comprise two numerical quantities and one notification message:

* `currentROS` = The return of stake (annualized) expected when delegating to the pool right now.
* `saturationROS` = The return of stake (annualized) expected when delegating to the pool if the pool were saturated and had `1/k` of the total stake delegated to it, which is the ideal outcome for this pool in the future.
* A *redelegation warning* that encourages the user to redelegate to a different pool if the performance of the current pool becomes lackluster. Such a notification is necessary because the above rankings do *not* take into account block production performance.

While this pull request implements the formulas, it does *not* yet apply them to actual blockchain data, or expose them via the REST API; this is subject to future pull requests closely related to PR #2781.

However, this pull request also includes code for estimate the pool performance from past pool production data, so that these formulas can also be used in [light-mode][] (Epic ADP-1422). This code was mostly copied from the existing implementation in the cardano-ledger repository.

  [light-mode]: https://input-output-hk.github.io/cardano-wallet/design/specs/light-mode

### Details

* Testing `currentROS` and `saturationROS` has been done manually, as there are no tests that are sufficiently independent of the definition.
* Likewise, the redelegation warning is hard to test, but basic sanity checks are performed to detect typos.
* Likewise, the performance estimation was tested manually on mainnet, and relies on past testing of the likelihood estimation code in the ledger.
  
### Comments

* The "redelegation warning" was originally called "redelegation notification". I somehow started calling it a "warning". Is that good UX or does that contribute to "warning fatigue"?
* The present code does not yet consider warnings from past epochs. The specification is to make the warning sticky and also display it if it was triggered in one of the two previous epochs, but add a note in this case.
* The present code also does not yet take into account parameter changes that affect pool performance across epochs.

Co-authored-by: Heinrich Apfelmus <heinrich.apfelmus@iohk.io>
Co-authored-by: IOHK <devops+stack-project@iohk.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants