Interaction between gas / spam prevention and reward distribution #2764

Closed
cwgoes opened this issue Nov 11, 2018 · 20 comments
Labels
C:x/distribution distribution module related T: Security

Comments

@cwgoes
Contributor

cwgoes commented Nov 11, 2018

Much of the disincentive for the proposer to submit spam transactions is based on the presumption that the majority of fees will be paid to other stakeholders through fee distribution (as opposed to the proposer getting all the fees) - as long as this assumption holds, we don't need to worry as much about block gas limits, correct compute/storage pricing, etc.

However, with our current fee distribution algorithm, I do not think it does. Because of the way in which the lazy calculations must be performed, a validator is forced to withdraw their accumulated fees whenever their power changes - which anyone can cause by delegating to them!

This means that the proposer of a block (or anyone, but this is most problematic when done by the proposer) can:

  1. Delegate/undelegate/redelegate a tiny amount to all the other validators, withdrawing all their fee pool accum.
  2. Include extremely compute/storage expensive transactions (which they can precompute, so incur no hardware cost on the proposer) with standard payments for gas.
  3. Withdraw all their rewards on the next block - since the proposer held all the accum in the global pool, they get their fee payments back.

In the general case, I think our fee distribution system is exploitable by any adversary who can predict future fees. Delegators with substantial stake can perform this attack too (on behalf of their validator), or any transaction submitter could arrange off-chain with the proposer for a kickback.

cc @rigelrozanski @ValarDragon

@ValarDragon
Contributor

This homes in on the need to remove the global fee pool. Perhaps this should be a focus soon after launch? (Or alternatively, just adopt a fully approximation-free algorithm, like F1.)

The only immediate solution I see is that a change in any validator's stake should cause a withdraw from every validator's accum. This isn't that costly if the iteration is punted until end block, so that it's a single iteration.
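The stop-gap described here can be sketched with a toy model. This is only an illustration under stated assumptions, not SDK code: the `BatchedFeePool` class and all of its method names are hypothetical, accum is assumed to grow by one unit of power per block, and a power change is assumed to take effect only after the settlement at end block.

```python
from fractions import Fraction


class BatchedFeePool:
    """Hypothetical toy pool: any power change triggers a single
    settlement of every validator's accum at end block."""

    def __init__(self, powers):
        self.powers = dict(powers)
        self.accum = {v: 0 for v in self.powers}
        self.settled = {v: Fraction(0) for v in self.powers}
        self.pool = Fraction(0)
        self.staged = {}      # power changes requested during this block
        self.settlements = 0  # number of full iterations performed

    def change_power(self, validator, new_power):
        # Only record the change; the iteration is deferred to end_block.
        self.staged[validator] = new_power

    def end_block(self, fees=0):
        self.pool += Fraction(fees)
        for validator, power in self.powers.items():
            self.accum[validator] += power
        if self.staged:
            self._settle_everyone()
            self.powers.update(self.staged)
            self.staged.clear()

    def _settle_everyone(self):
        total = sum(self.accum.values())
        if total == 0:
            return
        for validator in self.accum:
            self.settled[validator] += self.pool * self.accum[validator] / total
            self.accum[validator] = 0
        self.pool = Fraction(0)
        self.settlements += 1


# Four equal validators; "a" plays the proposer who pokes "b" and "c"
# in the same block that carries a 400-token heavy transaction.
pool = BatchedFeePool({"a": 1, "b": 1, "c": 1, "d": 1})
for _ in range(3):
    pool.end_block(fees=0)
pool.change_power("b", 2)
pool.change_power("c", 2)
pool.end_block(fees=400)
print(pool.settled["a"], pool.settlements)
```

In this model two power changes in one block cost one iteration, and the proposer ends up with the same pro-rata share as everyone else because nobody is left holding un-withdrawn accum.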

@rigelrozanski
Contributor

rigelrozanski commented Nov 12, 2018

Interesting Scenario @cwgoes ! :)


a validator is forced to withdraw their accumulated fees whenever their power changes

The way this is worded makes it a potentially incorrect statement, depending on what you mean by withdraw. The validator is only forced to move accum from the global pool to their internal validator pool whenever a change in power occurs. This does not mean that they are actually required to withdraw their rewards to their liquid account. In addition, this process of moving accum from the global pool to the local validator pool always occurs for the validator which is proposing. This means that the longest a validator could go without withdrawing from the global pool is the number of blocks between its turns as proposer (e.g. 99 blocks for an equally-weighted-validators scenario).


Just going to make some commentary on your scenario here:

  1. Delegate/undelegate/redelegate a tiny amount to all the other validators, withdrawing all their fee pool accum.

Okay, yes, they could hypothetically make everyone withdraw from the global fee pool this way, whatever is left in the fee pool is then by definition owed to them - as they are the only proposer which has not withdrawn yet.

  2. Include extremely compute/storage expensive transactions (which they can precompute, so incur no hardware cost on the proposer) with standard payments for gas.

OKAY so they should be paying a large amount of gas here, yes I'm following.

  3. Withdraw all their rewards on the next block - since the proposer held all the accum in the global pool, they get their fee payments back.

This is incorrect, because in the next block all validators proportionally gain more global-pool accum, which means that each validator still gets a proportional amount of their fees for performing this expensive distribution - as per the social distribution mechanism. So although the proposer may have more global accum in the fee pool because everyone else has withdrawn theirs already, a whole new round of accum is entering into the system which the new fees must be applied to.

To summarize, there is no point at which any validator performs a computation and does not gain accum for their actions. This scenario, which is already an edge case with minor impact AFAICT, can easily be further mitigated by making the proposer withdraw their accum before the transaction; they will then have the same social distribution as everyone else for their expensive transaction (and thus lose much of the fees).

^ this is an easy fix which only should be a few lines of code change, no reason we can't fix prelaunch

@rigelrozanski
Contributor

Just chatted with @ValarDragon summary of our thinking:

  • The solution I described above can still be circumvented if:
    • One actor controlled 2 validators
    • This actor made all the other validators besides its own validators withdraw from the global pool
    • This actor submitted its "heavy" transaction with one of its validators when it was that validator's turn to propose
    • At this point, if the above solution was implemented, although the proposer would have its accum removed, the second validator could still get many fees back (but not all, as explained in my previous comment)

So in conclusion, the above scenario mitigates the problem for an actor who can only control a single validator, but if an actor is rich enough to control 2 validators then this problem still persists.


Another thing we discussed is how much a malicious validator could actually receive from this kind of a mechanism:

  • under an equal distribution scenario of 100 validators of equal power
  • 1 malicious validator could potentially accumulate UP TO 100 accum if they had no power changes for an entire round
  • assuming they sent a transaction to every other validator (forcing them to withdraw), they would then have 101 accum on the heavy block, but 100 accum would also be added to the global pool for all the other validators, according to the social distribution.
  • THUS the validator could get back about 50% of their fees, making this a poor attack vector, however a decent way to reduce the amount of tx fees for a computation they wanted to take on anyways.
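The estimate above can be checked with a toy model of an accum-based global pool. This is a minimal sketch and not the SDK's implementation: the `ToyFeePool` class, its method names, and the 1000-token fee are assumptions made for illustration. It assumes accum grows by one unit of power per block, that a withdrawal pays out `pool * accum / total_accum`, and that the blocks before the heavy one carry no fees.

```python
from fractions import Fraction


class ToyFeePool:
    """Hypothetical model of a lazily accounted global fee pool."""

    def __init__(self, powers):
        self.powers = dict(powers)
        self.last_withdraw = {v: 0 for v in self.powers}
        self.height = 0
        self.total_accum = Fraction(0)
        self.pool = Fraction(0)

    def end_block(self, fees=0):
        # Every validator gains accum equal to its power; fees enter the pool.
        self.height += 1
        self.total_accum += sum(self.powers.values())
        self.pool += Fraction(fees)

    def accum(self, validator):
        return self.powers[validator] * (self.height - self.last_withdraw[validator])

    def withdraw(self, validator):
        accum = self.accum(validator)
        if self.total_accum == 0:
            return Fraction(0)
        share = self.pool * accum / self.total_accum
        self.pool -= share
        self.total_accum -= accum
        self.last_withdraw[validator] = self.height
        return share


def recovered_fraction(force_others, n_validators=100, quiet_blocks=100, heavy_fee=1000):
    """Fraction of the heavy block's fee that flows back to the attacker."""
    pool = ToyFeePool({f"val{i}": 1 for i in range(n_validators)})
    attacker = "val0"
    for _ in range(quiet_blocks):
        pool.end_block(fees=0)
    if force_others:
        # Tiny delegations force every other validator to withdraw.
        for validator in pool.powers:
            if validator != attacker:
                pool.withdraw(validator)
    pool.end_block(fees=heavy_fee)   # the heavy block
    return pool.withdraw(attacker) / heavy_fee


print("with forced withdrawals:", recovered_fraction(force_others=True))
print("without the attack:     ", recovered_fraction(force_others=False))
```

In this model the attacker holds 101 of 200 accum after the heavy block when everyone else was forced out, versus 101 of 10100 otherwise, which is consistent with the "about 50%" figure.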

In conclusion possible courses of action are:

  • Withdraw all global pool accum each time any validator changes power (as suggested by @ValarDragon earlier)
  • Do this ^ every block in the existing iteration loop which means this global pool just wouldn't exist at all
  • Implement the solution from my previous comment, which only partially patches the problem, because an actor with two validators could still perform this attack to reduce the cost of an expensive computation they wished to perform
  • Do nothing (my vote and I think @ValarDragon's too) as this is maybe a more extreme edge case, and focus on upgrading the entire mechanism after launch.

@rigelrozanski
Contributor

just changed the labels :P feel free to challenge though :)

@cwgoes
Contributor Author

cwgoes commented Nov 12, 2018

Thanks for the excellent documentation @rigelrozanski! I'm not quite convinced we've solved this yet, let's see.

A few clarifying questions:

The validator is only forced to move accum from the global pool to their internal validator pool whenever a change in power occurs. This does not mean that they are actually required to withdraw their rewards to their liquid account.

Presently, they are forced to withdraw their rewards to their liquid account - the onValidatorModified hook calls WithdrawValidatorRewardsAll. I think that's unnecessary - we should be able to just update the dist info, right? - maybe we should change this.

At minimum, for correct lazy accounting, they're forced to withdraw tokens to the ValidatorDistInfo, which go into DelPool and ValCommission - that all happens in TakeFeePoolRewards.

Which means that the most extreme situation in which the global pool could be non-withdrawn from would be the number of blocks between being a proposer (ex. 99 blocks for an equally-weighted-validators scenario)

Shouldn't it be more or less constant, since proportionally larger validators will propose proportionally more frequently but also accumulate proportionally more accum?

This is incorrect, because in the next block all validators proportionally gain more global-pool accum, which means that each validator still gets a proportional amount of their fees for performing this expensive distribution - as per the social distribution mechanism. So although the proposer may have more global accum in the fee pool because everyone else has withdrawn theirs already, a whole new round of accum is entering into the system which the new fees must be applied to.

That's true, the total accum will always be updated, which reduces the amount of fees recovered by the ratio of the total accum to the accum of the Byzantinely-withdrawing validator(s).

So in conclusion, the above scenario mitigates the problem for an actor who can only control a single validator, but if an actor is rich enough to control 2 validators then this problem still persists.

I think that the higher number of validators you have, the more fees that can be recovered, although executing the attack becomes more difficult. If I have three validators, I can schedule their proposals to be one after another, withdraw all but the first two and then get a higher ratio of accum in my colluding validators to total accum when I withdraw after the mega-fee block.

Presuming the ability to schedule my validators to propose in order (which is non-trivial but possible), the fraction of fees recovered should be n / (n + 1), where n is the number of validators colluding, since I will have approximately the same amount of accum as the global accum in each of the validators - 1/2 in the one-validator case, 2/3 with two validators, etc.

This is without your force-proposer-withdraw fix; I think your fix changes it to (n - 1) / n instead.
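The two ratios follow from comparing hoarded accum with the fresh accum that enters during the heavy block. A small sketch, assuming (as in the paragraph above) that each colluding validator holds roughly as much accum as the fresh global accum; the function and parameter names are made up for illustration:

```python
from fractions import Fraction


def recovered_fraction(n_colluding, proposer_forced_to_withdraw=False,
                       accum_per_validator=Fraction(1),
                       fresh_global_accum=Fraction(1)):
    """Share of the heavy block's fees that the colluding validators get back.

    With the force-proposer-withdraw fix, the proposing validator no longer
    holds hoarded accum, so only n - 1 validators contribute.
    """
    hoarding = n_colluding - 1 if proposer_forced_to_withdraw else n_colluding
    colluding_accum = hoarding * accum_per_validator
    return colluding_accum / (colluding_accum + fresh_global_accum)


for n in (1, 2, 3, 10):
    print(n, recovered_fraction(n), recovered_fraction(n, proposer_forced_to_withdraw=True))
```

Under these assumptions the fix only shifts the curve by one validator: an actor with n + 1 validators recovers as much with the fix as an actor with n validators does without it.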

I don't know how likely this kind of multi-validator collusion is; it's hard to quantify and also requires some influence over Tendermint proposer election (which is not well characterized and is still in flux).

Thus the validator could get back about 50% of their fees, making this a poor attack vector, however a decent way to reduce the amount of tx fees for a computation they wanted to take on anyways.

I think this is true if and only if there is no way for a proposer to delay their own slot without also withdrawing rewards. Depending on pending Tendermint changes - tendermint/tendermint#2718 - it may be possible for validators to intentionally delay their slot - but not without triggering withdrawal hooks due to changes in power, so that seems safe.

If we assume no slot delays, and additionally assume nonexistence of multi-validator collusion (^^), this calculation matches my understanding.

Definitely this does reduce the degree of the direct DoS vulnerability. However, direct DoS is not the only problem here - it's just the most obvious one. Anywhere I as a validator can gain information about future fees, I can use it to my advantage. I think rational validators, whenever presented with a transaction that pays any nonzero fee and enough free space in the block, will include a bunch of tiny zero-fee delegation transactions (which they can do, as proposer) so they can get more of the (actual, paid by another user) fee.

If every validator does that, I suppose the equilibrium isn't too unfair - but it does mean that we give up all the benefits of lazy fee calculation, because we're just incentivizing validators to force it every block. Of course, we're also creating an incentive for the next-block proposer to censor the withdraw transaction of the previous-block proposer (who performed the attack) - I'm not sure what the equilibrium is, this is starting to get into complex multiplayer game theory, but that doesn't necessarily mean we can safely ignore it.

I have some thoughts on the proposed solutions, but I think this comment is long enough, so I'll let you respond first 😉 .

@rigelrozanski
Contributor

decent discussions.

Presently, they are forced to withdraw their rewards to their liquid account - the onValidatorModified hook calls WithdrawValidatorRewardsAll. I think that's unnecessary - we should be able to just update the dist info, right? - maybe we should change this.

Oh yeah, you're correct, looks like we just ended up reusing onValidatorModified for a wide variety of delegation situations - not sure if this should be updated, anyways I opened another issue for this. #2785

Shouldn't it be more or less constant, since proportionally larger validators will propose proportionally more frequently but also accumulate proportionally more accum?

I do believe this is correct

I think this is true if and only if there is no way for a proposer to delay their own slot without also withdrawing rewards. Depending on pending Tendermint changes - tendermint/tendermint#2718 - it may be possible for validators to intentionally delay their slot - but not without triggering withdrawal hooks due to changes in power, so that seems safe.

huh, interesting, yeah so if the validator just kinda skipped proposing so they could hoard global accum they could then move beyond 50% - yeah so the proposer kind of solution requires having regard for proposers being skipped (sounds not too fun to implement)


I have some thoughts on the proposed solutions, but I think this comment is long enough, so I'll let you respond first 😉 .

Thanks! hehehe - cool yeah let's hear your solution ideas! :)

@ValarDragon
Contributor

ValarDragon commented Nov 12, 2018

When we decided that we wanted an algorithm with approximations to launch faster, this is the exact trade-off we took. (Random stuff like this, with potentially high impact) I think we should not go through and fix the current algorithm, and instead allow this to exist. I'd rather effort go into building out an approximation-free algorithm than in fixing this.

I do agree that your scenarios work, but I'm not sure it's worth further effort in analyzing them versus just switching to an approximation-less algorithm ASAP after launch. This is also the sort of attack that is in the detectable category, so if it does occur, we can handle it accordingly. (EDIT: I do think these are significant problems that must be addressed, but post-launch with the correct approximation-free algorithm)

If we really want a stop-gap, I believe mine fixes a lot of this. Namely, any power change should force a withdraw for every validator at once.

@cwgoes
Contributor Author

cwgoes commented Nov 13, 2018

I do agree that your scenarios work, but I'm not sure it's worth further effort in analyzing them versus just switching to an approximation-less algorithm ASAP after launch. This is also the sort of attack that is in the detectable category, so if it does occur, we can handle it accordingly.

I agree, stop-gap solutions are not that effective and not what we should spend time on. I'm trying to figure out if these issues are concerning enough that they need to be addressed pre-launch (or pre-transfers) or not.

Execution of these kinds of attacks is indeed detectable, albeit not (easily) slashable, and the gains would be proportional to the actual transaction fees, which we can expect to be fairly low for a while. Given that, doing nothing seems relatively safe, although I'm far from a hundred percent confident in that assessment.

@cwgoes
Contributor Author

cwgoes commented Nov 27, 2018

Another issue with this reward distribution mechanism is that re-bonding rewards will be very popular, so tons of transactions will be sent on the network (initially) to do it, rendering the "lazy calculation" less useful.

This is especially likely because it's in any validator's interest to accept low or zero-fee transactions which withdraw rewards (in the first message) and delegate them to that validator (in the second message).

@Hyung-bharvest
Contributor

Hyung-bharvest commented Dec 3, 2018

As cwgoes explained, in gaia-9002 rebonding every block became the winning strategy because of the unfair distribution caused by the global pool system.

I imagine that on mainnet, tens of thousands of delegators will repeatedly withdraw and delegate so as not to lose their share of the global pool, resulting in thousands of withdrawal/delegation txs and costing them a lot in tx fees and in the effort to automate it.

This will not only be a burden on the network; delegators will also be sick of losing rewards to overly frequent withdrawals (but if they don't do it, others will take their share of the global pool, so the game theory results in a lose-lose situation).
I think ValarDragon's initial idea (removing the global fee pool) is the clearest and simplest solution, but I don't know what is blocking it.

Immediate reward distribution to each validator's pool and getting rid of global fee pool.

The ultimate goal of this issue should be "auto-delegation of each block's rewards by default", just like a bank account. Investors and delegators do not expect to have to withdraw periodically in order to get a fair reward. They just put their atoms at stake, and it should then snowball without them doing anything. This is a very old standard in the financial industry. Also, frequent withdraw-delegation requires the delegator's private key for signing, and such frequent key usage creates a security problem. (This is the consensus of the Korean delegator community.)

So I strongly argue that this issue should be fixed pre-launch! I would even argue it is pre-GoS, since GoS will be a mess with hundreds of withdraw-delegation spam txs in EVERY block.

@mdyring

mdyring commented Dec 3, 2018

I agree with @dlguddus.

If we (briefly) ignore the security/fairness implications of what is discussed above, and only consider the usability perspective, it seems intuitive to me that rewards should be autobonded in the same fashion that one receives compounded interest on a bank account.

It seems very unnatural to require delegators - the very people who don't want operational risk - to be running scripts signing with their secret key to reap this benefit.

I understand that there is some additional computational overhead, but if done on a block boundary with a few 100s validators, I guess it would be manageable?

I am guessing that pushing this post launch will just make things trickier, so hoping this will be prioritized pre launch.

@zmanian
Member

zmanian commented Dec 3, 2018

Isn't the utility of this strategy of frequent autobonding primarily based on the super high inflation in gaia-9002 and GoS? The much lower inflation on mainnet makes this strategy much less important and the opportunity costs of infrequently rebonding will be small on mainnet.

Maybe we should run a testnet with more normal inflation so the community can get a better feel for this?
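To put rough numbers on this, one can compare rebonding once a year with rebonding every day. This is a back-of-the-envelope sketch: the 7% and 100% annual rates are placeholders chosen for illustration, not the actual mainnet or testnet parameters, rewards are assumed to accrue at a constant rate proportional to bonded stake, and transaction costs are ignored.

```python
def yearly_growth(annual_rate, rebonds_per_year):
    """Growth factor of a stake after one year when rewards are
    rebonded at equal intervals (rebonds_per_year >= 1)."""
    return (1 + annual_rate / rebonds_per_year) ** rebonds_per_year


def opportunity_cost(annual_rate, rebonds_per_year=365):
    """Extra yield, as a fraction of the initial stake, from frequent
    rebonding compared with rebonding once at the end of the year."""
    return yearly_growth(annual_rate, rebonds_per_year) - yearly_growth(annual_rate, 1)


# Both rates below are assumed values, used only to show the scaling.
for label, rate in (("assumed low inflation, 7%", 0.07),
                    ("assumed high inflation, 100%", 1.00)):
    gain = opportunity_cost(rate)
    print(f"{label}: daily rebonding gains {gain:.4f} of stake per year")
```

The gap grows roughly with the square of the rate, which is why the strategy looks far more important on a high-inflation testnet than it would at a modest rate.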

@gamarin2
Contributor

gamarin2 commented Dec 3, 2018

Maybe we should run a testnet with more normal inflation so the community can get a better feel for this?

The regular testnet that will run parallel to GoS should definitely have normal inflation in my opinion

@mdyring

mdyring commented Dec 3, 2018

The much lower inflation on mainnet makes this strategy much less important and the opportunity costs of infrequently rebonding will be small on mainnet.

Agree that the super high inflation for 9002/GoS gives this a lot of visibility.

I am mainly thinking about user/delegator experience and their security here. Using the bank account analogy, I don't think we should require delegators to talk to the bank once a month to receive compounded interest.

I think it would be a shame to leave behind delegators who forget to do this - ideally the software should ensure it happens.

Interestingly there also is some element of "delegator liveness" to this. Not sure it is desirable (but could be?) that delegator inactivity is punished.

@Hyung-bharvest
Contributor

Hyung-bharvest commented Dec 3, 2018

Delegators need to think about the optimal withdrawal frequency in a given situation. That is a very hard task for a normal delegator, and there is no need to burden them with it in the first place. The same goes for GoS. It is a problem brought about by the artificial distribution. Can I ask again why we can't remove the global pool? Why can't we go with the simpler solution?

@zmanian
Member

zmanian commented Dec 3, 2018

It adds a lot of technical complexity to do autobonding and it would delay launch substantially. We can improve this but we removed autobonding from the spec in order to launch faster.

@Hyung-bharvest
Contributor

Would removing the global pool also take a lot of time and technical effort?

@ValarDragon
Contributor

ValarDragon commented Dec 3, 2018

There are two distinct things being discussed. The aim of this issue was to point out the problems in the reward calculation that are inherent to this distribution algorithm. Removal of the global pool mitigates that. (Really, it pushes the problem onto the per-validator delegators instead of it being global.) That is distinct from auto-bonding, which appears to be the subject of the last few comments.

My viewpoint on the former is that I would much rather see us focus efforts on building a correct-by-construction fee distribution algorithm. (We have two proposals for this already, F1 and lamborghini.) I haven't really thought about auto-bonding. In the F1 spec, it's relatively simple to add auto-bonding, and with an increase in storage requirements it can even be made such that you can toggle between auto-bond or not per delegator. In the current piggy bank distribution, I'm not sure if auto-bonding is clean to add. (I haven't put any thought into it)

However, I also think that this is a genuine non-issue for the first 3 months of the network's operation. Under a normal inflation rate, this has very low impact on the inflation one receives. Furthermore, at the end of the day this is just manipulating the system to get a greater percentage of fees. I am convinced that fees are going to be negligible for the first few months of the network's operation. There are no transfers for like a month after network launch. Our short block times enable more throughput, further driving down fees. Our mempool doesn't even sort by fee, so despite the expectation, higher fees don't get your tx included sooner...
We can just fix fee distribution in a later fork.

@Hyung-bharvest
Contributor

Hyung-bharvest commented Dec 4, 2018

Just to add one more possible problem: because of gas costs and decimal truncation, frequent withdrawal is relatively cheaper for rich delegators than for poor delegators, which results in unfair reward returns. This logic will widen the gap between rich and poor.
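The gas-cost half of this point can be illustrated with a small calculation (decimal truncation is left out). It is only a sketch: the 7% rate, the flat 0.01-token cost per withdraw-and-delegate, and the two stake sizes are all assumed values, and the function names are hypothetical.

```python
def net_reward(stake, annual_rate, tx_cost, rebonds_per_year):
    """Yearly reward after paying a flat cost for every rebond."""
    gross = stake * ((1 + annual_rate / rebonds_per_year) ** rebonds_per_year - 1)
    return gross - tx_cost * rebonds_per_year


def best_schedule(stake, annual_rate=0.07, tx_cost=0.01,
                  candidates=(1, 12, 52, 365)):
    """Pick the rebond frequency with the highest net reward and
    return it together with the resulting net yield on the stake."""
    best = max(candidates,
               key=lambda k: net_reward(stake, annual_rate, tx_cost, k))
    return best, net_reward(stake, annual_rate, tx_cost, best) / stake


small = best_schedule(stake=100)        # assumed small delegator
large = best_schedule(stake=100_000)    # assumed large delegator
print("small delegator: rebonds/year =", small[0], "net yield =", round(small[1], 5))
print("large delegator: rebonds/year =", large[0], "net yield =", round(large[1], 5))
```

Because the cost per rebond is flat while the gain scales with stake, the larger delegator can afford to rebond more often and ends up with a higher net yield under these assumptions.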

I argued this is a pre-GoS problem because the algorithm that dynamically chooses its rebond timing will play a very critical role in winning the game, given the inflation rate.

The high inflation rate was intended to emphasize uptime, but because of the current distribution logic it may instead end up emphasizing the accurate design of such an algorithm.

I think the ability to analyze and optimize this timing is one of the abilities a validator should have, but it is not everything. The effect of such an algorithm is also magnified by an artificial environment with problematic logic and artificial inflation, so researching it does not relate to real-world problems in any way.

@cwgoes cwgoes mentioned this issue Dec 14, 2018
5 tasks
@cwgoes
Contributor Author

cwgoes commented Jan 31, 2019

I believe this entire class of concerns is no longer relevant as we have merged F1, which calculates exact rewards and contains no forced-withdraws.

Please reopen if anything discussed here still applies.

7 participants