-
Notifications
You must be signed in to change notification settings - Fork 241
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stop accepting bundle produced by operator who has previously produced invalid bundle #2316
Comments
I think the root of this issue stems from the initial divergence of mental models here. In the current model, fraud proofs are viewed as a way to signal bad ERs. However, the way Chen was initially proposing was for fraud proofs to signal invalid bundles, and a bundle in turn could be invalid because of bad extrinsics or because of bad ER fields. The set of fraud proofs we have for |
I agree with the divergence and we have outdated TODOs for this but even if the fraud proof is submitted, there could be multiple bundles coming from the operator before fraud proof is submitted. But on the other hand, not letting the bundle author submit more bundles on runtime seems more straight forward. |
But we can't have consensus on whether |
Yeah, it is, but it won't affect the honest operator much. When a bad ER with fraudulent |
I see, so the expected scenarios are:
|
Yeap. That looks right to me |
@dariolina @NingLin-P do we have a quorum here on the approach of not letting operators produce bundles once they produce invalid bundle. If accusation was wrong, they should be able to produce bundles once FruadProof is in for such bad accusation |
Looks good to me in general.
Is this at the end of epoch? |
Yeah, so in most cases where the honest operator is online and fraud proof will be submitted within a few blocks, the false accusation will be removed before this happen.
In this case, we can force epoch transition when fraud proof is submitted, so the honest operator won't need to wait and it doesn't affect at all from the bundle production perspective. |
This is how I'm thinking implementation would look like
I dont think we should remove the operator from the staking since fraud proof may be delayed before epoch is finished. If we remove them before fraud proof comes in, then, like you said, they would need to wait until next epoch is complete which is not fair yes and I can see malicious operators marking bundles invalid right before epoch completion to kick honest operators out of the set. If the operator was indeed malicious, then they wont produce any bundles. Even they try to deregister, they will be slashed once ER is confirmed. Does this make sense @dariolina |
If they are not removed from staking, it will lower the bundle probability for other honest operators.
We can detect such situations in consensus runtime and force epoch transition immediately once fraud proof is submitted. |
Sure and bundle production will slow down but this is not permanent and they will be removed once the ER is confirmed. I fail to see why this is an issue here
I would prefer not doing such forced transitions by runtime itself. Forced epoch transitions are meant for the Sudo or Governance when there is no way to recover such domain. This case is not warranted for such forced epoch transition. |
Depending on the percentage of stake of the operator who produces invalid bundle, if it is 50% the bundle production is slow by half, if it is 90% production slow by 10 times. While the honest operator can detect that within a few blocks we need to wait a whole challenge period pass (i.e. 1d)
That is the current usage not the fundamental limit of force epoch transition
Indeed, this will increase the weight of the
|
Sure but this will limit the slowness to just that particular domain and consensus does not need to do any extra compute. We generally prefer not making the consensus do extra work.
This would be a scenario that might not be practical for all the domains. If there are some domains that has the limited operator set, then be it that that particular domain will have slow block production because there is a malicious operator with huge stake.
I would have to disagree with you here. Though it may not be a fundamental limit, this will be an abuse of doing such epoch transitions when there is no such reason to do so.
Overall, this seems more like an optimisation that can come in if we have a efficient way to do such epoch transitions. Though I would be hesistant to do such things by runtime and should only come from Sudo or Governance. |
Wasn't it already an issue?
What we do just to repair the damage made by the malicious operator, through both prune bad ER and epoch transition, I don't think we should count that as "extra" and "no reason". |
Not ideally. We are talking about specific scenario where a malicious operator had a huge amount of stake for a given domain. In practical scenraio, maybe I'm not seeing everything here, from the example taken from the Gemini-3g, this should not lead to slow bundle production IMHO given a sufficient number of operators for that domain.
This should not be the task of Consensus chain, IMHO. Though I dont have strong reason against what you are proposing, there are many ways to make consensus do extra work to clear up such damage made by bad actors. |
I would avoid using force epoch transition, please. |
I'm okay with avoiding epoch transition since it makes implementation easier and won't make things worse.
The divergence happens because making fraud proof signal bundle (and covering ER through bundle) will bring much more significant change in many aspects of domain (rather than just fraud proof) that are not mentioned in the design doc. And IIRC from the original design doc, the reason for signal bundle is something like "bundle submitted with bad ER is produced by the malicious operator so should not be handled" which is not a strong statement and not really correct since the honest operator can validate every tx in the bundle before further handling. Thus we chose the more straightforward and intuitive approach to signal bad ER, but now with all the changes/infra made, I do see feasibility in that approach, which may also help #1673, I will open another issue with more detail for discussion. |
Currently, the honest operator only submits fraud poof if it finds a mismatch between the ER stored onchain and its local ER. For malicious operator who produce an invalid bundle, it will be marked on the ER and slashed once the ER is confirmed thus the slashing is deferred.
This can be an issue because the malicious operator can still produce bundles before it get slash, if it continue to produce more invalid bundle and it may waste more block space than it can afford since the
MinOperatorStake
is set to2*Block
while it can waste up tochallenge_period * Block
.This can be fixed by introducing a new status
PendingSlash(domain_block_number)
to the operator:PendingSlash
once an ER is submitted and it claims that the operator has produces an invalid bundledomain_block_number
insidePendingSlash(domain_block_number)
is the domain block number of the ERtotal_stake
while everything else should still keep until the ER is confirmedPendingSlash
operatorbad_receipt.domain_block_number <= PendingSlash.domain_block_number
thePendingSlash
status should be reverted and the operator's stake should be added to the domain'stotal_stake
cc @dariolina @vedhavyas
The text was updated successfully, but these errors were encountered: