
Proposal: Splitting the H/M rewards between QA and the bots in case no H/M has been found #129

Open
twicek opened this issue Oct 31, 2023 · 8 comments

twicek commented Oct 31, 2023

Currently, if no H/M has been found during a contest, the full H/M pot is distributed to QA reports.
The people who contribute the most to the security of a codebase are obviously those who look through the code thoroughly. Bot reports help them toward this end by ruling out generic issues, allowing them to focus on the more deeply hidden vulnerabilities.
Additionally, the bot crew members who maintain and improve their bots are contributing to a growing number of contests with no H/M issues. In the future, bots' ability to find complex vulnerabilities will increase, so it would make sense to allocate half of the H/M pot to bots if no H/M has been found. It would also increase the incentive for bot crews to add H/M findings to their bot reports.

50% might not be perfect, but some of the H/M pot should definitely be distributed to bots in this situation.
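
To make the proposal concrete, here is a minimal sketch of the suggested split, assuming a flat fraction routed to bots when a contest ends with zero H/M findings. The function, the 50% default, and the flat split are illustrative placeholders, not C4's actual awarding logic:

```python
# Hypothetical sketch of the proposed rule (not C4's actual awarding code).
# Today, a contest with zero H/M findings sends the full H/M pot to the QA
# report curve; the proposal would carve out a fixed share for bot reports.

def split_hm_pot(hm_pot: float, has_hm_findings: bool, bot_share: float = 0.5):
    """Return (amount routed to QA reports, amount routed to bot reports)."""
    if has_hm_findings:
        # Normal case: the H/M pot pays H/M findings, nothing is re-routed.
        return 0.0, 0.0
    to_bots = hm_pot * bot_share   # proposed: a fixed share goes to the bot race
    to_qa = hm_pot - to_bots       # the rest is still distributed over QA reports
    return to_qa, to_bots

# Example: a 50k H/M pot in a contest with no H/M findings.
qa_amount, bot_amount = split_hm_pot(50_000, has_hm_findings=False)
print(qa_amount, bot_amount)  # 25000.0 25000.0 under a 50/50 split
```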

@GalloDaSballo

The above assumes that the time people have spent, while finding nothing, is worth zero.

While the rewards curve will pay zero, that is not the case.

twicek commented Oct 31, 2023

Wardens who reviewed the code without having their QA report accepted might have added more value than those who produced a QA report of sufficient quality, and yet they will receive no rewards. If the objective is to reward wardens who actually reviewed the code despite finding nothing, solely rewarding QA reports is not the way to do it, in my opinion.

My point is that having a sufficient-quality QA report is not strongly correlated with the value a warden added when looking at the code. Arguably, it could even be inversely correlated, because time is allocated differently and grades put too much emphasis on the number of issues in the report. In my opinion, the way QA reports are judged is far too harsh on reports with a small number of high-quality remarks / low-severity issues. There is a strong incentive to submit as many issues as possible, meaning that someone who is not confident they will find any real bug in the codebase is more likely to spend all their time looking for QA issues and getting their QA report accepted than someone who spent all their time hunting for H/M and ended up with fewer QA issues but probably more interesting information (which will never reach the sponsor if their QA report is invalidated for being too short, sadly).

On the other hand, the bot crew might not spend much time during a specific contest, but it adds value before the contest by neutralizing H/M issues and, in doing so, contributes to increasing the number of contests with no H/M issues.

In short:

  1. Having a well-graded QA report is not always correlated with having looked at the code thoroughly.
  2. Bots are the main reason contests with no H/M issues will become increasingly frequent, and yet rewards are fully given to QA reports, which should not be the case, taking 1. into account.

twicek closed this as completed Oct 31, 2023
twicek reopened this Oct 31, 2023
@alexbabits

Additional ideas for changing payment distribution when no HM is found

Currently 100% of HM rewards get transferred down to the QA reports: "In the unlikely event that zero high- or medium-risk vulnerabilities are found, the HM award pool will be divided based on the QA Report curve." - https://docs.code4rena.com/awarding/incentive-model-and-awards

Idea 1

Giving other sections a higher split: Because all issues found in QA reports are classified as low or lower, it may make sense to treat the value provided to the protocol via the bot findings, gas optimization suggestions, and analysis reports (high-level overviews) as being on a similar "tier" to the low findings in the QA report. Therefore, an increase in awards could make sense for the gas optimizoooors, and/or bot reports, and/or analysis reports when no HM are found.

Idea 2

Keeping it as it is: A finding can be low because the impact is high but it's very unlikely to happen. This should still be classified as a "tier above" all the gas/bot/analysis value provided, since it is technically a real, tangible potential security issue. This argues there is a tangible tier difference between a "real" low and any gas/info/nc/overview stuff. This philosophy is more akin to the situations where a protocol has 1 medium total that is awarded the whole HM split, which makes sense to me at least, since that is "a tier above" everything else in terms of tangible security issues.

Idea 3 (from sockdrawer C4 Staff)

Fine-grained "per-low" payments: "...as protocols become more secure in the future (as is the goal!) there should hopefully be a lot more cases where more nuance in how awards are distributed is required. ...I haven't documented this thought, but the way I have believed this ultimately will go is rather than just stack ranking QA, in the event of no HMs, we split out QA to be judged as individual lows like we used to at the very beginning of C4."
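
A rough sketch of what Idea 3 could look like in practice, assuming an equal slice of the HM pot per unique low and an even split among duplicate reporters (both are assumptions for illustration, not C4's actual or historical judging mechanism):

```python
# Toy illustration of per-low awarding (hypothetical, not C4's actual curve):
# instead of stack-ranking whole QA reports, treat each accepted low as its own
# awardable finding and split the H/M pot across them.
from collections import defaultdict

def per_low_awards(hm_pot: float, lows: dict[str, list[str]]) -> dict[str, float]:
    """lows maps each de-duplicated low-severity issue to the wardens who reported it."""
    slice_per_low = hm_pot / len(lows)                       # equal slice per unique low
    payouts: dict[str, float] = defaultdict(float)
    for wardens in lows.values():
        for warden in wardens:
            payouts[warden] += slice_per_low / len(wardens)  # duplicates share the slice evenly
    return dict(payouts)

# Example: a 30k pot, 3 unique lows, one reported by two wardens.
print(per_low_awards(30_000, {
    "L-01": ["alice"],
    "L-02": ["bob"],
    "L-03": ["alice", "carol"],
}))
# -> {'alice': 15000.0, 'bob': 10000.0, 'carol': 5000.0}
```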

Idea 4

It's also worth considering whether the "static" sectors like Judge, Lookout, & Scout should be changed when no HM is found. Just throwing this one out there.

Notes

I think an important thing is to crystallize C4's philosophy of payments for the "minor" sections.

Why are they paying wardens? Payments based on participation versus payments based on direct value given to the protocol? What kind of balance would be best, if any, and why?

Currently, it looks like only 1 report from each "minor" section (QA, Gas, Bot, Analysis) is given to the protocol in the final report. What weight should each of these sections have in the case of no HM, and why? And within each section, do the current weights make sense?

  • QA reports offer valid lows for the protocol to consider/fix. This directly helps the protocol.
  • Bot reports help get all the common standard issues publicized quickly, which eases the judges' job by removing the need to look through standardized issues later on. This also saves wardens from having to worry about all the standardized issues, and it helps the protocol with any standardized issues they want to deal with.
  • Gas reports help the protocols who are eager to save and optimize gas.
  • Analysis reports help showcase a deep understanding of the protocol, providing a high level overview of the security, risks, tips, thoughts, and structure of the protocol, which can be useful for the developers/users/builders in the protocol's ecosystem moving forward after the audit.

@GalloDaSballo

My perspective, which is consistent with my comment above, is that, as a project, we pay for C4 because we want the highest level of scrutiny.

As a project, we want as many wardens as possible to spend as much time as possible and try as varied attacks as possible against our codebase.

Ideally we'd want all wardens to participate for 100% of the time.
Changing the reward incentives will result in a lower EV for wardens, which will result in them spending less time on each contest

As a Warden, it is very clear to me that when a codebase has few bugs, a single bug could result in a high reward.
And this is a high-risk, high-reward scenario.
Sometimes this has resulted in "scam" findings taking 100% of the pot.
Sometimes this has resulted in QA reports taking 100% of the pot.

The rule of splitting the H/M pot across QA awards when no H/M is found is a last line of defense to ensure some level of EV for wardens, to justify them spending extra time attempting as complex attacks as possible.

Removing that directly incentivizes a smash-and-grab style, in which we would all focus on finding a few findings and moving on to the next contest.

Asserting that a bot has "cleaned up" the codebase of valid findings is, to be gentle, a myopic comment as of today, considering the vast amount of time that is wasted arguing whether a finding is in scope or not when a prerequisite was flagged by bots.

But most importantly, it completely misses the point that sponsors are not paying for the bugs; they are paying for the process that gets bugs found.

Bugs can be found by bot racers, but they do not apply the same process, and as such they are not awarded the same prize

IllIllI000 commented Jan 6, 2024

  1. Aren't the bots preventing a big chunk of the scam findings, making the single bug even more lucrative?
  2. How does including bots in the payout when there are no H/M findings affect the 'last line of defense'? Doesn't it make wardens spend more time, since they don't have to report junk and can focus on higher-value stuff? I think a person is more likely to stop early, knowing that there's a QA split and that they've submitted a QA report, than if the pot were split between all participants (which would make each person get less).

I recently got a large payout from a contest where I didn't submit anything for the bot race, but submitted my bot's (filtered) output, and that contest just happened to have no H/M findings. I believe I got 3x more than the winning bot for submitting content as a QA report. It's a completely arbitrary distinction given that the content submitted for each is the same. Excluding bots will just lead to things like them saving findings for their QA reports 'just in case'.

@GalloDaSballo

Bots being added changes the EV of wardens; it's not arbitrary, it's advantageous to a small group that is capped at 20 people and that you belong to.

It's giving away extra EV to people who have to contribute no additional work.

Conflating a finding with value offered to sponsors is the fundamental problem with all of this logic.

Sponsors are paying for people to check the code.
Findings are how the organization pays out rewards.

Those are distinct concepts, and they are only partially aligned with the interest of the sponsor.


Per your own example, you have done no additional work that is unique to the codebase

If every warden behaved in the same way, and sponsors had a way to predict that behaviour, no sponsor would ever pay for a contest

@IllIllI000

I agree that it benefits me, but my point was that lower payouts for everyone, if a QA split occurs, are better, because people will spend more time looking if they know they won't get much otherwise. Hoping to get a QA split with manual QA report submissions is not a good EV use of time. You didn't address that aspect.

Nowhere here have I (or, I believe, anyone) claimed that the individual bot findings are valuable to sponsors. What we're saying is that the exclusion of findings from what wardens can submit is in and of itself valuable, because it shrinks the pool of possible findings, increasing payouts for all remaining findings. If that's a valuable service (it's C4's stated goal for the bot races), and there is a disincentive for bots to provide that service in the form of a distraction by a possible QA split, then shouldn't that disincentive be fixed somehow?

As twicek pointed out, the people who submit long QA reports generally aren't the ones finding the extra H/Ms, and yet they get most of the reward when the pot is split among QA. If your argument were that only downgraded M->L findings should get the split (provided there were no gaming of that mechanic), I'd agree that bots shouldn't be included. But right now QA is still the same as bot reports, and a lot of the time the QA winner hasn't done any work that's unique to the codebase either (aside from manually doing what bots are doing).

@ryanjshaw

Is the awards data available somewhere in an easy-to-query format?

A useful data point for this discussion would be:

to what extent do the people who have found M and H issues overlap with the people who have submitted QA reports in contests where no M or H was found?
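
For example, assuming the awards could be exported to a flat CSV with columns warden, contest, and finding_type (a made-up schema and file name, just to illustrate the query), the overlap could be computed with something like:

```python
# Hypothetical overlap query; "c4_awards.csv" and its columns are assumptions,
# not an actual C4 export.
import pandas as pd

awards = pd.read_csv("c4_awards.csv")

# Wardens who have ever been paid for an H or M finding.
hm_wardens = set(awards.loc[awards["finding_type"].isin(["H", "M"]), "warden"])

# Contests where no H/M finding was awarded at all.
hm_by_contest = awards.groupby("contest")["finding_type"].apply(
    lambda kinds: kinds.isin(["H", "M"]).any()
)
no_hm_contests = set(hm_by_contest[~hm_by_contest].index)

# Wardens paid for QA reports in those no-H/M contests.
qa_in_no_hm = set(
    awards.loc[
        awards["contest"].isin(no_hm_contests) & (awards["finding_type"] == "QA"),
        "warden",
    ]
)

overlap = hm_wardens & qa_in_no_hm
print(f"{len(overlap)} of {len(qa_in_no_hm)} no-H/M QA recipients have also found H/M issues")
```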
