Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix GH rate limit issues experienced by contributors forking open source repos that have Codecov Installed #1574

Open
Tracked by #1496
rohan-at-sentry opened this issue Apr 15, 2024 · 6 comments
Assignees
Milestone

Comments

@rohan-at-sentry
Copy link

rohan-at-sentry commented Apr 15, 2024

The issue is manifested in "tokenless" flow that was introduced to support v4. It would appear we're performing checks to verify provenance which doesn't afford that much protection.

ALso, explore making token requirement for open source repos optional to alleviate challenges to set up Codecov. However, this needs some level of privilege so a bad actor cannot simply make it a requirement.

@rohan-at-sentry
Copy link
Author

@rohan-at-sentry
Copy link
Author

rohan-at-sentry commented Apr 16, 2024

Path forward

Step 1

Solving the GitHub rate limit issue

Tokenless authentication performs two GitHub API requests:

These checks don’t afford much protection. Neither matters for private repos because tokens are required unconditionally. For public repos, the first can be satisfied by scraping CI history or just opening a PR and the second doesn’t really prevent anything in particular.

Both checks can be removed if we rework the logic of tokenless/token auth:

  • If a repo is private, require a token
  • If a repo is public, and the branch name begins with a prefix (e.g. forkname:main), don’t require a token. Uploading for forks is the Wild West.
    • codecov-cli already detects if it is uploading for a fork and munges the branch name accordingly
  • If a repo is public, and the branch name does not begin with a prefix (e.g. main), require a token. Forks/trolls should not be able to clobber your project’s authoritative coverage data.

Under this scheme, branches from forks (forkname:*) are not authenticated and can be spoofed, but branches from the upstream repo (main) are protected.

We need to implement this logic on both upload endpoints to solve the problem. These three bullet points should each have their own unit test.

Step 2

Making tokens optional without undermining their security

By default, tokens should not be required for public repos so that users who relied on the authentication of the old upload endpoint are able to upgrade their GH Action and onboard to the CLI. Some public repos will want to require tokens for their repos, and we should build a way for them to opt themselves into that requirement.

Enabling/disabling token requirements needs to be a privileged operation so that a bad actor can’t just disable it to troll. This means it can’t be done with a CLI switch or in a repo’s codecov.yml. Instead, we should build the ability in Gazebo for an admin to enable/disable a token requirement.

Phase 1

  • Create a technical design for supporting repo level authentication (i.e. ensure an authorized/designated user can access a specific page - settings for example)

Phase 2

  • We already have an org-level config page today, so we could add an org-level switch that affects all repositories. SWAG estimate: 1w for apps team. 1w for platform to wire it up.
  • In many cases, an org has tons of repos which are effectively owned by individuals (e.g. projects under the apache umbrella) and an org-wide setting won’t be workable. For these cases, we will need to build out:
    • the concept of, and UI for, repo-level admins in Codecov’s permission model. SWAG estimate: 3w for platform, 3w for apps team.
    • UI for repo-level configuration + a “require tokens” setting with options “yes”, “no”, and “inherit”. SWAG estimate: 3w for apps team, 1w for platform to wire it up.

@rohan-at-sentry
Copy link
Author

rohan-at-sentry commented Apr 16, 2024

Expected Upload experience from Forks as a result of Step 1

  • Largely unchanged from v4, except uploads from tokenless will not trigger API checks. This should reduce instances of these uploads failing due to rate limit errors from Codecov.

@matt-codecov
Copy link

a couple implementation details for Step 1 that i am dumping here:

  • node uploader does not do branch-name munging. consider adding the forkname:branchname behavior there
  • codecov-cli needs to send the branch name on create-commit, create-report, and do-upload. it only does it on one or two right now
  • if we change when a token is/isn't required on the old upload endpoint at all, do it behind a Feature so we can insta-disable or selectively deploy

if this is put on anyone else's plate, happy to help them ramp up / write up more organized technical steps

@matt-codecov
Copy link

https://github.com/codecov/worker/blob/02be4f56149316f6be42fdbfa79d162dd32dd4d3/services/repository.py#L253-L258 leaving a code pointer here, this logic in worker will do forkname:branchname munging. i don't know if it will double-apply the munging to branches that had their branchname munged by the CLI already

@rohan-at-sentry
Copy link
Author

folks at codecov/feedback#358 are facing issues with forks being rate limited

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants