Reducing GitHub API calls to scale scanning repositories #202
Comments
Also, this approach could be used to scan non-GitHub repositories like
Cloning might be too heavyweight for some big repos, and slow too. Maybe let's start with httpcache first. We can also scale the number of GitHub tokens. Right now we have 2; we can easily go to 4-5 (separated with commas).
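As a rough illustration of spreading requests over several comma-separated tokens, here is a minimal round-robin sketch. The `tokenRotator` type and `newTokenRotator` function are hypothetical names made up for this example; they are not Scorecard's actual implementation, and the token values are placeholders.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"sync/atomic"
)

// tokenRotator hands out tokens in round-robin order.
// Hypothetical helper for illustration; not part of Scorecard.
type tokenRotator struct {
	next   uint64 // kept first so atomic access is 64-bit aligned
	tokens []string
}

// newTokenRotator splits a comma-separated list and drops blank entries.
func newTokenRotator(list string) (*tokenRotator, error) {
	var tokens []string
	for _, t := range strings.Split(list, ",") {
		if t = strings.TrimSpace(t); t != "" {
			tokens = append(tokens, t)
		}
	}
	if len(tokens) == 0 {
		return nil, errors.New("no tokens provided")
	}
	return &tokenRotator{tokens: tokens}, nil
}

// Next returns the next token in rotation; it is safe for concurrent use.
func (r *tokenRotator) Next() string {
	n := atomic.AddUint64(&r.next, 1) - 1
	return r.tokens[n%uint64(len(r.tokens))]
}

func main() {
	// Placeholder values; real tokens would come from the environment.
	r, err := newTokenRotator("tokenA, tokenB,tokenC")
	if err != nil {
		panic(err)
	}
	for i := 0; i < 4; i++ {
		fmt.Println(r.Next())
	}
}
```

Plain rotation does not look at how much quota each token has left; a fuller version would probably skip tokens that are currently rate limited.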
Cloning can be async, as another cron job, and it is a one-time effort. Cloning should not run as part of scorecard; scorecard should probably get an additional option to look in a location for cached git repos and, if they are not there, fetch them from github.com.
Makes sense.
Closing this since we are already tracking this here: #318
GitHub API calls are throttled, which makes it hard to scale the number of repositories we scan and provide results for.
The code has to wait tens of minutes before continuing:
{"level":"warn","ts":1613869247.8747272,"caller":"roundtripper/roundtripper.go:139","msg":"Rate limit exceeded. Waiting 44m34.125286853s to retry..."}
The Scorecard checks for these don't need the GitHub API; they only require a Git API.
Potential solution
cron - to get the updates.
The https://github.com/go-git/go-git project provides a Git API that could be used to avoid the GitHub API limitations.
With httpcache (#80 (comment)) and fewer GitHub API calls, we should be able to scale the number of repositories scanned.
Related to #80