Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add retries to GitHub client for GetModifiedFiles on 404 #2013

Conversation

elementalvoid
Copy link

This is a follow on to resolve similar issues to #1019.

In #1131 retries were added to GetPullRequest. And in #1810 a backoff was included.

However, those only resolve one potential request at the very beginning of a PR creation. The other request that happens early on during auto-plan is one to ListFiles to detect the modified files. This too can sometimes result in a 404 due to async updates on the GitHub side.


My team recently upgraded a few Atlantis instances that were pretty old. They didn't yet include the fixes described above.

We upgraded to v0.18.1. After upgrading we were hopeful our dev teams would be happy to know these race condition errors were a thing of the past. But in only a couple of days, we got another report!

I was able to find an error log with the following message (org/repo/pr-number redacted):

GET https://api.github.com/repos/{owner}/{repo}/pulls/{number}/files?per_page=300: 404 Not Found []

And the following stacktrace:

github.com/runatlantis/atlantis/server/events.(*PullUpdater).updatePull
	github.com/runatlantis/atlantis/server/events/pull_updater.go:14
github.com/runatlantis/atlantis/server/events.(*PlanCommandRunner).runAutoplan
	github.com/runatlantis/atlantis/server/events/plan_command_runner.go:77
github.com/runatlantis/atlantis/server/events.(*PlanCommandRunner).Run
	github.com/runatlantis/atlantis/server/events/plan_command_runner.go:221
github.com/runatlantis/atlantis/server/events.(*DefaultCommandRunner).RunAutoplanCommand
	github.com/runatlantis/atlantis/server/events/command_runner.go:163

The fixes for #1019 added retries for the /repos/{owner}/{repo}/pulls API but not for the /repos/{owner}/{repo}/pulls/{number}/files API.


Extra updates:

  • Update a variable name in NewGithubClient: It was named transport but it was not a Transport type. It's really an *http.Client configured with the required GitHub credentials.
  • Simplified the ListOptions and nextPage logic. All ListFiles calls now include the page number.
    • This enabled updating the debug logs to include a page number. It also means that when an error like this crops up again we'll know which page the error happened on.

This is a follow on to resolve similar issues to runatlantis#1019.

In runatlantis#1131 retries were added to GetPullRequest. And in runatlantis#1810 a backoff
was included.

However, those only resolve one potential request at the very
beginning of a PR creation. The other request that happens early on
during auto-plan is one to ListFiles to detect the modified files. This
too can sometimes result in a 404 due to async updates on the GitHub
side.
@elementalvoid elementalvoid requested a review from a team as a code owner January 21, 2022 06:12
@chenrui333 chenrui333 changed the title Add retries to GitHub client for GetModifiedFiles on 404 fix: add retries to GitHub client for GetModifiedFiles on 404 Jan 21, 2022
@elementalvoid
Copy link
Author

elementalvoid commented Jan 21, 2022

Oops. Looks like @iainlane beat me to it with #2002.

Feel free to close as desired!

@chenrui333
Copy link
Member

@elementalvoid I am closing this issue in favor of #2002, feel free to raise a new one if you see fit.

@chenrui333 chenrui333 closed this Jan 25, 2022
@elementalvoid elementalvoid deleted the retry-github-get-modified-files branch January 25, 2022 16:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants