sometimes owl bot doesn't run #2204

Closed
SurferJeffAtGoogle opened this issue Jun 29, 2021 · 8 comments · Fixed by #2228
Labels
bot: owl-bot priority: p2 Moderately-important priority. Fix may not be included in next release.

Comments

@SurferJeffAtGoogle
Contributor

On bilge pump duty this morning, I've already seen 5 pull requests where owl bot never ran, so I couldn't merge them.

For example: googleapis/nodejs-service-management#63

These pull requests would already have merged if not for owl bot, so this is slowing us down.

@SurferJeffAtGoogle added the bot: owl-bot and priority: p2 labels on Jun 29, 2021
@tmatsuo
Contributor

tmatsuo commented Jun 29, 2021

Do we know why? Was there a burst of webhook requests at the time the PR was created?

@SurferJeffAtGoogle
Contributor Author

@bcoe has some ideas why.

@bcoe
Contributor

bcoe commented Jun 30, 2021

So far, whenever I've dug into these issues, I've found a corresponding set of timeouts on GitHub webhooks, as described here:

#2114

I think once we address #2114, we might notice that failures to launch the post processor are rare.

@bcoe
Contributor

bcoe commented Jul 2, 2021

@SurferJeffAtGoogle we found a bug affecting some of the repos you linked: we were failing to report a failed status check when the config was invalid (some repos had a slightly off config).
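To illustrate the kind of fix described here, below is a minimal sketch, not OwlBot's actual code. It assumes the config is a mapping with a required `docker.image` string; that schema, the `CheckResult` shape, and the function names are hypothetical. The point is only that validation always yields a result, so an invalid config surfaces as a failed check instead of the bot appearing never to run.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    """What the bot would report on the pull request (hypothetical shape)."""
    conclusion: str  # "success" or "failure"
    summary: str


def validate_config(config: object) -> list[str]:
    """Return a list of problems; an empty list means the config is usable."""
    if not isinstance(config, dict):
        return ["config must be a mapping"]
    problems = []
    docker = config.get("docker")
    # Assumed requirement for this sketch: docker.image must be a string.
    if not isinstance(docker, dict) or not isinstance(docker.get("image"), str):
        problems.append("docker.image must be a string")
    return problems


def check_for_config(config: object) -> CheckResult:
    """Always produce a result, so an invalid config shows up as a failure."""
    problems = validate_config(config)
    if problems:
        return CheckResult("failure", "Invalid config: " + "; ".join(problems))
    return CheckResult("success", "Config is valid; the post processor can run.")
```

With this shape, the caller posts whatever `check_for_config` returns, so there is no code path where a bad config exits silently.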

I believe the other failures to run were a result of us missing webhooks when traffic is bursty. @tmatsuo has addressed this by splitting our frontend for OwlBot apart from our backend job processing. In testing, we were able to handle 80 concurrent jobs without any failures.
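The frontend/backend split can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual OwlBot implementation: the `WebhookService` class and its methods are hypothetical, and an in-process `queue.Queue` with worker threads stands in for whatever durable queue and job runner a real deployment would use. The frontend only enqueues the event and acknowledges it, so a burst of webhooks is not held up by slow jobs.

```python
import queue
import threading


class WebhookService:
    """Hypothetical sketch: a fast frontend in front of slow backend workers."""

    def __init__(self, n_workers: int = 4) -> None:
        self._jobs = queue.Queue()
        self._lock = threading.Lock()
        self.processed = []  # PR numbers the backend has handled
        self._threads = [
            threading.Thread(target=self._worker) for _ in range(n_workers)
        ]
        for t in self._threads:
            t.start()

    def handle_webhook(self, payload: dict) -> int:
        """Frontend: enqueue the event and acknowledge right away."""
        self._jobs.put(payload)
        return 200  # respond before any slow work happens

    def _worker(self) -> None:
        """Backend: pull jobs off the queue until a None sentinel arrives."""
        while True:
            payload = self._jobs.get()
            if payload is None:
                return
            # Stand-in for the slow post-processing job.
            with self._lock:
                self.processed.append(payload["pr"])

    def shutdown(self) -> None:
        """Queue one sentinel per worker, then wait for all of them to exit."""
        for _ in self._threads:
            self._jobs.put(None)
        for t in self._threads:
            t.join()
```

For example, sending a burst of events with `svc.handle_webhook({"pr": i})` and then calling `svc.shutdown()` leaves every PR number in `svc.processed`, because the sentinels are queued behind all pending jobs.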
