Runner Observability #1116
Comments
We are looking for similar features, but we would prefer not to build them ourselves. I think this could be solved nicely by GA-provided dashboards (read-only, accessible to all devs) that show statistics for runners, workflows, label bottlenecks, load balancing, and so on. Right now we work around this in two ways (wip):
Another metric that would make sense is, for instance, queue length. In GitLab land there are excellent ways of getting observability out of the runner via Prometheus exporters. I wish the GitHub runner took a similar approach.
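To make the queue-length idea concrete, here is a minimal sketch of what an exporter could emit in the Prometheus text exposition format. The metric name `runner_queue_length`, the `runner_label` label, and the `QueueSample` type are all hypothetical; nothing here is provided by the GitHub runner today.

```typescript
// Hypothetical snapshot of the job queue; these names are illustrative only.
interface QueueSample {
  runnerLabel: string; // runner label the jobs are waiting for, e.g. "self-hosted"
  queuedJobs: number;  // number of jobs currently waiting
}

// Escape a label value as required by the Prometheus text format.
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

// Render the samples as a gauge in the Prometheus text exposition format.
function renderQueueMetrics(samples: QueueSample[]): string {
  const lines: string[] = [
    "# HELP runner_queue_length Jobs waiting for a runner, by runner label.",
    "# TYPE runner_queue_length gauge",
  ];
  for (const s of samples) {
    lines.push(
      `runner_queue_length{runner_label="${escapeLabelValue(s.runnerLabel)}"} ${s.queuedJobs}`
    );
  }
  return lines.join("\n") + "\n";
}
```

Served from an HTTP endpoint, text like this could be scraped by Prometheus and graphed per label to spot bottlenecks.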
I found https://github.com/Spendesk/github-actions-exporter and thought I'd post it on this issue, as I think people will find it useful. I haven't tested it personally, but it implements Prometheus exporters for data you can get from the API, covering some of the things you would want to track and providing some much-needed observability (I wish these statistics were just baked into the github.com UI, though!). One of the big limitations of this approach is that there is no observability at the step level. If builds are taking longer, is that because there is a problem, or because we aren't hitting the cache as often? For example
We recently published an ADR for Job Started / Job Completed hooks for self-hosted runners; feel free to provide your feedback. In particular, we would love to hear what else (if anything) you would need to support your use case, and whether the interface makes sense for you.
We've shipped a beta of this functionality in
Prerequisites
Nature of problem
Assume that you have (like us) over 100 developers and dozens or hundreds of workflows, all sharing the same self-hosted runner(s).
You have no oversight of who hijacks the runners. Hijacking means hogging any form of resource:
Describe the enhancement
The cleanest enhancement would be a form of extension hooks. When a job starts, a hook of some form gets called, and within this hook you can define your own actions. Maybe something in the style of swizzling, where the native hook does nothing (or writes a console log) and you can swizzle the component to add your own action.
When the job completes, another hook gets called, with which you can complete your observability.
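The hook idea described above could be sketched roughly as follows. This is only an illustration in TypeScript: `RunnerHooks`, `JobContext`, and `UsageTracker` are hypothetical names, not an existing runner API, and a real .NET implementation would look different. The default hooks do nothing, and an observer replaces them to measure how long each actor occupies the runner.

```typescript
// Hypothetical job metadata passed to each hook.
interface JobContext {
  jobId: string;
  workflow: string;
  actor: string; // user who triggered the job
}

// A hook receives the job context and a timestamp in milliseconds.
type JobHook = (ctx: JobContext, atMs: number) => void;

// Default hooks are no-ops, so a runner without custom hooks behaves as before.
class RunnerHooks {
  onJobStarted: JobHook = () => {};
  onJobCompleted: JobHook = () => {};
}

// Example observer: accumulates busy time per actor across jobs.
class UsageTracker {
  private startedAt: Record<string, number> = {};
  readonly busyMsByActor: Record<string, number> = {};

  // Replace the default no-op hooks with tracking hooks.
  attach(hooks: RunnerHooks): void {
    hooks.onJobStarted = (ctx, atMs) => {
      this.startedAt[ctx.jobId] = atMs;
    };
    hooks.onJobCompleted = (ctx, atMs) => {
      const start = this.startedAt[ctx.jobId];
      if (start === undefined) return; // completion without a recorded start
      delete this.startedAt[ctx.jobId];
      const total = this.busyMsByActor[ctx.actor] ?? 0;
      this.busyMsByActor[ctx.actor] = total + (atMs - start);
    };
  }
}

// Usage: the runner would call the hooks around each job.
const hooks = new RunnerHooks();
const tracker = new UsageTracker();
tracker.attach(hooks);
const job: JobContext = { jobId: "42", workflow: "ci", actor: "alice" };
hooks.onJobStarted(job, 1000);
hooks.onJobCompleted(job, 4000);
```

Keeping the defaults as no-ops means the hooks add no behavior unless someone opts in, which is the property the swizzling comparison is after.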
Code Snippet
Some pseudocode. Given that the runner is written in .NET, the real code would not look like this; I just come from the TS world.
Additional information
It might be that this concept already exists, but then it's just not documented or not findable.
Also, I once saw a /timing API, but I cannot find it anymore; it seems to have been removed.
Clearly, when enterprises start to adopt Actions, the demand for observability will rise. Are we alone? 🛸