
Proposal: GitHub self-hosted runners for specific use cases #1162

Closed
dmathieu opened this issue Sep 9, 2022 · 13 comments

Comments

@dmathieu
Member

dmathieu commented Sep 9, 2022

The OpenTelemetry Go SIG has benchmark tests that we currently run manually when needed, but would like to run automatically.
We used to run those benchmark tests with GH Actions, but an action running on GitHub's infrastructure often has noisy neighbours, making the results unreliable. So we disabled them.

After discussing with @TylerHelmuth, he and I got access to the CNCF's Equinix account, which would allow us to boot dedicated virtual machines.
We would like to be able to run our benchmark tests on self-hosted runners.

The easiest way to set up those runners would be to do it manually for the repository. That seems brittle, though, and prone to mistakes.
We could also use Terraform (or another automation system) to auto-provision the runners. AFAIK, there is no such automation within the otel organization at the moment, though.

It also seems like setting up self-hosted runners just for the Go repository would be overkill, as we would under-use them. Other SIGs may have similar needs, though, so having them at the org level may be better.

Hence this issue, to gather feedback and opinions on this need.
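For reference (not from the thread): once a self-hosted runner is registered with matching labels, pointing a workflow at it is mostly a labelling exercise. A minimal sketch, where the workflow name and the `benchmarks` label are hypothetical:

```yaml
# .github/workflows/benchmarks.yml (hypothetical)
name: benchmarks
on:
  workflow_dispatch: {}   # manual trigger; keeps the dedicated machine idle otherwise
jobs:
  bench:
    # Labels must match those assigned when the runner was registered.
    runs-on: [self-hosted, linux, x64, benchmarks]
    steps:
      - uses: actions/checkout@v4
      - run: go test -run='^$' -bench=. ./...
```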

@dmathieu
Member Author

dmathieu commented Sep 9, 2022

cc @jmacd

@jmacd
Contributor

jmacd commented Sep 21, 2022

The technical committee discussed the risks and benefits of adopting self-hosted runners. Here are some of the points discussed:

  • If possible, benchmarking with CPU performance counters can overcome the noise caused by shared-resource contention. (Note that this is not currently built into the Go benchmarking suite in a way that would help OTel-Go.)
  • There is a definite use-case for self-hosted runners that involves correctness testing, particularly for resource detectors that need to be run in actual Cloud-vendor environments (e.g., testing AWS EC2 resource detection on an AWS EC2 machine).
  • The committee worries about security (e.g., misuse of the resource).

Our overall recommendation is to see whether we can avoid the additional configuration and management associated with self-hosted runners. If there is sufficient interest in correctness integration tests, that may be more compelling. Perhaps the OTel-Go team can find a workaround, or perhaps the engineering time would be better spent manually reviewing benchmark results prior to releases.
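As an aside (an illustrative sketch, not the SIG's actual benchmark code): the run-to-run noise that motivates this issue can be quantified without performance counters, by driving Go's `testing.Benchmark` from a plain program and reporting the spread of ns/op across repeated runs.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math"
	"testing"
)

var sink uint64 // prevents the compiler from eliminating the benchmarked work

// measure runs the same micro-benchmark `runs` times and returns the mean
// and standard deviation of ns/op. On a noisy shared machine the relative
// stddev tends to be large, which is exactly what made the hosted-runner
// results unreliable.
func measure(runs int) (mean, sd float64) {
	data := make([]byte, 1024)
	samples := make([]float64, 0, runs)
	for i := 0; i < runs; i++ {
		res := testing.Benchmark(func(b *testing.B) {
			for n := 0; n < b.N; n++ {
				h := fnv.New64a()
				h.Write(data)
				sink = h.Sum64()
			}
		})
		samples = append(samples, float64(res.NsPerOp()))
	}
	for _, x := range samples {
		mean += x
	}
	mean /= float64(len(samples))
	for _, x := range samples {
		sd += (x - mean) * (x - mean)
	}
	sd = math.Sqrt(sd / float64(len(samples)))
	return mean, sd
}

func main() {
	mean, sd := measure(5)
	fmt.Printf("mean=%.0f ns/op, stddev=%.0f (%.1f%%)\n", mean, sd, 100*sd/mean)
}
```

A high relative stddev on hosted runners versus a dedicated machine would be the concrete evidence for (or against) the self-hosted proposal.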

@aktech

aktech commented Oct 10, 2022

By the way, cirun.io does the same without adding a maintenance burden.

@bobstrecansky
Contributor

There are multiple other services that provide value for this:
buildjet.com is another example.

I wanted to resurrect this proposal - it would definitely help with velocity and stability if we had consistent build servers for our projects.

@trask
Member

trask commented Mar 1, 2023

I've opened a CNCF ticket to ask just in case they already have some ARM64 runners available at the CNCF GitHub Enterprise level that we can use.

@trask
Member

trask commented Mar 1, 2023

The CNCF pointed to setting up self-hosted ARM64 runners using https://github.com/cncf/cluster (which, I realized afterwards, @dmathieu mentioned above when opening this issue).

pulling down the TC recommendation from above #1162 (comment):

Our overall recommendation is to see whether we can avoid the additional configuration and management associated with self-hosted runners. If there is sufficient interest in correctness integration tests, that may be more compelling. Perhaps the OTel-Go team can find a workaround, or perhaps the engineering time would be better spent manually reviewing benchmark results prior to releases.

@bobstrecansky is there something in the PHP repo that needs special care around ARM64 testing? Have you seen things not working or breaking on ARM64? Or is the desire for automated ARM64 testing more out of an "abundance of caution"?

@bobstrecansky
Contributor

@trask more of the latter - ARM support for our testing matrix would be a welcome addition. I'm sure other SIGs would like to have that as well.
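For illustration only (the labels are hypothetical): with ARM64 runners registered, extending a testing matrix comes down to adding a runner entry.

```yaml
jobs:
  test:
    strategy:
      matrix:
        # ubuntu-latest is GitHub-hosted x64; the second entry targets a
        # hypothetical registered self-hosted ARM64 runner by its labels.
        runner: [ubuntu-latest, [self-hosted, linux, arm64]]
    runs-on: ${{ matrix.runner }}
    steps:
      - uses: actions/checkout@v4
      - run: uname -m && make test
```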

@Kielek
Contributor

Kielek commented Mar 2, 2023

It would be great to have this for .NET AutoInstrumentation as well. See: open-telemetry/opentelemetry-dotnet-instrumentation#1865

@bobstrecansky
Contributor

@trask - also, I don't think the GitHub runners are open source? That may be a loose requirement:
https://github.com/cncf/cluster#usage-guidelines

@trask
Member

trask commented Jul 6, 2023

Watching and hoping... actions/runner-images#5631

@tylerbenson
Member

We've made some progress and now have a runner that can be used for benchmarks:
#1662

(Note: Permission must be granted for each workflow individually to avoid abuse.)

@trask
Member

trask commented Feb 6, 2024

FYI, we now have access to Arm GitHub runners (see #1821), and you can open a repo maintenance request to get access.

@dmathieu will that resolve this issue?

@dmathieu
Member Author

dmathieu commented Feb 7, 2024

Thank you @trask. Yes, this should be what we need; we'll be looking into it.
In the meantime, I believe this issue can be closed.
