Don't Remove Support to `--once` #1339

t0rr3sp3dr0 · 2021-09-14T22:26:55Z

On PR #660, the --ephemeral flag was introduced with the intent of replacing the --once flag. But by deregistering and reregistering the runner every time, the number of API calls to GitHub dramatically increases, and eventually, the rate limit is exceeded.

I experienced before the --ephemeral even existed. Every time I wanted to run a job, I was registering a new runner, starting it with --once, and deregistering it after the job was completed. Quickly the maximum number of API calls allowed to a GitHub App was exceeded. I solve the problem by keeping the credentials of runners and only registering and deregistering runners on upscale and downscale operations.

With the removal of the --once flag, it will not possible to run a single job on the runner without removing it when the job is completed. Effectively limiting the number of ephemeral runners that can be used within an hour. Using a PAT, you are limited to 2500 jobs run by a self-hosted runner, assuming the only operations you do are runner registration and deregistration. With a GitHub App, in the best case, you will be able to run 6250 jobs, or 7500 with GitHub Enterprise Cloud, which is not reasonable for a lot of companies.

@TingluoHuang @pje

The text was updated successfully, but these errors were encountered:

TingluoHuang · 2021-09-15T18:22:09Z

With --ephemeral the deregistering of the runner is done by the service and not counted as part of your API rate limit.

t0rr3sp3dr0 · 2021-09-15T19:17:36Z

Ok, then we have double the number of jobs that I estimated. But even then you are limiting the number of jobs that can be run by self-hosted runners within an hour.

There's also another problem to consider. Even if you have enough number of operations, registering multiple runners at the same time triggers the secondary rate limit. This was also a problem that I faced.

Unless registration of runners are exempt from rate limits, including secondary ones, I think we shouldn't remove this feature.

Also I'm not sure that's interesting for GitHub, as you're making every user of this feature increase their number of API calls towards your severs.

thboop · 2021-09-15T19:30:27Z

Hey @t0rr3sp3dr0

The runner registration token is reusable for up to an hour, are you able to cache and reuse them? If so, it should be much easier to keep within api limits. The runner registration only uses a single api limit per runner (so 1 per job).

t0rr3sp3dr0 · 2021-09-16T04:58:10Z

Yes, I can definitely cache the registration token. Then I would be spending one API operation per repository, as I register runners on repositories and not on the organization.

The limit now becomes very reasonable. That would only be a problem if I had over 10k repositories, but I don't even have 1k repositories.

t0rr3sp3dr0 · 2021-09-16T05:21:22Z

Actually, I noticed another problem. Maybe you have a suggestion for that as well.

The registration token cache works on our scenario because we run our Linux runners on Kubernetes and we have an operator that deals with everything.

But we also have a fleet of macOS runners and in this case we need to cleanup the environment between jobs. So we start macOS with the runner using --once and after the job is processed, we restore a snapshot to ensure we always have the same environment for any job run. This is possible because when the snapshot is taken the runner is already registered.

But by deregistering it at the end of the job when --ephemeral is used, the runner will not be able to start again after the snapshot is restored, as its credentials got invalided when the runner was deregistered on the server-side.

A workaround would be to have on the machine credentials with power to register a new runner when the snapshot is taken. But this poses a security risk that I'm not willing to take.

You might have some idea as GitHub manages its on fleet of macOS runners and solves this same problem somehow.

thboop · 2021-09-17T16:23:27Z

Hey @t0rr3sp3dr0 ,

Thanks for the feedback here. A few things our team is working on:

We aren't going to remove --once flag just yet. Our team views ephemeral as the future, but we want to ensure we don't break your current workflows as we move towards it
We are looking into possibly providing a separate rate limit for the runner registration endpoint with a higher value, it wouldn't count against your normal rate limit quota. Rate limits allow us to prevent malicious users from doing denial of service attacks, but we want it high enough to not impact legit usage. I'll follow up with more exact details on numbers here in the near future.
We are working on updating our docs with best practices to manage a pool of ephemeral runners, we will touch on caching and how to manage tokens in those docs. Typically this is an operator pattern, as you are already using on linux. Our macOS machines get a token passed in in a similar pattern

Please follow up if you have any questions or concerns, and we appreciate the feedback!

t0rr3sp3dr0 · 2021-09-20T03:30:33Z

Thanks for the info, @thboop!

I just got worried because there’s a comment on the --once flag suggesting it would get removed by October 2021.

runner/src/Runner.Common/Constants.cs

Line 131 in f259e57

public static readonly string Once = "once"; // TODO: Remove in 10/2021

MichaelJJ · 2021-09-21T18:56:06Z

Providing a separate rate limit for the runner registration endpoint would be a great help, we're looking to use this for a large enterprise with many repos, so caching the token may not help that much.

thboop · 2021-09-27T19:19:40Z

We won't be deleting --once in the near future, and we just rolled out a new rate limit for just the runner registration endpoint that supports (10,000) registrations an hour. If this limit isn't high enough for your use case, please create a support ticket and we can move forward from there.

You can see the status of this new rate limit at the https://api.github.com/rate_limit endpoint if you are authenticated.

"actions_runner_registration": {
        "limit": 10000,
        "used": 1,
        "remaining": 9999,
        "reset": 1632884000
}

Going to close this out.

MichaelJJ · 2021-09-27T20:21:10Z

@thboop Does the endpoint limit count for enterprise, organization and repo registration tokens?

I've tested all 3 and I see 0 used for actions_runner_registration but the core used count increases instead.

MichaelJJ · 2021-09-28T22:29:45Z

@thboop As a heads up, tried again today, the actions_runner_registration is not being used, still seeing API limit being consumed. I'm using an OAuth token generated from an app installed in an enterprise.

I requested both enterprise and repository runner tokens.

"actions_runner_registration": {
            "limit": 10000,
            "used": 0,
            "remaining": 10000,
            "reset": 1632871719
        }

thboop · 2021-09-28T22:56:49Z

Hey @MichaelJJ,
the limit only applies to registering a runner when using ./config.sh or the windows equivalent. The actual enterprise and repository tokens are cacheable for up to an hour, so it's not as pressing to have a separate rate limit for them. You aren't able to cache registering an ephemeral runner, so we wanted to give a higher limit for that.

If changing or configuring those rate limits would be helpful for you, I'd be curious what your use case is. We could definitely do it, feel free to open an issue and please provide more information on your motivation for it!

MichaelJJ · 2021-09-28T23:02:46Z

We are using the endpoints: https://api.github.com/enterprises/enterprise/actions/runners/registration-token and https://api.github.com/orgs/org/actions/runners/registration-token

We are looking to use the https://github.com/actions-runner-controller/actions-runner-controller and our own tools to create runner tokens for registering runners, also our enterprise has few enterprise users so we have a proxy API on-premise to allow a normal user to get a registration token.

I'll create an issue to have the enterprise/org endpoints added to the separate rate limit.

t0rr3sp3dr0 added the enhancement New feature or request label Sep 14, 2021

thboop mentioned this issue Sep 21, 2021

Support configure runner as ephemeral. #660

Merged

thboop closed this as completed Sep 27, 2021

toast-gear mentioned this issue Dec 27, 2021

Ephemeral runner actions/actions-runner-controller#831

Closed

toast-gear mentioned this issue Mar 9, 2022

Remove the ability to use the --once flag entirely only allowing --ephemeral to be set actions/actions-runner-controller#1196

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Don't Remove Support to `--once` #1339

Don't Remove Support to `--once` #1339

t0rr3sp3dr0 commented Sep 14, 2021

TingluoHuang commented Sep 15, 2021

t0rr3sp3dr0 commented Sep 15, 2021

thboop commented Sep 15, 2021 •

edited

Loading

t0rr3sp3dr0 commented Sep 16, 2021

t0rr3sp3dr0 commented Sep 16, 2021

thboop commented Sep 17, 2021 •

edited

Loading

t0rr3sp3dr0 commented Sep 20, 2021

MichaelJJ commented Sep 21, 2021

thboop commented Sep 27, 2021 •

edited

Loading

MichaelJJ commented Sep 27, 2021 •

edited

Loading

MichaelJJ commented Sep 28, 2021 •

edited

Loading

thboop commented Sep 28, 2021 •

edited

Loading

MichaelJJ commented Sep 28, 2021 •

edited

Loading

Don't Remove Support to --once #1339

Don't Remove Support to --once #1339

Comments

t0rr3sp3dr0 commented Sep 14, 2021

TingluoHuang commented Sep 15, 2021

t0rr3sp3dr0 commented Sep 15, 2021

thboop commented Sep 15, 2021 • edited Loading

t0rr3sp3dr0 commented Sep 16, 2021

t0rr3sp3dr0 commented Sep 16, 2021

thboop commented Sep 17, 2021 • edited Loading

t0rr3sp3dr0 commented Sep 20, 2021

MichaelJJ commented Sep 21, 2021

thboop commented Sep 27, 2021 • edited Loading

MichaelJJ commented Sep 27, 2021 • edited Loading

MichaelJJ commented Sep 28, 2021 • edited Loading

thboop commented Sep 28, 2021 • edited Loading

MichaelJJ commented Sep 28, 2021 • edited Loading

Don't Remove Support to `--once` #1339

Don't Remove Support to `--once` #1339

thboop commented Sep 15, 2021 •

edited

Loading

thboop commented Sep 17, 2021 •

edited

Loading

thboop commented Sep 27, 2021 •

edited

Loading

MichaelJJ commented Sep 27, 2021 •

edited

Loading

MichaelJJ commented Sep 28, 2021 •

edited

Loading

thboop commented Sep 28, 2021 •

edited

Loading

MichaelJJ commented Sep 28, 2021 •

edited

Loading