Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow use action archive cache to speed up workflow jobs. #2857

Merged
merged 3 commits into from
Sep 28, 2023

Conversation

TingluoHuang
Copy link
Member

@TingluoHuang TingluoHuang commented Sep 18, 2023

To avoid network roundtrip for downloading each action's archive and speed up workflow jobs, we are introduce a new ENV to the runner, and the runner will try to find the action tarball/zipball of the resolved version in the action archive cache folder and avoid downloading it from the service.

The cache folder will be point by ACTIONS_RUNNER_ACTION_ARCHIVE_CACHE env and have the following structure:

For Linux/macOS:

./actions_checkout/
  ./SHA1.tar.gz
  ./SHA2.tar.gz
  ./SHA3.tar.gz
  ./SHA4.tar.gz
  ....
./actions_cache/
  ./SHA1.tar.gz
  ./SHA2.tar.gz
  ./SHA3.tar.gz
  ./SHA4.tar.gz
  ....

For Windows:

./actions_checkout/
  ./SHA1.zip
  ./SHA2.zip
  ./SHA3.zip
  ./SHA4.zip
  ....
./actions_cache/
  ./SHA1.zip
  ./SHA2.zip
  ./SHA3.zip
  ./SHA4.zip
  ....

The tarball or zipball will be the content from https://docs.github.com/en/rest/repos/contents?apiVersion=2022-11-28#download-a-repository-archive-tar

https://github.com/github/actions-assignment/issues/17

@TingluoHuang TingluoHuang requested a review from a team as a code owner September 18, 2023 19:10
@TingluoHuang TingluoHuang force-pushed the users/tihuang/actioncache branch 2 times, most recently from 474d7a3 to 5620568 Compare September 21, 2023 00:10
@@ -778,11 +780,6 @@ private async Task DownloadRepositoryActionAsync(IExecutionContext executionCont
executionContext.Output($"Download action repository '{downloadInfo.NameWithOwner}@{downloadInfo.Ref}' (SHA:{downloadInfo.ResolvedSha})");
}

await DownloadRepositoryActionAsync(executionContext, downloadInfo, destDirectory);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i removed the nested method DownloadRepositoryActionAsync(arg1, arg2, arg3) out of the DownloadRepositoryActionAsync(arg1, arg2)

Trace.Info($"Save archive '{link}' into {archiveFile}.");
try
{
int retryCount = 0;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i move the entire download logic behind the new DownloadRepositoryArchive()
Should not have impact on functionality.

@@ -796,7 +796,48 @@ private async Task DownloadRepositoryActionAsync(IExecutionContext executionCont
Trace.Info($"Save archive '{link}' into {archiveFile}.");
try
{
await DownloadRepositoryArchive(executionContext, link, downloadInfo.Authentication?.Token, archiveFile);
var useActionArchiveCache = false;
if (executionContext.Global.Variables.GetBoolean("DistributedTask.UseActionArchiveCache") == true)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FF controlled by the service.

@@ -795,97 +792,50 @@ private async Task DownloadRepositoryActionAsync(IExecutionContext executionCont
string link = downloadInfo?.TarballUrl;
#endif

Trace.Info($"Save archive '{link}' into {archiveFile}.");
Copy link
Member Author

@TingluoHuang TingluoHuang Sep 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the diff is not readable.
I break the change into 3 commits, which might explain the diff better.

  • 3722412 move code around and should not change any behavior.
  • 356d799 change behind FF that use local actions cache instead of downloading from the service.
  • bdcbefc cleanup.

@TingluoHuang TingluoHuang force-pushed the users/tihuang/actioncache branch 3 times, most recently from e870708 to bdcbefc Compare September 28, 2023 15:30
@TingluoHuang TingluoHuang enabled auto-merge (squash) September 28, 2023 23:34
@TingluoHuang TingluoHuang merged commit f57ecd8 into main Sep 28, 2023
10 checks passed
@TingluoHuang TingluoHuang deleted the users/tihuang/actioncache branch September 28, 2023 23:43
@alexklibisz
Copy link

This is exactly what we've been looking for to mitigate some API rate limiting issues. Can anyone provide an estimate for when this would go out in a release?

@TingluoHuang
Copy link
Member Author

@alexklibisz do you have a log shows me where you are getting rate limiting? i want to double check to make sure this feature actually can fix your problem.
If you are not feel comfortable of posting log here, you can email it to me with tingluohuang @ github.com

@alexklibisz
Copy link

I don't have the exact log on hand, but it's something like "Github API rate limit exceeded while cloning action "actions/checkout@v3". This is on a GHE deployment. I haven't seen this happen on public Github. But it seems like we could avoid a lot of redundant requests by just caching the downloaded actions, as in this PR.

@retzero
Copy link

retzero commented Oct 9, 2023

@TingluoHuang
Thank you for this great feature.
How can I enable this variable UseActionArchiveCache to true? I'm running GHES and self-hosted runners.

                if (executionContext.Global.Variables.GetBoolean("DistributedTask.UseActionArchiveCache") == true)

@TingluoHuang
Copy link
Member Author

This feature will not help the Rate limit problem happen during Resolving Action (translate the v3 tag to a SHA for a given action), but it will help avoid the download of the action if the local cache exists.
@retzero The feature flag (the DistributedTask.UseActionArchiveCache variable) will be enabled in the next GHES release once we verify everything is working as expected with github.com

@alexklibisz
Copy link

This feature will not help the Rate limit problem happen during Resolving Action (translate the v3 tag to a SHA for a given action)

That's too bad. What if we explicitly use @SHA instead of @v3? Would that eliminate an API call?

@TingluoHuang
Copy link
Member Author

@alexklibisz that's not a bad idea, we can make change to let the runner skip the resolving API call if you are referencing actions via @SHA and the local cache has it already.
Let me talk with my team to see what's their thought on this. 🙇

@alexklibisz
Copy link

@alexklibisz that's not a bad idea, we can make change to let the runner skip the resolving API call if you are referencing actions via @SHA and the local cache has it already. Let me talk with my team to see what's their thought on this. 🙇

Thanks! These rate limiting issues have been pretty disruptive, so anything that can reduce API calls when the jobs are starting is helpful.

For context, we are using Actions Runner Controller with Horizontal Runner Autoscalers and Runner Deployments on top of AWS spot instances. We really like the scalability and the ephemeral nature of the runners. But as we've onboarded more projects, we're getting disrupted by these API rate limits basically daily. The API rate limits seem reasonable. It seems like there is some room for optimization in the runner initialization. I think these optimizations will be important as Github has officially adopted/recommended Actions Runner Controller and teams are starting to adopt it.

@alexklibisz
Copy link

@TingluoHuang Can you help me understand one thing more precisely:

but it will help avoid the download of the action if the local cache exists.

Does downloading the action count against the API rate limit? In other words, will the change in this PR help at all with rate limit issues?

@TingluoHuang
Copy link
Member Author

@alexklibisz bad news is my change won't help rate limiting at all, there are some action policy we need to check when the runner try to use certain actions, so we have to call the resolving actions endpoint. 😞

If you are hitting rate limiting on your GHES, you can reach out to GitHub support, and they should be able to help you bump the rate limit setting on your GHES instance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants