Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cache endpoint to allowed list automatically #173

Closed
varunsh-coder opened this issue Sep 1, 2022 · 9 comments · Fixed by #186
Closed

Add cache endpoint to allowed list automatically #173

varunsh-coder opened this issue Sep 1, 2022 · 9 comments · Fixed by #186

Comments

@varunsh-coder
Copy link
Member

This should help address issue #83. @h0x0er please take this up.

We will need to figure out how to inform agent/ insights API that this is the cache endpoint, so we do not show it in the insights page.
Once this is done, users will not need to be concerned about the cache endpoint. We will not ask users to add it to allowed list explicitly.

@h0x0er
Copy link
Member

h0x0er commented Sep 5, 2022

@varunsh-coder In source-code of toolkit/cache, they are using a suffix to figure out where to fetch cache. Similarly we can use it to detect cache endpoint.
Checkout

@varunsh-coder
Copy link
Member Author

We need to do the same steps as done here: https://github.com/actions/toolkit/blob/500d0b42fee2552ae9eeb5933091fe2fbf14e72d/packages/cache/src/internal/cacheHttpClient.ts#L114

We need to write a method in nodejs and get this info before initialize the agent.

Somewhere here we need to get the cache artifact location, and pass that as a config to the agent.

const confg = {

@h0x0er
Copy link
Member

h0x0er commented Sep 10, 2022

@varunsh-coder I experimented with a workflow, to check if I am able to fetch archiveLocation, if we know cacheID and cachePaths, the result is positive, i am able to fetch the archiveLocation.

In the experiment, I performed following steps;

steps in harden-runner

  • added a static cacheID & cachePath in the postStep of harden-runner, here
  • trying to restore cache, here
  • trying to fetch archiveLocation, here

steps in workflow

  • ran the same workflow twice,
    • firstRun to store cache;
    • secondRun to get archiveLocation here

@varunsh-coder
Copy link
Member Author

Great work @h0x0er!

Now that the proof of concept is done, next step is to integrate the changes into harden runner.

I noticed that it takes long time to get the cache entry (total time 36 seconds). Why is that? We need to keep the pre-harden-runner step under 5 seconds.

Here are the next steps after the performance issue is resolved:

  1. The file that is added to cache should ideally be in the harden runner folder. We can add a file to the dist folder, similar to agent.service file.
  2. Get the endpoint for cache location. Currently you get the full cache endpoint and path. You need to get only the endpoint.
  3. Add the endpoint to the allowed list in block mode
  4. Log the cache endpoint as a normal log, so we can correlate it in the API and not show it. Else one might add it manually to the list.
  5. If cache endpoint cannot be fetched for whatever reason, and mode is block, change the mode from block to audit, and log the issue as a warning annotation.
  6. Add test cases for part where endpoint is fetched
  7. Test for cases where different calls may fail. This method to fetch the archive location must not cause the step to fail.

@varunsh-coder
Copy link
Member Author

@h0x0er I merged the PR into cache-solve branch and ran integration test workflows, but it is unable to fetch the cache URL.

I suspect the saving of the cache is not at the right place. Should this not be outside the if (!fs.existsSync(doneFile)) { condition? That condition will not run for normal cases.

const cmd = "sudo";

You can see the runs here:
https://github.com/harden-runner-canary/setup-php/actions/runs/3050458696/jobs/4917573546#step:20:39

I don't see any log for saving of cache.

@h0x0er h0x0er mentioned this issue Sep 14, 2022
@h0x0er
Copy link
Member

h0x0er commented Sep 14, 2022

@varunsh-coder Thanks for pointing the issue, it was a mistake from my side while i was resolving the merge conflicts.
I had created a new PR #178, that fixes the bug..

@varunsh-coder
Copy link
Member Author

Thanks @h0x0er! Can you also put those statements in a try catch block? Don't want the saveCache method to cause step to fail.

@h0x0er
Copy link
Member

h0x0er commented Sep 14, 2022

@varunsh-coder Done the above change

@varunsh-coder
Copy link
Member Author

varunsh-coder commented Sep 24, 2022

Hi @h0x0er, I reviewed the changes and we need to make few more changes:

  1. The cache action looks for a isValidEvent before saving or restoring cache. We also need to check for same and only do the cache operations if the event is valid. If not valid, then cache endpoint will anyways not be hit, so it is not needed.
    https://github.com/actions/cache/blob/12681847c623a9274356751fdf0a63576ff3f846/src/save.ts#L18
    Also, try to understand what event is not valid, so we can create an integration test for that scenario.

  2. Not sure if sudo is needed here. We have a file write operation in the same file and that does not use sudo. Please see if this works without sudo.

    const cmd = "sudo";

  3. Correlation Id log needs to be moved above the cache log statement

    console.log(`Step Security Job Correlation ID: ${correlation_id}`);

  4. If cache endpoint cannot be found, the log should be of normal type, not warning. I think this might confuse users if it is of warning type.

    core.warning(

Please create PR into cache-solve branch. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants