
"Cache already exists" due to concurrency #537

Closed
Vampire opened this issue Aug 2, 2020 · 8 comments · Fixed by #558
Closed

"Cache already exists" due to concurrency #537

Vampire opened this issue Aug 2, 2020 · 8 comments · Fixed by #558
Labels
bug Something isn't working

Comments

Vampire commented Aug 2, 2020

If you have a look at this workflow run, you can see the problem: https://github.com/Vampire/setup-wsl/runs/938421512?check_suite_focus=true
Here is the timeline of the problem:

22:30:04 Job `Test 'Debian' distribution` / Step `Execute action`: ##[debug]Resolved Keys:
22:30:04 Job `Test 'Debian' distribution` / Step `Execute action`: Cache not found => download and later add to cache
22:30:08 Job `Test additional packages for 'Debian' distribution` / Step `Execute action`: ##[debug]Resolved Keys:
22:30:08 Job `Test additional packages for 'Debian' distribution` / Step `Execute action`: Cache not found => download and later add to cache
22:30:09 Job `Test 'Debian' distribution` / Step `Execute action`: ##[debug]Reserving Cache
22:30:11 Job `Test additional packages for 'Debian' distribution` / Step `Execute action`: ##[debug]Reserving Cache
22:30:11 Job `Test additional packages for 'Debian' distribution` / Step `Execute action`: ##[error]reserveCache failed
22:30:14 Job `Test 'Debian' distribution` / Step `Execute action`: Cache saved successfully

The problem is clear: it's the typical problem you get with lazy initialization when double-checked locking is not done properly.
Both jobs see that the cache is not present, prepare the thing to be cached, and then try to cache it, and the second one fails at saving.
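
To illustrate, here is a minimal sketch of the racy pattern both jobs run (paths, key, and downloadDistribution are hypothetical placeholders, not taken from setup-wsl):

import * as cache from '@actions/cache';

// Hypothetical stand-ins for what the action actually caches and downloads.
const paths = ['~/.wsl-distributions'];
const key = 'distribution-debian';
async function downloadDistribution(): Promise<void> { /* fetch the distro */ }

const hit = await cache.restoreCache(paths, key);
if (hit === undefined) {
  // Both jobs take this branch before either has saved ...
  await downloadDistribution();
  // ... so the slower job's saveCache fails with "Cache already exists".
  await cache.saveCache(paths, key);
}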

Is there a clean way to solve this, or could something be added, for example a method cache.exists() that only checks whether the cache exists? Doing a full restore before the save seems overkill just to check whether another job has filled the cache in the meantime. And just catching the exception feels like "exceptions for flow control" when a simple test method could check up front.

Vampire added the bug label on Aug 2, 2020
JojOatXGME commented Aug 11, 2020

I created my own action, and I seem to get a warning caused by this error quite reliably on every build (if the cache was not already present before the build started). It is annoying.

##[warning]Error: reserveCache failed: Cache already exists. Scope: refs/heads/***, Key: ***, Version: ***

If we look at the actions/cache action, it already implements a special case to prevent such warnings. I did the same in my action.

// https://github.com/actions/cache/blob/d29c1df198dd38ac88e0ae23a2881b99c2d20e68/src/save.ts#L43
} catch (error) {
    if (error.name === cache.ValidationError.name) {
        throw error;
    } else if (error.name === cache.ReserveCacheError.name) {
        core.info(error.message);
    } else {
        utils.logWarning(error.message);
    }
}
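
For reference, a self-contained sketch of the same pattern (trySave, paths, and key are placeholder names; only the error handling mirrors the excerpt above):

import * as core from '@actions/core';
import * as cache from '@actions/cache';

async function trySave(paths: string[], key: string): Promise<void> {
  try {
    await cache.saveCache(paths, key);
  } catch (error) {
    if (error.name === cache.ValidationError.name) {
      throw error;                 // invalid input: fail the step
    } else if (error.name === cache.ReserveCacheError.name) {
      core.info(error.message);    // another job won the race: just log it
    } else {
      core.warning(error.message); // anything else: warn, but don't fail
    }
  }
}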

This ReserveCacheError is supposed to be thrown here:

// https://github.com/actions/toolkit/blob/9ad01e4fd30025e8858650d38e95cfe9193a3222/packages/cache/src/cache.ts#L146
const cacheId = await cacheHttpClient.reserveCache(key, paths, {
  compressionMethod
})
if (cacheId === -1) {
  throw new ReserveCacheError(
    `Unable to reserve cache with key ${key}, another job may be creating this cache.`
  )
}

Unfortunately, another generic error seems to be thrown before we reach that point. I think it is initially thrown by the @actions/http-client package and converted in requestUtils.retry(). However, I have only looked at it for two or three hours on GitHub and haven't debugged it. Maybe it works if the cache is not yet created but only reserved. In any case, I think this library should not handle these cases so differently and should allow us to omit the warning.

PS: Something like cache.exists() would still have a race condition, because another job could create the cache after you call this method but before you upload your own data.

Vampire (Author) commented Aug 11, 2020

Then maybe a cache.reserve() method that reserves the cache if it is not reserved or present yet, and returns a boolean instead of throwing an exception.
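
A sketch of how such a hypothetical API could be used (cache.reserve() does not exist in @actions/cache; this is purely the proposal):

// Hypothetical: reserve() returns true only if this job obtained the
// reservation; false means another job already reserved or saved it.
if (await cache.reserve(key, paths)) {
  await cache.saveCache(paths, key);
} else {
  core.info(`Cache ${key} is already reserved or present, skipping save`);
}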

Btw. I copied the exception handling from actions/cache too in the meantime :-D

JojOatXGME commented Aug 11, 2020

Yes, returning a boolean instead of throwing an exception might be a better API. Anyway, I would already be happy if @actions/cache reliably threw ReserveCacheError instead of (at least sometimes) throwing a generic Error.

dhadka (Member) commented Aug 16, 2020

Thanks for reporting this. There's definitely an issue with handling non-successful status codes, and it looks like a series of changes led up to it. Caching used to call the get method in http-client, which returned an HTTP response object with the status code. That was changed to getJson, which instead throws an error for 3xx, 4xx, and 5xx response codes, but the caching module is still just checking the status code in the response.

If the cache already exists, the server returns a 409 Conflict, so http-client throws an error, whereas the expected behavior is for reserveCache to return -1. I'll take a look at this next week.
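
In other words, the 409 needs to be translated back into the -1 sentinel. A sketch of that mapping, assuming http-client exposes the status code on the thrown error as actions/http-client#32 proposes (reserveCache and compressionMethod are the toolkit internals quoted earlier):

import { HttpClientError } from '@actions/http-client';

let cacheId: number;
try {
  cacheId = await cacheHttpClient.reserveCache(key, paths, {compressionMethod});
} catch (error) {
  if (error instanceof HttpClientError && error.statusCode === 409) {
    cacheId = -1; // cache already exists or is reserved by another job
  } else {
    throw error;
  }
}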

dhadka (Member) commented Aug 18, 2020

Before fixing the cache module, we will need to fix http-client so it provides the status code in a type-safe way (since actions/toolkit uses strict type checking): actions/http-client#32

weibeld commented Oct 9, 2020

Is there any timeline for when this will be fixed?

Currently, I'm using the following workaround:

import * as core from '@actions/core';
import * as cache from '@actions/cache';

try {
  await cache.saveCache(paths, key);
} catch (err) {
  // The thrown error is a generic Error, so match on its message.
  if (err.message.includes("Cache already exists")) {
    core.info(`Cache entry ${key} has already been created by another workflow`);
  } else {
    throw err;
  }
}

This seems to work to catch and neutralise the error. The error is detected through its message because its type seems to be just Error rather than the more useful ReserveCacheError intended for this situation, as described in @JojOatXGME's comment. However, since it's a race condition, it's hard to test and debug properly.

weibeld added a commit to weibeld/install-ajv-cli that referenced this issue Oct 10, 2020
In the previous code there was a potential race condition when two or
more workflows checked the cache at the same time:

1. Workflow 1: cache miss; Workflow 2: cache miss
2. Workflow 1: install; Workflow 2: install
3. Workflow 1: save cache; Workflow 2: save cache <<< RACE CONDITION

This resulted in the step failing with the following error:

Error: reserveCache failed: Cache already exists. Scope: refs/heads/master,
Key: <key>, Version: <...>

It's a bug in the @actions/cache package and it's discussed in the
following issue: actions/toolkit#537

The correct behaviour would be for Workflow 2 to simply drop its attempt
to save the cache entry when this situation occurs.

For now, this code uses a workaround that fixes the issue by detecting
and catching the above error based on its error message.

dhadka (Member) commented Oct 12, 2020

@weibeld Thanks for the ping. The changes to http-client have been merged and the new version (1.0.9) published to NPM. I'll try to get the rest of the fix described above implemented this week.

dhadka (Member) commented Oct 12, 2020

PR for the cache module: #558

Once this is merged and published, we'll also need to bump the version used by the actions/cache action.
