Fix possible race condition in the caching of request graph #9675
TL;DR: To fix existing race conditions in the way the request graph is cached, the old cached graph is now deleted from disk at the start of the cache write to prevent stale references, and the request graph and snapshot are now written only after all node-blobs have been processed, rather than being queued alongside them.
As part of the incremental cache write system, we add all the nodes to be cached on to the queue, and then the request graph itself and the cache snapshot.
We're currently having some problems with "hanging builds" in our application, which means that a number of users are killing the Parcel process through more drastic means once their `ctrl C` has taken too long.
There are two possible scenarios that we think are causing issues in our application at the moment.
Problems
1. Missing nodes
The process is force quit before the nodes are finished being written, but after the request graph cache has been written.
Because the promise queue used in the cache system runs jobs concurrently, it's possible for the request graph write to finish before the last few node-blob writes.
This means that the request graph written to cache would contain references to nodes that don't actually exist in the cache. Since the node IDs are index-based, this causes the graph to look up the wrong nodes.
2. Old request graph cache
The process is force quit after the nodes are finished being written, but before the request graph cache has been written.
This would result in all the new node-blobs being written to disk, but the old request graph remaining in place. As in the first scenario, the old graph's index-based IDs now point to different nodes than it expects.
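To illustrate why a graph that is out of step with its node-blobs is harmful, here is a minimal sketch. It assumes a deliberately simplified model in which the cached graph stores only an index into the list of node-blobs. The names `CachedGraph` and `resolveNode` are hypothetical and are not Parcel's actual data structures.

```typescript
// Hypothetical, simplified model: the cached graph refers to nodes by index only.
interface CachedGraph {
  // Maps a request name to the index of its node-blob.
  nodeIndexByRequest: Record<string, number>;
}

function resolveNode(
  graph: CachedGraph,
  nodeBlobs: string[],
  request: string,
): string | undefined {
  const index = graph.nodeIndexByRequest[request];
  return index === undefined ? undefined : nodeBlobs[index];
}

// Build 1 wrote a graph and a matching set of node-blobs.
const oldGraph: CachedGraph = {nodeIndexByRequest: {entry: 0, asset: 1}};
const oldBlobs = ['entry-node', 'asset-node'];

// Build 2 rewrote the node-blobs (a new node now sits at index 1),
// but the process was killed before the new graph was written.
const newBlobs = ['entry-node', 'config-node', 'asset-node'];

// With matching blobs, the lookup finds the node the graph expects.
const consistentLookup = resolveNode(oldGraph, oldBlobs, 'asset');

// With the stale graph and the new blobs, the same index silently
// resolves to a different node.
const staleLookup = resolveNode(oldGraph, newBlobs, 'asset');
```

The lookup does not fail in the stale case. It returns a valid but wrong node, which is why this kind of corruption is hard to notice.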
Solutions
This PR contains fixes for both of these issues.
To fix problem number 1, the writes of the request graph and the snapshot file have been removed from the queue system, and now wait until after the queue has been flushed. This means that the request graph cache file won't be written until all of the node-blobs are in place.
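The ordering change can be sketched roughly as follows. This is not the code from this PR: `fakeWrite`, `writeQueuedTogether` and `writeGraphAfterFlush` are hypothetical names, and disk writes are simulated by promises that settle after a number of microtask hops and record their completion order in a log.

```typescript
// Hypothetical stand-in for a disk write: completes after `ticks`
// microtask hops and records the key in `log` when it finishes.
async function fakeWrite(log: string[], key: string, ticks: number): Promise<void> {
  for (let i = 0; i < ticks; i++) {
    await Promise.resolve();
  }
  log.push(key);
}

// Old behaviour (simplified): the request graph is queued alongside the
// node-blobs, so a fast graph write can complete before slower node writes.
async function writeQueuedTogether(log: string[], nodeKeys: string[]): Promise<void> {
  const jobs = nodeKeys.map((key, i) => fakeWrite(log, key, 3 + i));
  jobs.push(fakeWrite(log, 'requestGraph', 1));
  await Promise.all(jobs);
}

// New behaviour (simplified): wait for every node-blob write to settle,
// and only then write the request graph, followed by the snapshot.
async function writeGraphAfterFlush(log: string[], nodeKeys: string[]): Promise<void> {
  await Promise.all(nodeKeys.map((key, i) => fakeWrite(log, key, 3 + i)));
  await fakeWrite(log, 'requestGraph', 1);
  await fakeWrite(log, 'snapshot', 1);
}
```

In the first function the graph can land in the log before any node. In the second it can only appear after all of them, so a process killed partway through never leaves a new graph pointing at nodes that were not written.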
To fix problem number 2, the cached request graph is deleted once we start writing a new cache. To achieve this, a new `deleteLargeBlob` method has been added to the parentCache type, and implemented for each different cache variant. This means that if a user force-quits the process before the new request graph cache has been written, there will be no cached graph at all, eliminating the stale references problem.
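A minimal sketch of the delete-first approach, assuming a synchronous in-memory cache for brevity. `InMemoryCache`, `GRAPH_KEY` and `writeRequestGraphCache` are hypothetical names used only for illustration, the method signatures are assumptions, and a force-quit is simulated by returning early.

```typescript
// Hypothetical in-memory stand-in for a cache variant. The real cache
// methods are presumably asynchronous and backed by disk.
class InMemoryCache {
  private blobs = new Map<string, string>();

  setLargeBlob(key: string, contents: string): void {
    this.blobs.set(key, contents);
  }

  hasLargeBlob(key: string): boolean {
    return this.blobs.has(key);
  }

  deleteLargeBlob(key: string): void {
    this.blobs.delete(key);
  }
}

const GRAPH_KEY = 'requestGraph'; // assumed key name, for illustration only

function writeRequestGraphCache(
  cache: InMemoryCache,
  nodeBlobs: string[],
  graph: string,
  killedBeforeGraphWrite: boolean,
): void {
  // 1. Remove the old graph first so it can never outlive its node-blobs.
  cache.deleteLargeBlob(GRAPH_KEY);

  // 2. Write the new node-blobs.
  nodeBlobs.forEach((blob, i) => cache.setLargeBlob(`node-${i}`, blob));

  // Simulated force-quit between the node writes and the graph write.
  if (killedBeforeGraphWrite) return;

  // 3. Write the graph only once all node-blobs are in place.
  cache.setLargeBlob(GRAPH_KEY, graph);
}
```

If the write is interrupted, `hasLargeBlob(GRAPH_KEY)` returns `false`, so the next build starts without a cached graph instead of loading one that points at the wrong nodes. The trade-off is a cold rebuild after an interrupted write.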