-
Notifications
You must be signed in to change notification settings - Fork 121
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The new SeparateBodyFileCache design makes updates/reads non-atomic #324
Comments
This is my fault, by the way... 😢 We discussed various solutions on the pip side here: pypa/pip#12361 |
Thanks for the report!
This makes sense to me -- if I'm understanding correctly, this would mean replacing |
Yeah, plus have a single write API function. For cross-file atomicity, writing the body file first might do the trick, if reading code reads the metadata file first. This would allow the file format to stay the same, at least. (Would also be nice if write API didn't take the full bytes, since that means you have to keep full response in memory, but that maybe a bit too much work. Or could use the mmap hack again to work around that and accepts buffer API-implementing objects.) |
Note that a combination of fixing this and #235 in the ways suggested in previous comment, plus
|
The problem
Scenario 1
Program crashes after writing metadata but before writing body. Now the cache thinks the body is empty.
Scenario 2
Multiple processes using the same cache directory. Process A writes metadata, but not yet body. Process B gets metadata, then gets body, and now it thinks the body is empty.
Workarounds
For
pip
I added logic that basically pretends a cache entry is missing if it doesn't have both body and metadata files.Solutions
Probably need both a new API that involves writing and reading both at once, and an implementation that makes sure that's atomic. See also the somewhat related #325.
The text was updated successfully, but these errors were encountered: