-
-
Notifications
You must be signed in to change notification settings - Fork 640
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Leverage a multiprocess-safe strategy for MyPy's cache #16290
Conversation
[ci skip-rust] [ci skip-build-wheels]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you determine the copying around was necessary. As I read it there is a write fnctl lock that naively should make that un-necessary: https://www.sqlite.org/faq.html#q5
I guess maybe you hit lock timeouts? It's worth explaining though as is the current strategy since it naively throws away some percent of cached data from racing reads and writes and cp and mv. Safe, but lossy. Perhaps the lossy is small enough in practice this is still a big win? This is all the bread and butter of this change and deserves explanation, it's all very non-obvious - to me anyhow. |
It looks like the default busy timeout setup by CPython is 5 seconds on main: https://github.com/python/cpython/blob/6446592c89b0c581c00e170ae6278291e940755c/Modules/_sqlite/clinic/connection.c.h#L23 So, if I read all the code and SQLite docs correctly, the default busy timeout is 5s. |
Mypy uses multiple rows for each source file and retrieves them in two different queries. So atomic writes/writes don't solve the race condition between the calls. It basically replicates the FS in the DB (one row per FS file). And yeah the performance is still greatly improved. In general, I would imagine most users aren't using multiple processes for a single run of mypy on a single python version. In which case we only pay for the cp and mv, we aren't losing anything else. E.g. P1 asks for the "metadata" row and determines it needs to read the "data" row and gets paused, P2 writes to "metadata" row and "data" row, P1 then reads the new "data" row. I'll update the comment. |
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Co-authored-by: Stu Hood <stuhood@gmail.com>
Co-authored-by: Stu Hood <stuhood@gmail.com>
) pantsbuild#16276 added support for using MyPy's cache combined with `append_only_caches`. After investigation, however, it was found to be grossly unsafe for multiple processes. This PR fixes that by shifting the multiprocess safety into Pants' support. To do that we: - Leverage MyPy's SQLite file cache. This means we only need to make a single file "safe". - To make the file access "safe", we `cp` it to a temporary location and use that for the run. Then we `mv` the updated file back. This is "safe" because `mv` is an atomic `rename` (on the same filesystem) which doesn't invalidate existing file descriptors (to the old file). Each concurrent `cp` command should continue to operate on the "old" file, while and future ones will be given the new file. I also updated docs. [ci skip-rust] [ci skip-build-wheels]
#16276 added support for using MyPy's cache combined with
append_only_caches
. After investigation, however, it was found to be grossly unsafe for multiple processes.This PR fixes that by shifting the multiprocess safety into Pants' support. To do that we:
cp
it to a temporary location and use that for the run. Then wemv
the updated file back. This is "safe" becausemv
is an atomicrename
(on the same filesystem) which doesn't invalidate existing file descriptors (to the old file). Each concurrentcp
command should continue to operate on the "old" file, while and future ones will be given the new file.I also updated docs.
[ci skip-rust]
[ci skip-build-wheels]