-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce single file format overhead by lazily importing modules and loading .gitignore / --exclude #3211
Conversation
For what it is worth, I'd prefer a rebase merge for this PR given the commits are mostly independent (other than the changelog entry). |
Coolio, diff-shades is failing and I don't know why at all ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sadang bib
`black.reformat_many` depends on a lot of slow-to-import modules. When formatting simply a single file, the time paid to import those modules is totally wasted. So I moved `black.reformat_many` and its helpers to `black.concurrency` which is now *only* imported if there's more than one file to reformat. This way, running Black over a single file is snappier Here are the numbers before and after this patch running `python -m black --version`: - interpreted: 411 ms +- 9 ms -> 342 ms +- 7 ms: 1.20x faster - compiled: 365 ms +- 15 ms -> 304 ms +- 7 ms: 1.20x faster Co-authored-by: Fabio Zadrozny <fabiofz@gmail.com>
Loading .gitignore and compiling the exclude regex can take more than 15ms. We shouldn't and don't need to pay this cost if we're simply formatting files given on the command line directly. I would've loved to lazily import pathspec, but the patch won't be clean until the file collection and discovery logic is refactored first. Co-authored-by: Fabio Zadrozny <fabiofz@gmail.com>
os.cpu_count() can return None (sounds like a super arcane edge case though) so the type annotation for the `workers` parameter of `black.main` is wrong. This *could* technically cause a runtime TypeError since it'd trip one of mypyc's runtime type checks so we might as well fix it. Reading the documentation (and cross-checking with the source code), you are actually allowed to pass None as `max_workers` to `concurrent.futures.ProcessPoolExecutor`. If it is None, the pool initializer will simply call os.cpu_count() [^1] (defaulting to 1 if it returns None [^2]). It'll even round down the worker count to a level that's safe for Windows. ... so theoretically we don't even need to call os.cpu_count() ourselves, but our Windows limit is 60 (unlike the stdlib's 61) and I'd prefer not accidentally reintroducing a crash on machines with many, many CPU cores. [^1]: https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ProcessPoolExecutor [^2]: https://github.com/python/cpython/blob/a372a7d65320396d44e8beb976e3a6c382963d4e/Lib/concurrent/futures/process.py#L600
aafd5b9
to
2c0d427
Compare
PTAL. Please note I'd prefer a rebase merge here if you approve and chose to merge right after. |
workers: Optional[int], | ||
) -> None: | ||
"""Reformat multiple files using a ProcessPoolExecutor.""" | ||
maybe_install_uvloop() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose to keep the helper function instead of always attempting an import of uvloop in the module scope just as insurance if we ever accidentally break the invariant black.concurrency
should only be imported via black.main
if there's more than one file to format.
This works on Windows with PyInstaller (just tested locally) so I'm merging. |
Description
Fixes #2564 and closes #2656.
Checklist - did you ...