fscache: teach fscache to use NtQueryDirectoryFile #1937
I'm curious about the impact on smaller, more reasonably sized repositories; especially given the explicit … I'm also curious about the impact on very flat repositories, which are rather common in OSS projects: effectively 90% of the files are in the root, with a few specialized files in sub-dirs.
I tested this on the git repo itself and saw a 9.8% savings, but with small repos the difference is between 0.0305 seconds and 0.0275 seconds (5 runs averaged), so we're getting down to the noise level. In short, with small repos it just doesn't make much difference; they are already very fast.
Only a couple minor changes I'd like to see, most of them some clarifications in the commit message (in particular a link to more information about the low-level API used by the new code).
Thank you!
The code in question is unclear, and not everybody has the time to dig up the commit message for the commit that added it, so let's play nice and add an explanation as a code comment.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Using FindFirstFileExW() requires the OS to allocate a 64K buffer for each directory and then free it when we call FindClose(). Update fscache to call the underlying kernel API NtQueryDirectoryFile so that we can do the buffer management ourselves. That allows us to allocate a single buffer for the lifetime of the cache and reuse it for each directory.
This change improves performance of 'git status' by 18% in a repo with ~200K files and 30k folders.
Documentation for NtQueryDirectoryFile can be found at:
https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/content/ntifs/nf-ntifs-ntquerydirectoryfile
https://docs.microsoft.com/en-us/windows/desktop/FileIO/file-attribute-constants
https://docs.microsoft.com/en-us/windows/desktop/fileio/reparse-point-tags
To determine if the specified directory is a symbolic link, inspect the FileAttributes member to see if the FILE_ATTRIBUTE_REPARSE_POINT flag is set. If so, EaSize will contain the reparse tag (this is a so far undocumented feature, but confirmed by the NTFS developers). To determine if the reparse point is a symbolic link (and not some other form of reparse point), test whether the tag value equals the value IO_REPARSE_TAG_SYMLINK.
Signed-off-by: Ben Peart <benpeart@microsoft.com>
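The buffer-management idea in the commit message above can be illustrated with a small, portable sketch. This is not the fscache code: `struct dir_cache`, `dir_cache_init()`, `dir_cache_scan()`, `dir_cache_release()` and `fake_query_directory()` are hypothetical names, and `fake_query_directory()` merely stands in for a call such as NtQueryDirectoryFile that fills a caller-supplied buffer. The only point shown is that one 64K buffer is allocated for the lifetime of the cache and reused for every directory.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define DIR_BUF_SIZE (64 * 1024) /* same size the commit message mentions */

/* Hypothetical cache object: owns a single buffer for its whole lifetime. */
struct dir_cache {
	char *buf;
	size_t buf_size;
	unsigned long allocations; /* counts buffer allocations, for illustration */
};

static int dir_cache_init(struct dir_cache *c)
{
	c->buf = malloc(DIR_BUF_SIZE);
	if (!c->buf)
		return -1;
	c->buf_size = DIR_BUF_SIZE;
	c->allocations = 1;
	return 0;
}

static void dir_cache_release(struct dir_cache *c)
{
	free(c->buf);
	c->buf = NULL;
	c->buf_size = 0;
}

/*
 * Stand-in for the kernel call (an assumption, not a real API): writes a
 * fake two-entry listing into the caller's buffer and returns the number
 * of bytes written, or 0 if the listing does not fit.
 */
static size_t fake_query_directory(const char *dir, char *buf, size_t size)
{
	int n = snprintf(buf, size, "%s/a.c\n%s/b.c\n", dir, dir);
	return (n < 0 || (size_t)n >= size) ? 0 : (size_t)n;
}

/* Scan one directory, reusing the cache's buffer instead of allocating. */
static size_t dir_cache_scan(struct dir_cache *c, const char *dir)
{
	assert(c->buf != NULL);
	return fake_query_directory(dir, c->buf, c->buf_size);
}
```

A caller would run `dir_cache_init()` once, call `dir_cache_scan()` for each directory, and call `dir_cache_release()` when the cache is torn down, so the allocation count stays at one no matter how many directories are scanned.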
Let's merge this! Thank you so much!
The FSCache feature [was further optimized in particular for very large repositories](git-for-windows/git#1937). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
fscache: teach fscache to use NtQueryDirectoryFile
Using FindFirstFileExW() requires the OS to allocate a 64K buffer for each directory and then free it when we call FindClose(). Update fscache to call the underlying kernel API NtQueryDirectoryFile so that we can do the buffer management ourselves. That allows us to allocate a single buffer for the lifetime of the cache and reuse it for each directory.
This change improves performance of 'git status' by 18% in a repo with ~200K files and 30k folders.
Documentation for NtQueryDirectoryFile can be found at:
https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/content/ntifs/nf-ntifs-ntquerydirectoryfile
https://docs.microsoft.com/en-us/windows/desktop/FileIO/file-attribute-constants
https://docs.microsoft.com/en-us/windows/desktop/fileio/reparse-point-tags
To determine if the specified directory is a mounted folder, inspect the FileAttributes member to see if the FILE_ATTRIBUTE_REPARSE_POINT flag is set. If so, EaSize will contain the reparse tag. To determine if the reparse point is a mounted folder (and not some other form of reparse point), test whether the tag value equals the value IO_REPARSE_TAG_MOUNT_POINT.
Signed-off-by: Ben Peart <benpeart@microsoft.com>
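
The reparse-point test described in the commit messages can be sketched as follows. This is a portable illustration, not the fscache implementation: `struct entry_info`, `enum reparse_kind` and `classify_entry()` are hypothetical, and the struct mirrors only the two members the text relies on (FileAttributes and EaSize). The constant values are copied from the Microsoft documentation linked above so that the sketch compiles anywhere; on Windows they would come from the system headers instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as documented by Microsoft; defined here only for portability. */
#define FILE_ATTRIBUTE_DIRECTORY     0x00000010u
#define FILE_ATTRIBUTE_REPARSE_POINT 0x00000400u
#define IO_REPARSE_TAG_MOUNT_POINT   0xA0000003u
#define IO_REPARSE_TAG_SYMLINK       0xA000000Cu

/* Hypothetical stand-in for the directory entry returned by the query. */
struct entry_info {
	uint32_t FileAttributes;
	uint32_t EaSize; /* holds the reparse tag when the reparse flag is set */
};

enum reparse_kind {
	NOT_REPARSE,
	REPARSE_SYMLINK,
	REPARSE_MOUNT_POINT,
	REPARSE_OTHER
};

static enum reparse_kind classify_entry(const struct entry_info *e)
{
	assert(e != NULL);

	/* Without the flag, EaSize is not a reparse tag and must be ignored. */
	if (!(e->FileAttributes & FILE_ATTRIBUTE_REPARSE_POINT))
		return NOT_REPARSE;

	switch (e->EaSize) {
	case IO_REPARSE_TAG_SYMLINK:
		return REPARSE_SYMLINK;
	case IO_REPARSE_TAG_MOUNT_POINT:
		return REPARSE_MOUNT_POINT;
	default:
		return REPARSE_OTHER;
	}
}
```

Checking the attribute flag first matters: for an ordinary entry, EaSize could by coincidence equal one of the tag values, so the tag comparison is only meaningful once FILE_ATTRIBUTE_REPARSE_POINT is known to be set.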