Very poor write performance [Windows FlushFileBuffers] #516
Comments
@djherbis Can you explain why you think this is slow? Is this running on an HDD or an SSD? If it is an HDD, this is exactly what you should expect, since throughput is limited by disk seek time. |
My temp directory is on my SSD, so it looks like that's what I've been testing against. |
On my SSD and on an SSD on GCE, boltdb (current git master) works as expected.
I get results similar to yours when running on an HDD. I would suggest checking your disk first. |
That's where my performance lands on the SSD on my Mac as well. I double-checked: my C: drive is my SSD, and that's where the benchmark is being run. |
@djherbis Linux/OS X |
I'm running on Windows 10, I wonder if this might be OS specific... I might have to pprof this when I get a chance. Can anyone with Windows + SSD check their results? |
Just to verify that my drive wasn't the issue, I spun up a RAM disk and ran bolt bench on it.
|
@djherbis https://github.com/ongardie/diskbenchmark Maybe you can try this to benchmark the disk I/O. I think boltdb does two fdatasyncs per write, so this should give you a rough ideal number for the case where everything works well. |
The script there is a bash file, and I'm running Windows. As I mentioned above, though, I've switched to using a RAM disk and it's still slower than my Mac's SSD. Here are SVGs of the CPU profiles of the Mac vs. Windows benchmarks: Windows seems to spend a lot of time in cgocall and osyield vs. usleep and syscall on my Mac. fdatasync is definitely the bottleneck by the looks of those graphs. |
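For anyone who can't run the bash script, here is a minimal Go sketch of the same idea: time a loop of one-page writes, each followed by a sync. The function name measureSync, the 4096-byte page size, and the iteration count are assumptions chosen for illustration and are not taken from bolt. On Windows, (*os.File).Sync ends up in FlushFileBuffers, so this should roughly isolate the cost being discussed here.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// measureSync (hypothetical helper) writes n pages of pageSize bytes to a
// temporary file in dir, calling Sync after every write, and returns the
// average time per write+sync pair. An empty dir uses the default temp dir.
func measureSync(dir string, n, pageSize int) (time.Duration, error) {
	if n <= 0 || pageSize <= 0 {
		return 0, fmt.Errorf("n and pageSize must be positive")
	}
	f, err := os.CreateTemp(dir, "syncbench-*")
	if err != nil {
		return 0, err
	}
	// Deferred calls run in reverse order: close first, then remove.
	defer os.Remove(f.Name())
	defer f.Close()

	page := make([]byte, pageSize)
	start := time.Now()
	for i := 0; i < n; i++ {
		if _, err := f.WriteAt(page, int64(i)*int64(pageSize)); err != nil {
			return 0, err
		}
		// Force the write to stable storage before continuing.
		if err := f.Sync(); err != nil {
			return 0, err
		}
	}
	return time.Since(start) / time.Duration(n), nil
}

func main() {
	// 200 iterations of a 4 KiB page are assumed values, not bolt's settings.
	avg, err := measureSync("", 200, 4096)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("average write+sync: %v\n", avg)
}
```

If bolt really does issue two syncs per write, roughly doubling the reported average would give a ballpark lower bound per bolt write on the same disk.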
I did a little research, fdatasync on Windows uses FlushFileBuffers:
I built Go from source and added those flags to the Windows syscall.CreateFile:
// Extra flags for windows
const FILE_FLAG_NO_BUFFERING = 0x20000000
const FILE_FLAG_WRITE_THROUGH = 0x80000000
h, e := CreateFile(pathp, access, sharemode, sa, createmode, FILE_ATTRIBUTE_NORMAL|FILE_FLAG_NO_BUFFERING|FILE_FLAG_WRITE_THROUGH, 0)
and commented out the file.Sync() call in bolt_windows.go:
func fdatasync(db *DB) error {
	//return db.file.Sync()
	return nil
}
Here's my new output (on my Windows SSD):
Perhaps boltdb should create files with these flags on Windows. |
@djherbis This result seems to conflict with your experiment on the RAM disk. If I/O is the issue, why is moving to a RAM disk even slower than a direct write? |
@xiang90 Because it's not the disk speed that's the problem; the limiting factor here is that FlushFileBuffers is inefficient when called after every write. I believe that's because, when you're going to flush everything to disk after every write, it's a waste of time to write it to the Windows disk cache first, though this is speculation. |
Then why does moving from disk to RAM disk reduce the time from 5ms to 400us? |
I didn't mean to say that disk speed doesn't matter; it does, as the benchmarks make evident. It's just that, more generally, all of the disks were suffering from poor FlushFileBuffers performance. Also, there may still be more work to do. It appears that FILE_FLAG_NO_BUFFERING comes with the caveat that you have to use aligned writes. https://msdn.microsoft.com/en-us/library/windows/desktop/cc644950(v=vs.85).aspx I'm not certain how this works with memory-mapped files. |
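To make the alignment caveat concrete: with unbuffered I/O, file offsets and transfer sizes have to be multiples of the volume's sector size. Below is a small Go sketch of the arithmetic involved. The helper names alignUp and isAligned are hypothetical, and the 512-byte sector size is an assumption for illustration; real code would have to query the actual sector size of the volume.

```go
package main

import "fmt"

// sectorSize is an assumed value for illustration only; the real sector
// size depends on the volume and must be queried at runtime.
const sectorSize = 512

// alignUp (hypothetical helper) rounds n up to the next multiple of align.
// Both arguments are assumed to be positive.
func alignUp(n, align int64) int64 {
	return (n + align - 1) / align * align
}

// isAligned (hypothetical helper) reports whether a write at the given
// offset and length satisfies the sector-multiple rule.
func isAligned(offset, length, align int64) bool {
	return offset%align == 0 && length%align == 0
}

func main() {
	fmt.Println(alignUp(1000, sectorSize))         // 1024
	fmt.Println(isAligned(4096, 4096, sectorSize)) // true
	fmt.Println(isAligned(100, 4096, sectorSize))  // false
}
```

Page-sized writes at page-sized offsets would pass this check for typical sector sizes, but the buffer's memory address has alignment requirements too, which this sketch does not cover.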
Looks like FILE_FLAG_NO_BUFFERING doesn't even work with memory-mapped files. So I guess I'm just seeing a performance boost because I'm not syncing to the physical media. There is another function, FlushViewOfFile, which claims to flush memory-mapped files, but it doesn't block until the data is written, and its documentation says to call FlushFileBuffers to ensure that it gets written... Maybe I'll experiment and see if calling it first speeds up FlushFileBuffers... |
Quick update: Currently go test takes over 12 minutes to run on my machine, which means I have to specify a custom timeout, since go test normally gives up after 10 minutes. The longest tests are:
|
@djherbis We hit the same limit on the CI, drone.io. Some of the tests are long-running randomized tests. You can run a subset of the tests by using the |
@benbjohnson Thanks, yeah, I've used -short before, which is great. The main issue is that I'm still having sizable performance issues on Windows, but I blame Windows for this since I feel it's a limitation of FlushFileBuffers. Still hoping to find a more efficient way of syncing to the filesystem. |
@djherbis Did you figure out the actual fix for the slow performance on Windows? |
@siginfo Unfortunately no :( I haven't looked much further into this since I last posted. I'm still curious how other databases handle this issue on Windows, but I haven't had time to dig into it since I only use it for personal projects. |
When my test suite runs on Travis, it's very fast. However, when I run it locally, it's incredibly slow.
Here's the output of the bolt bench cmd: