This repository has been archived by the owner on Mar 9, 2019. It is now read-only.

Very poor write performance [Windows FlushFileBuffers] #516

Open
djherbis opened this issue Feb 20, 2016 · 20 comments

Comments

@djherbis

When my test suite runs on Travis, it's very fast. However, when I run it locally, it's incredibly slow.

Here's the output of the bolt bench cmd:

bolt bench --count 10000 --batch-size 1 --write-mode rnd
# Write 41.9365914s     (4.193659ms/op) (238 op/sec)
# Read  1.0002192s      (30ns/op)       (33333333 op/sec)

bolt bench
# Write 5.9925ms        (5.992µs/op)    (166889 op/sec)
# Read  1.0003363s      (30ns/op)       (33333333 op/sec)

go version
go version go1.6 windows/amd64

ver
Microsoft Windows [Version 10.0.10586]
@xiang90
Contributor

xiang90 commented Feb 20, 2016

@djherbis Can you explain why you think this is slow? Is this running on an HDD or SSD? If it's an HDD, this is exactly what you should expect, since it's limited by disk seek time.

@djherbis
Author

My temp directory is on my SSD, so it looks like that's what I've been testing against.

@xiang90
Contributor

xiang90 commented Feb 20, 2016

@djherbis

On my SSD and SSD on GCE, boltdb(current git master) works as expected

bolt bench --count 10000 --batch-size 1 --write-mode rnd
# Write 2.93937386s (293.937µs/op) (3402 op/sec)
# Read  1.000256226s    (34ns/op)   (29411764 op/sec)

I get similar results to yours when running on an HDD. I would suggest you check your disk first.

@djherbis
Author

That's where my performance lands on the SSD on my Mac as well.

I double-checked, my C: is my SSD and that's where the benchmark is being run.
What OS are you running on?

@xiang90
Contributor

xiang90 commented Feb 20, 2016

@djherbis Linux/OS X

@djherbis
Author

I'm running on Windows 10; I wonder if this might be OS-specific...

I might have to pprof this when I get a chance. Can anyone with Windows + SSD check their results?

@djherbis
Author

Just to verify that my drive wasn't the issue, I spun up a RAM disk and ran bolt bench on it.
Writes are about 2x slower on my RAM disk than on your SSD. I'm going to guess this is an issue with memory-mapped files on Windows.

bolt bench --count 10000 --batch-size 1 --write-mode rnd --path ./test.db
# Write 4.9689542s      (496.895µs/op)  (2012 op/sec)
# Read  1.0027516s      (28ns/op)       (35714285 op/sec)

@xiang90
Contributor

xiang90 commented Feb 20, 2016

@djherbis Maybe you can try https://github.com/ongardie/diskbenchmark to benchmark the disk I/O.

boltdb does two fdatasyncs per write, I think, so you can get a rough ideal number if everything is working well.

@djherbis
Author

The script there is a bash file, and I'm running Windows. As I mentioned above, though, I've switched to using a RAM disk and it's still slower than my Mac's SSD.

Here are SVGs of the CPU profiles of the Mac vs. Windows benchmarks:
bench_graphs.zip

Windows seems to spend a lot of time in cgocall and osyield vs. usleep and syscall on my Mac.

fdatasync is definitely the bottleneck by the looks of those graphs.

@djherbis
Author

I did a little research; fdatasync on Windows uses FlushFileBuffers:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa364439(v=vs.85).aspx

Due to disk caching interactions within the system, the FlushFileBuffers function can be inefficient when used after every write to a disk drive device when many writes are being performed separately. If an application is performing multiple writes to disk and also needs to ensure critical data is written to persistent media, the application should use unbuffered I/O instead of frequently calling FlushFileBuffers. To open a file for unbuffered I/O, call the CreateFile function with the FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH flags. This prevents the file contents from being cached and flushes the metadata to disk with each write. For more information, see CreateFile.

I built Go from source, and added those flags to the Windows syscall.CreateFile,

// Extra flags for windows
const FILE_FLAG_NO_BUFFERING  = 0x20000000
const FILE_FLAG_WRITE_THROUGH = 0x80000000
h, e := CreateFile(pathp, access, sharemode, sa, createmode, FILE_ATTRIBUTE_NORMAL|FILE_FLAG_NO_BUFFERING|FILE_FLAG_WRITE_THROUGH, 0)

and commented out the file.Sync() call in bolt_windows.go

func fdatasync(db *DB) error {
    //return db.file.Sync()
    return nil
}

Here's my new output (on my Windows SSD):

bolt2 bench --count 10000 --batch-size 1 --write-mode rnd --path ./test.db
# Write 832.6103ms      (83.261µs/op)   (12010 op/sec)
# Read  1.0004619s      (29ns/op)       (34482758 op/sec)

Perhaps boltdb should create files with these flags in windows.

@xiang90
Contributor

xiang90 commented Feb 20, 2016

@djherbis That result seems to conflict with your RAM disk experiment. If I/O is the issue, why is moving to a RAM disk even slower than a direct write?

@djherbis
Author

@xiang90 Because it's not the disk speed that's the problem; the limiting factor here is that FlushFileBuffers is inefficient when used after every write.

I believe that's because, when you're going to flush everything to the disk after every write, it's a waste of time to write it to the Windows disk cache first. Though this is speculation.

@xiang90
Contributor

xiang90 commented Feb 20, 2016

Because it's not the disk speed that's the problem,

Then why does moving from disk to a RAM disk reduce the time from 5ms to 400µs?

@djherbis
Author

I didn't mean to say that disk speed doesn't matter; it does, as is evident from the benchmarks.

It's just that, more generally, all of the disks were suffering from poor FlushFileBuffers performance.

Also, there may still be more work to do. It appears that using FILE_FLAG_NO_BUFFERING comes with caveats: you have to use aligned writes.

https://msdn.microsoft.com/en-us/library/windows/desktop/cc644950(v=vs.85).aspx

I'm not certain how this works with memory mapped files.

@djherbis
Author

Looks like FILE_FLAG_NO_BUFFERING doesn't even work with memory mapped files.

So I guess I'm just seeing a performance boost because I'm not syncing to the physical media.

There is another function, FlushViewOfFile, which claims to flush memory-mapped files, but it doesn't block until the data is written, and the docs say to call FlushFileBuffers to ensure it gets written...

Maybe I'll experiment and see if calling it first speeds up FlushFileBuffers...

@djherbis djherbis changed the title Very poor write performance Very poor write performance [Windows FlushFileBuffers] Feb 21, 2016
@djherbis
Author

djherbis commented Apr 8, 2016

Quick update:

Currently go test takes over 12 minutes on my machine, which means I have to specify a custom timeout since go test normally gives up after 10.

Longest tests are:

TestBucket_Put_Single (463.69s)
TestOpen_Size_Large (112.25s)
TestBucket_Delete_FreelistOverflow (105.32s)

@benbjohnson
Member

@djherbis We hit the same limit on the CI, drone.io. Some of the tests do long-running randomized work. You can run a subset of the tests using the -short flag.

@djherbis
Author

djherbis commented Apr 8, 2016

@benbjohnson Thanks, yeah, I've used -short before, which is great. The main issue is that I'm still seeing sizable performance problems on Windows, but I blame Windows for this since I feel it's a limitation of FlushFileBuffers. Still hoping to find a more efficient way of syncing to the filesystem.

@enospc

enospc commented Aug 30, 2017

@djherbis did you figure out the actual fix for the slow performance on windows?

@djherbis
Author

@siginfo Unfortunately no :( I haven't looked much further into this since I last posted.

I'm still curious how other databases handle this issue on Windows, but I haven't had time to dig into it since I only use it for personal projects.
