
writestream should fsync(2) before close(2) #6438

Closed
polidore opened this issue Oct 30, 2013 · 19 comments

@polidore

I am having an issue where data piped to an fs.WriteStream isn't flushed to disk. I think this is because of the way WriteStream is implemented: it just calls close(2) when it gets a 'finish' event, but it should fsync(2) first. If you look at the man page for close, this bit is what I think is causing my problem:

A successful close does not guarantee that the data has been successfully saved to disk, as the kernel defers writes. It is not common for a file system to flush the buffers when the stream is closed. If you need to be sure that the data is physically stored use fsync(2). (It will depend on the disk hardware at this point.)

On this line:

https://github.com/joyent/node/blob/master/lib/fs.js#L1610

It looks like fs.WriteStream calls close when it gets 'finish' from its source.

This is the function that will be called:

https://github.com/joyent/node/blob/master/lib/fs.js#L1542

And in this function, it is clear that node is calling fs.close (http://nodejs.org/api/fs.html#fs_fs_close_fd_callback) directly, which will cause the issue I describe above.

I think the 'finish' handler should be changed to do an fsync first, then a close: http://nodejs.org/api/fs.html#fs_fs_fsync_fd_callback
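
Roughly the sequence I have in mind (an untested sketch; fsyncAndClose is a hypothetical helper, and the real change would live in WriteStream's close path):

function fsyncAndClose(fd, cb) {
  // flush the kernel's buffers to disk first...
  fs.fsync(fd, function(err) {
    if (err) return cb(err);
    // ...then close(2) as before
    fs.close(fd, cb);
  });
}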

I can put in a pull request for this, but I wanted to see if you guys agree with me first. This is only an issue for me on my production Windows server, which has very large disk caches.

Thanks!

@bnoordhuis
Member

That section of the man page only applies when the system crashes or powers down before the operating system gets a chance to flush the disk cache. It's highly unlikely that we'll change the default behavior but a patch that adds a 'fsyncOnClose' flag would be acceptable, I think. Should come with documentation, regression tests, etc.
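
Something along these lines (hypothetical; the option doesn't exist today):

// 'fsyncOnClose' is a proposed flag, not an existing option
var ws = fs.createWriteStream('/var/tmp/out.log', { fsyncOnClose: true });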

@polidore
Author

Are you sure about your interpretation of the man page? I don't think it reads that way. It's talking about a successful close.

Anyway, I will take a look at adding the flag you describe. For now, I'm working around this issue by subscribing to the 'finish' event on my write stream with a call to fs.fsyncSync(writeStream.fd) before I call pipe. This works, but it's obviously not pretty!

@bnoordhuis
Member

Are you sure about your interpretation of the man page? I don't think it reads that way. It's talking about a successful close.

On all the platforms that we support, the operating system's internal view of the file system (and therefore the view that's presented to user space) is always consistent unless there's been an interruption from outside, like a brownout.

Of course, it's possible for disk controllers to lie about whether the data has actually been written to disk. But in that case, fsync() won't help either.

@polidore
Author

OK. It's odd that calling fsync as described fixes this. It could definitely be the way the driver is implemented on this server hardware. One more quote from man 3 close:

When there is an outstanding cancelable asynchronous I/O operation against fildes when close() is called, that I/O operation may be canceled.

I'm not sure when an async I/O operation is "cancelable", though.

@bnoordhuis
Member

I'm not sure when an async I/O operation is "cancelable", though.

That's not really applicable to node.js. All file operations are synchronous but executed inside a thread pool. (Besides that, I don't think Windows AIO semantics quite follow POSIX AIO.)
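
In other words, by the time a write callback fires, the write(2) syscall has fully returned; whether the data has hit the platter is a separate question, and that's what fsync(2) is for. A sketch (fd and buf are placeholders):

fs.write(fd, buf, 0, buf.length, null, function(err, written) {
  // write(2) has returned: the data is in the kernel's page cache,
  // but not necessarily on the disk yet
});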

@sam-github

It's odd that calling fsync as described fixes this.

@polidore, you didn't actually say what problem you are having, or what fsync-before-close "fixed".

If you are closing the stream and then power-cycling your machine then, as Ben says, fsync "might" help you (you are still at the mercy of lower layers of the I/O stack, particularly drive controllers). But if, after a successful close, another process on the same OS does NOT see the file as completely written... that would be weird, and most likely a race condition between your two processes, not an OS or node bug.

@polidore
Author

Sam, sorry if I wasn't clear.

I'm doing something like this:

// rstream is the source readable stream (a download, in my case)
var fstream = fs.createWriteStream(savePath + file);
fstream.once('close', process.exit);
rstream.pipe(fstream);

My file will not have been flushed properly.

If i do:

var fstream = fs.createWriteStream(savePath + file);
// fsync while the fd is still open: 'finish' fires before the fd is closed
fstream.once('finish', function() { fs.fsyncSync(fstream.fd); });
fstream.once('close', process.exit);
rstream.pipe(fstream);

It works properly.

I also tried just waiting 10 seconds after I get 'close', and that doesn't work. It seems like the read stream finishes and instantly closes the write stream and something prevents the data from being written to disk.

I'm using Windows Server 2008 with 64-bit node v0.10.20. I believe the drives are Hitachi SCSI drives.

@polidore
Author

@bnoordhuis I didn't realize you weren't using the OS's async I/O. You're right, then; those comments from the man page aren't relevant.

@isaacs

isaacs commented Oct 30, 2013

@polidore If the write callback is being fired before the write operation is completed, then I'd think that this is a bug lower in libuv, because the contract is for the write to have completed before calling the callback, which (as @bnoordhuis said) should always be guaranteed to be consistent unless there is a power outage.

@piscisaureus @sblom Is this a Windows-specific bug? I've never seen this behavior on Unix, and I'm sure it would have caused some npm problems by now if it was a general fs bug. E.g., note the streams3 regression that made npm virtually unusable due to file truncation problems. npm does not tolerate file truncation well! (Which is correct, and a useful crisp failure mode, of course.)

@polidore
Author

One thing to note: this stream is opened and then closed in potentially less than a millisecond. The script I wrote here is downloading a 30 KB file over gigabit Ethernet on a very, very fast server. Maybe the very short duration contributes to the issue.

Thanks again for your attention here.

@matthiasg

This is an issue on MacOS and SmartOS as well. We have a case where we wait for the 'close' event and do an fs.renameSync in its callback, which sometimes (but quite consistently, seemingly depending on file size) leaves the files truncated on disk.

Adding a sleep helped in our case, but we are of course not happy with that.

Adding fsyncSync(targetStream.fd) caused a Bad Argument exception, since the fd is set to null before the callback is called (it's a number at the beginning). So the only way we've found to make it somewhat 'safe' is by adding a timeout.
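
One alternative we haven't tried yet: capture the fd from the stream's 'open' event, before the stream nulls it out, and fsync on 'finish', while the fd is still open (untested sketch; target is a placeholder):

var ws = fs.createWriteStream(target);
ws.once('open', function(fd) {
  // 'finish' fires before the fd is closed, 'close' fires after
  ws.once('finish', function() {
    fs.fsyncSync(fd);
  });
});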

This is on node 0.10.30 on Windows 8 64-bit (SSD drives), MacOS 10.9, and the newest SmartOS release.

@polidore
Author

polidore commented Sep 5, 2014

I just stopped using streams and it worked fine with regular callbacks.


@matthiasg

What exactly does "worked fine with regular callbacks" mean? Do you mean manually listening for 'data' events and calling write yourself?

I have to say that the different events published by the different streams are quite confusing, though. Some streams raise 'finish', some 'end', some 'close'. In what order? Do all streams do that? Do even custom streams implementing e.g. the writable stream interface do that?

That is all in the sources, of course, but the documentation is not straightforward, and it raises a number of questions and issues for consumers of streams, especially when they might not know the exact stream type they are exposed to.
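
For a plain fs.WriteStream, at least, a quick experiment suggests 'finish' fires once all data has been handed to the fd, and 'close' fires after the fd has been closed:

var ws = fs.createWriteStream('/tmp/order-test');
ws.on('finish', function() { console.log('finish'); }); // all writes flushed to the fd
ws.on('close', function() { console.log('close'); });   // fd has been closed
ws.end('hello');
// prints 'finish', then 'close'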

Cheers


@misterdjules

@polidore I couldn't reproduce the issue you described, either on MacOS X or Windows 7. Here's the script I used:

var fs = require('fs');

// copy this script to a '.copy' sibling and exit as soon as 'close' fires
var writeStream = fs.createWriteStream(__filename + '.copy');
var readStream = fs.createReadStream(__filename);

writeStream.once('close', process.exit);
readStream.pipe(writeStream);

The content of the file with the .copy extension was always identical to the content of the source file. Did I miss something?

@matthiasg Do you have a small test case that reproduces your issue? Regarding the stream documentation, I would encourage you to create a separate issue.

Thank you!

@matthiasg

@misterdjules I am having stream issues such as this: #8488. We were downloading zip files from an HTTP server, and it was quite difficult to get the implementation stable enough to consistently download and store the entire file on disk (without sometimes missing some bytes).

In the above case it is related to the interaction of HTTP streams piped into file streams. Still, I would assume that taking an HTTP stream (downloading a binary) and piping it (using the official pipe method) into an fs.createWriteStream should NEVER leave files half finished without raising an error event (unless the program or machine crashed, of course).
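
For reference, the shape of our download path is roughly this (simplified sketch; url, dest, and onError are placeholders, and the real code also checks the status code):

var http = require('http');
var fs = require('fs');

http.get(url, function(res) {
  var ws = fs.createWriteStream(dest);
  res.on('error', onError); // network error mid-download
  ws.on('error', onError);  // disk error
  ws.on('close', function() {
    // only now should the file be touched (renamed, unzipped, ...)
  });
  res.pipe(ws);
});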

@misterdjules

Closing this in favor of #8488 for lack of information and because it doesn't seem to be an issue for the original reporter. Please let us know if you think this issue is still relevant.

@quyin

quyin commented Jul 8, 2015

@polidore Could you explain what you mean by "stopped using streams and it worked fine with regular callbacks"? I think I'm having a similar issue here and am still trying to figure out a workaround. Thanks!

@donaldpipowitch

Ditto. It looks like we have a similar problem, too. It leads to this:

ERR! build [ { [Error: ENOENT: no such file or directory, chmod '/foo.js']
ERR! build     errno: -2,
ERR! build     code: 'ENOENT',
ERR! build     syscall: 'chmod',
ERR! build     path: '/foo.js' } ]

@donaldpipowitch

We use something like this for copying files now:

// note: createOutputStream comes from fs-extra (it also creates missing
// parent directories); a plain fs.createWriteStream works here too
function copyByStream(src, dest) {
  return new Promise((resolve, reject) => {
    let rs = fs.createReadStream(src);
    let ws = fs.createOutputStream(dest);

    rs.once('error', reject);
    ws.once('error', reject);

    // grab the fd from 'open' and fsync on 'finish', while the fd is
    // still open ('close' fires after the fd has already been closed,
    // so an fsync there would fail with EBADF)
    ws.once('open', fd => {
      ws.once('finish', () => {
        fs.fsync(fd, err => (err ? reject(err) : resolve()));
      });
    });

    rs.pipe(ws);
  });
}

No errors anymore for us.
