-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] fs: add FD object to manage file descriptors #18109
Conversation
Note: None of the existing fs APIs are modified to use this new object. To use the new object with, for instance, |
src/node_file.cc
Outdated
// occurring while cleaning up. That is not necessarily a good thing, | ||
// but as this is typically happening during GC, there's not a lot we | ||
// can reasonably do here. | ||
int ret = uv_fs_close(env()->event_loop(), &req, fd_, nullptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Take note: this uses a sync close. If we decide that it's necessary to use an async close, we can switch to that.
There is an open question about what to do with the result. We can use the new env->SetImmediate()
to schedule a callback out to javascript land after the close attempt in order to report any errors or to give the equivalent to fs.close(fd, callback)
, but given that the primary use case is for this to close on garbage collection, I did not consider adding that mechanism to be critical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is an error, will the file descriptor be leaked? Can we assure that we have no leak before swallowing?
This mechanism implies that we are cleaning up after ourselves whatever happens.
IMHO it should go to ‘uncaughtException’ if it is failing, and he OS fd is still there.
doc/api/fs.md
Outdated
Instances of the `fs.FD` object are created internally by the `fs.openFD()` | ||
and `fs.openFDSync()` methods. | ||
|
||
### fs.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be fd.close()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
Tbh, I don’t like the idea of exposing this as a public API. The JS engine is not aware about file descriptors being a (comparatively) scarce resource, so the garbage collector might not act on the need for cleaning these up; closing the FD once its usage is finished is still the right thing to do. I would definitely approve of this feature being enabled for |
doc/api/fs.md
Outdated
* `mode` {integer} **Default:** `0o666` | ||
* `callback` {Function} | ||
* `err` {Error} | ||
* `fd` {fs.FD} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be something like {[fs.FD][#fs_class_fs_fd]}
or {[fs.FD][]}
with a bottom reference?
doc/api/fs.md
Outdated
the potentially stale local cache. It has a very real impact on I/O | ||
performance so using this flag is not recommended unless it is needed. | ||
|
||
Note that this doesn't turn `fs.open()` into a synchronous blocking call. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fs.open()
-> fs.openFD()
doc/api/fs.md
Outdated
performance so using this flag is not recommended unless it is needed. | ||
|
||
Note that this doesn't turn `fs.open()` into a synchronous blocking call. | ||
If synchronous operation is desired `fs.openSync()` should be used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fs.openSync()
-> fs.openFDSync()
doc/api/fs.md
Outdated
The kernel ignores the position argument and always appends the data to | ||
the end of the file. | ||
|
||
*Note*: The behavior of `fs.open()` is platform-specific for some flags. As |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fs.open()
-> fs.openFD()
doc/api/fs.md
Outdated
|
||
```js | ||
// macOS and Linux | ||
fs.open('<directory>', 'a+', (err, fd) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fs.open(
-> fs.openFD(
doc/api/fs.md
Outdated
}); | ||
|
||
// Windows and FreeBSD | ||
fs.open('<directory>', 'a+', (err, fd) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fs.open(
-> fs.openFD(
doc/api/fs.md
Outdated
a colon, Node.js will open a file system stream, as described by | ||
[this MSDN page][MSDN-Using-Streams]. | ||
|
||
Functions based on `fs.open()` exhibit this behavior as well. eg. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fs.open()
-> fs.openFD()
? Is this note still applicable here?
doc/api/fs.md
Outdated
`fs.writeFile()`, `fs.readFile()`, etc. | ||
|
||
*Note:* On Windows, opening an existing hidden file using the `w` flag (either | ||
through `fs.open()` or `fs.writeFile()`) will fail with `EPERM`. Existing hidden |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fs.open()
-> fs.openFD()
?
doc/api/fs.md
Outdated
* `path` {string|Buffer|URL} | ||
* `flags` {string|number} | ||
* `mode` {integer} **Default:** `0o666` | ||
* Returns: {fd.FD} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same as above?
doc/api/fs.md
Outdated
* `mode` {integer} **Default:** `0o666` | ||
* Returns: {fd.FD} | ||
|
||
Synchronous version of [`fs.open()`][]. Returns an `fs.FD` object representing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fs.open()
-> fs.openFD()
with a new link?
I debugged too many leaked file descriptor situations because Node did not have this feature. I think fd manipulation is somehing very few are doing directly. For those that want to do that, this would simplify things considerably. For everybody else, it changes nothing. I’m +1. Can you also use it within streams in a separate commit in the PR? I think it will help making a case for it. Can we add something similar for net as well? Can it be made generic? |
I’ve thought about these quite a bit in the past, and the thing is, that does not work (for readable streams). These objects need to be GC roots, because unfortunately there is no reason why any For example, this code is perfectly valid (and common) releasing the stream for garbage collection would break it – and worst of all, in a non-deterministic way: fs.createReadStream('file.txt')
.on('data', ondata).on('end', end);
If that is the goal here, I would propose:
I think I could get on board with that. |
@addaleax the problem for file descriptor leaking in streams comes when you do I think the stream reference is kept alive by the closure created within |
There isn’t always a |
In the example you made, with In any case, if there is not a read scheduled and no one is holding a reference, then a file descriptor will be leaked. |
Garbage collection is non-deterministic. Basing deterministic resource management on something that isn't is a terrible idea of the you-don't-need-this-even-if-you-think-you-do variety. If the streams API has problems, they should be fixed, but not like this. |
This is not about basing deterministic resource management on gc. This is about preventing silent file descriptor leaks. |
A tool for detecting file descriptor leaks is a worthy goal but we can do much better than this PR. Just of the top off my head:
Best of all: can be done as a third-party module, doesn't need to be built-in. |
@addaleax said:
There is already a defined Web API for a
We could but I'm not a huge fan of that. The existing fs API, for all of it's warts, is familiar and well established. The promises based version should really try not to diverge from it too much in order to make code migration less painful. Eventually we might get there, and this would give us a foundation to work from, but I'd generally prefer not. What I can do right now in order to make that easier, however, is have the public API expose a pure JS object with the internal native @bnoordhuis said:
As @mcollina indicates, this is not about replacing or eliminating explicitly calling I like the idea of emitting a warning if the FWIW, I knew this bit was going to be controversial which is why I separated it out into it's own PR. There's no need to rush this part and I'd like to get it right. |
Okay, fair point – my line of thinking was, file descriptors are not necessarily a concept that JS programmers are familiar with, so we might want to provide a more explicit identifier than
We could make the methods on the new class Promises-only. 😉 But I’m also okay with saying “this is userland material” for that. |
Ok, so here's what I'm looking at for changes to this:
How does that look? |
That, to me, sounds pretty reasonable. |
Will adding a dual callback/promise system for |
@mcollina ... there won't be a dual callback. The |
Ok, I have points 1 and 2 implemented. I do not yet have point 3 implemented (failing with an unhandled error if closing during garbage collection). I will work on that next, but please take a look in the meantime |
doc/api/fs.md
Outdated
@@ -2069,6 +2107,117 @@ through `fs.open()` or `fs.writeFile()`) will fail with `EPERM`. Existing hidden | |||
files can be opened for writing with the `r+` flag. A call to `fs.ftruncate()` | |||
can be used to reset the file contents. | |||
|
|||
## fs.openFD(path, flags[, mode], callback) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this return a Promise
instead of returning a callback? the rest of the code is promise-based. It should be either 100% callbacks or 100% promises, having a callback here but fd.close().then()
is not consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm going back and forth on this and thinking that rather than introduce fs.openFD()
and fs.openFDSync()
we just introduce a Promise returning fs.promises.openFD()
and keep this isolated to use of the Promises API for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And if that is the approach to take on this, then I'll just close this PR and bundle this commit in with the one that introduces the initial broader set of promisified functions
Ok, updated the implementation on this to hit all three of the points with one exception: if close fails during the garbage collection, a This does not yet address @mcollina's last point about the ping @mcollina , @addaleax, @bnoordhuis |
src/node_file.cc
Outdated
Local<Promise> promise = promise_.Get(env_->isolate()); | ||
Local<Promise::Resolver> resolver = promise.As<Promise::Resolver>(); | ||
resolver->Resolve(env_->context(), Undefined(env_->isolate())); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: if this stuff becomes more common, it may make sense to drop a couple of new utility functions in Environment
... specifically... env->Resolve(promise, value)
and env->Reject(promise, value)
(and various reject variants like, env->RejectUVException(promise, err, syscall, ...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am LGTM on all but the fs.openFD(callback)
API.
If it's callback-based, it should all be callback based. If it's promise-based, it should all be promise based.
I propose we keep this into fs.promise
, and then we add an equivalent callback-based concept if it is needed.
@jasnell would it be better to open up a separate repo for this? So we can review the individual changes?
doc/api/fs.md
Outdated
@@ -291,6 +291,44 @@ explicitly synchronous use libuv's threadpool, which can have surprising and | |||
negative performance implications for some applications, see the | |||
[`UV_THREADPOOL_SIZE`][] documentation for more information. | |||
|
|||
## class: fs.FD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d still go by my earlier point that this should ideally not be called FD
… File
might already be taken but I’d still kinda prefer something that would be more intuitive to people who haven’t done much UNIX/systems programming…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm open to suggestions :) I'm not good at the naming of things lol. Once had a cat named Cat.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nodejs/collaborators Ideas? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FileDescriptor
, FileHandle
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's go with FileHandle
src/node_file.cc
Outdated
CHECK(persistent().IsEmpty()); | ||
if (!object().IsEmpty()) | ||
ClearWrap(object()); | ||
persistent().Reset(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The last three lines are no-ops (assuming the CHECK(persistent().IsEmpty())
does not crash)
src/node_file.cc
Outdated
int ret = uv_fs_close(env()->event_loop(), &req, fd_, nullptr); | ||
uv_fs_req_cleanup(&req); | ||
|
||
struct uv_err_detail { int ret; int fd; }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d prefer the uv_
namespace to remain reserved for libuv, otherwise it gets hard to tell which definitions are coming from where … also, it seems like you could just pass the message string as the data
argument for the immediate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the warning, sure, but not raising the UVException, which requires the errcode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right. Sorry!
// This exception will end up being fatal for the process because | ||
// it is being thrown from within the SetImmediate handler and | ||
// there is no JS stack to bubble it to. In other words, tearing | ||
// down the process is the only reasonable thing we can do here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not tearing down the process, it could still invoke uncaughtException
handlers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm will have to test and verify that. I don't know if the uncaughtException handler will get triggered in this case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok yeah, uncaughtException
does not catch the FatalException here. The process just dies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it prints the error at least, then dies :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$ ./node --expose-gc ~/tmp/test.js
undefined:0
Error: EBADF: Closing file descriptor 12 on garbage collection failed, close
src/node_file.h
Outdated
// Synchronous close that emits a warning | ||
inline void Close(); | ||
|
||
class CloseReq { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should inherit from ReqWrap<uv_fs_t>
, like e.g. FSReqWrap
does (or just inherit from the latter to begin with, that may or may not be easier)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, considered that but it makes it a bit more heavyweight than it really needs to be. Shouldn't be a problem tho.
src/node_file.cc
Outdated
HandleScope scope(env_->isolate()); | ||
Local<Promise> promise = promise_.Get(env_->isolate()); | ||
Local<Promise::Resolver> resolver = promise.As<Promise::Resolver>(); | ||
resolver->Reject(env_->context(), reason); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is async context propagated here? I can’t see anything? (same for resolve)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know less than I need to on that part. What needs to be added here to propagate that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jasnell Once CloseReq
is an ReqWrap
, you’ll want a line like InternalCallbackScope callback_scope(this);
before the resolve calls. That’s all. :)
(The idea is that the CloseReq
is an AsyncWrap
then, so it tracks state automatically, and you only need to enter a scope to make async_hooks listeners aware that things are happening in the CloseReq
’s async context.)
src/node_file.cc
Outdated
@@ -211,6 +353,16 @@ void AfterInteger(uv_fs_t* req) { | |||
req_wrap->Resolve(Integer::New(req_wrap->env()->isolate(), req->result)); | |||
} | |||
|
|||
void AfterFD(uv_fs_t* req) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you call this e.g. AfterOpenFD
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, no problem
src/node_file.cc
Outdated
CHECK_NE(*path, nullptr); | ||
|
||
int flags = args[1]->Int32Value(); | ||
int mode = static_cast<int>(args[2]->Int32Value()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use the non-deprecated overloads of Int32Value()
, or just .As<Int32>()->Value()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. Will make that change in Open() also
@mcollina ... understood on the JS api.. my plan is to get the basic implementation right for the C++ class here in this PR, but I'm going to move the actual commit into the FS promises PR along with everything else and not land this PR as is. |
Also, I don't think a separate repo is necessary. The changes are actually quite discreet are are no where near as massive as something like the http2 impl. One more PR should have the majority FS promise APIs implemented. |
Currently, if any env->SetImmediate callback uses any of the env->Throw{Whatever} variants, the exception is lost and nothing happens. This commit catches those with a v8::TryCatch and does a FatalException to tear down the process.
In preparation for providinng a promisified fs API, introduce a FD wrapper class that automatically closes the file descriptor on garbage collection, helping to ensure that the fd will not leak. The object should still be explicitly closed, and a process warning will be emitted if it is closed on gc (a fatal exception will occur if closing during gc fails). The promisified fs API will use these objects instead of the numeric file descriptors in order to ensure that fd's are closed appropriately following promise resolve or reject operations.
@addaleax ... updated :-) |
} | ||
} else { | ||
// Already closed. Just reject the promise immediately | ||
resolver->Reject(context, UVException(isolate, UV_EBADF, "close")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@addaleax ... do we also need an InternalCallbackScope here? This is not using the CloseReq
Closing in favor of #18297 |
In preparation for providinng a promisified fs API, introduce a FD wrapper class that automatically closes
the file descriptor on garbage collection, helping to ensure that the fd will not leak.
The promisified fs API will use these objects instead of the numeric file descriptors in order to ensure that fd's are closed appropriately following promise resolve or reject operations.
This was extracted from the WIP fs promises PR: #17739
/cc @addaleax @mcollina
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passesAffected core subsystem(s)
fs