fs: readdir optionally returning type information #22020

Closed
wants to merge 10 commits into from
101 changes: 99 additions & 2 deletions doc/api/fs.md
@@ -283,6 +283,92 @@ synchronous use libuv's threadpool, which can have surprising and negative
performance implications for some applications. See the
[`UV_THREADPOOL_SIZE`][] documentation for more information.

## Class: fs.Dirent
<!-- YAML
added: REPLACEME
-->

When [`fs.readdir()`][] or [`fs.readdirSync()`][] is called with the
`withFileTypes` option set to `true`, the resulting array is filled with
`fs.Dirent` objects, rather than strings or `Buffers`.
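
A minimal sketch of how this might look in practice, assuming a Node.js release that includes the `withFileTypes` option. The scratch directory and the `kinds` map are illustrative names, not part of the patch:

```js
const fs = require('fs');
const os = require('os');
const path = require('path');

// Build a scratch directory holding one regular file and one subdirectory.
const scratchDir = fs.mkdtempSync(path.join(os.tmpdir(), 'dirent-demo-'));
fs.writeFileSync(path.join(scratchDir, 'notes.txt'), 'hello');
fs.mkdirSync(path.join(scratchDir, 'nested'));

// With withFileTypes: true, every element is an fs.Dirent rather than a string.
const entries = fs.readdirSync(scratchDir, { withFileTypes: true });

// Classify each entry without issuing a separate fs.stat() call from user code.
const kinds = {};
for (const entry of entries) {
  if (entry.isDirectory()) {
    kinds[entry.name] = 'directory';
  } else if (entry.isFile()) {
    kinds[entry.name] = 'file';
  } else {
    kinds[entry.name] = 'other';
  }
}
console.log(kinds);
```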

### dirent.isBlockDevice()
<!-- YAML
added: REPLACEME
-->

* Returns: {boolean}

Returns `true` if the `fs.Dirent` object describes a block device.

### dirent.isCharacterDevice()
<!-- YAML
added: REPLACEME
-->

* Returns: {boolean}

Returns `true` if the `fs.Dirent` object describes a character device.

### dirent.isDirectory()
<!-- YAML
added: REPLACEME
-->

* Returns: {boolean}

Returns `true` if the `fs.Dirent` object describes a file system
directory.

### dirent.isFIFO()
<!-- YAML
added: REPLACEME
-->

* Returns: {boolean}

Returns `true` if the `fs.Dirent` object describes a first-in-first-out
(FIFO) pipe.

### dirent.isFile()
<!-- YAML
added: REPLACEME
-->

* Returns: {boolean}

Returns `true` if the `fs.Dirent` object describes a regular file.

### dirent.isSocket()
<!-- YAML
added: REPLACEME
-->

* Returns: {boolean}

Returns `true` if the `fs.Dirent` object describes a socket.

### dirent.isSymbolicLink()
<!-- YAML
added: REPLACEME
-->

* Returns: {boolean}

Returns `true` if the `fs.Dirent` object describes a symbolic link.


### dirent.name
<!-- YAML
added: REPLACEME
-->

* {string|Buffer}

The file name that this `fs.Dirent` object refers to. The type of this
value is determined by the `options.encoding` passed to [`fs.readdir()`][] or
[`fs.readdirSync()`][].
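
As an illustration of the encoding rule above, the following sketch reads the same directory twice, once with the default encoding and once with `'buffer'`. It assumes a Node.js release with `withFileTypes` support; the directory and variable names are made up for the example:

```js
const fs = require('fs');
const os = require('os');
const path = require('path');

// Scratch directory containing a single file.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'dirent-name-'));
fs.writeFileSync(path.join(demoDir, 'a.txt'), '');

// Default encoding ('utf8'): dirent.name is a string.
const [asString] = fs.readdirSync(demoDir, { withFileTypes: true });

// encoding: 'buffer': dirent.name is a Buffer holding the raw file name bytes.
const [asBuffer] = fs.readdirSync(demoDir, {
  withFileTypes: true,
  encoding: 'buffer'
});

console.log(typeof asString.name);
console.log(Buffer.isBuffer(asBuffer.name));
```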

## Class: fs.FSWatcher
<!-- YAML
added: v0.5.8
@@ -2319,9 +2405,10 @@ changes:
* `path` {string|Buffer|URL}
* `options` {string|Object}
* `encoding` {string} **Default:** `'utf8'`
* `withFileTypes` {boolean} **Default:** `false`
* `callback` {Function}
* `err` {Error}
* `files` {string[]|Buffer[]}
* `files` {string[]|Buffer[]|fs.Dirent[]}

Asynchronous readdir(3). Reads the contents of a directory.
The callback gets two arguments `(err, files)` where `files` is an array of
@@ -2332,6 +2419,9 @@ object with an `encoding` property specifying the character encoding to use for
the filenames passed to the callback. If the `encoding` is set to `'buffer'`,
the filenames returned will be passed as `Buffer` objects.

If `options.withFileTypes` is set to `true`, the `files` array will contain
[`fs.Dirent`][] objects.
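
A hedged sketch of the callback form, assuming a Node.js release with `withFileTypes` support. `listSubdirectories` is a hypothetical helper name introduced only for this example:

```js
const fs = require('fs');

// Hypothetical helper: reports the names of the immediate subdirectories of
// `dir` through a Node.js-style (err, names) callback.
function listSubdirectories(dir, callback) {
  fs.readdir(dir, { withFileTypes: true }, (err, entries) => {
    if (err) {
      callback(err);
      return;
    }
    // Each entry is an fs.Dirent, so the type check needs no extra stat call
    // in user code.
    const names = entries
      .filter((entry) => entry.isDirectory())
      .map((entry) => entry.name);
    callback(null, names);
  });
}
```

Calling `listSubdirectories('.', (err, names) => { ... })` would then yield only the directory names.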

## fs.readdirSync(path[, options])
<!-- YAML
added: v0.1.21
@@ -2345,7 +2435,8 @@ changes:
* `path` {string|Buffer|URL}
* `options` {string|Object}
* `encoding` {string} **Default:** `'utf8'`
* Returns: {string[]} An array of filenames excluding `'.'` and `'..'`.
* `withFileTypes` {boolean} **Default:** `false`
* Returns: {string[]|Buffer[]|fs.Dirent[]}

Synchronous readdir(3).

@@ -2354,6 +2445,9 @@ object with an `encoding` property specifying the character encoding to use for
the filenames returned. If the `encoding` is set to `'buffer'`,
the filenames returned will be passed as `Buffer` objects.

If `options.withFileTypes` is set to `true`, the result will contain
[`fs.Dirent`][] objects.

## fs.readFile(path[, options], callback)
<!-- YAML
added: v0.1.29
@@ -4637,6 +4731,7 @@ the file contents.
[`WriteStream`]: #fs_class_fs_writestream
[`EventEmitter`]: events.html
[`event ports`]: http://illumos.org/man/port_create
[`fs.Dirent`]: #fs_class_fs_dirent
[`fs.FSWatcher`]: #fs_class_fs_fswatcher
[`fs.Stats`]: #fs_class_fs_stats
[`fs.access()`]: #fs_fs_access_path_mode_callback
@@ -4652,6 +4747,8 @@ the file contents.
[`fs.mkdtemp()`]: #fs_fs_mkdtemp_prefix_options_callback
[`fs.open()`]: #fs_fs_open_path_flags_mode_callback
[`fs.read()`]: #fs_fs_read_fd_buffer_offset_length_position_callback
[`fs.readdir()`]: #fs_fs_readdir_path_options_callback
[`fs.readdirSync()`]: #fs_fs_readdirsync_path_options
[`fs.readFile()`]: #fs_fs_readfile_path_options_callback
[`fs.readFileSync()`]: #fs_fs_readfilesync_path_options
[`fs.realpath()`]: #fs_fs_realpath_path_options_callback
23 changes: 19 additions & 4 deletions lib/fs.js
@@ -58,6 +58,8 @@ const { getPathFromURL } = require('internal/url');
const internalUtil = require('internal/util');
const {
copyObject,
Dirent,
getDirents,
getOptions,
nullCheck,
preprocessSymlinkDestination,
@@ -773,8 +775,19 @@ function readdir(path, options, callback) {
validatePath(path);

const req = new FSReqCallback();
req.oncomplete = callback;
binding.readdir(pathModule.toNamespacedPath(path), options.encoding, req);
if (!options.withFileTypes) {
req.oncomplete = callback;
} else {
req.oncomplete = (err, result) => {
if (err) {
callback(err);
return;
}
getDirents(path, result, callback);
};
}
binding.readdir(pathModule.toNamespacedPath(path), options.encoding,
!!options.withFileTypes, req);
}

function readdirSync(path, options) {
@@ -783,9 +796,10 @@ function readdirSync(path, options) {
validatePath(path);
const ctx = { path };
const result = binding.readdir(pathModule.toNamespacedPath(path),
options.encoding, undefined, ctx);
options.encoding, !!options.withFileTypes,
undefined, ctx);
handleErrorFromBinding(ctx);
return result;
return options.withFileTypes ? getDirents(path, result) : result;
}

function fstat(fd, options, callback) {
@@ -1819,6 +1833,7 @@ module.exports = fs = {
writeFileSync,
write,
writeSync,
Dirent,
Stats,

get ReadStream() {
12 changes: 10 additions & 2 deletions lib/internal/fs/promises.js
@@ -19,6 +19,7 @@ const { getPathFromURL } = require('internal/url');
const { isUint8Array } = require('internal/util/types');
const {
copyObject,
getDirents,
getOptions,
getStatsFromBinding,
nullCheck,
@@ -37,10 +38,13 @@ const {
validateUint32
} = require('internal/validators');
const pathModule = require('path');
const { promisify } = require('internal/util');

const kHandle = Symbol('handle');
const { kUsePromises } = binding;

const getDirectoryEntriesPromise = promisify(getDirents);

class FileHandle {
constructor(filehandle) {
this[kHandle] = filehandle;
@@ -312,8 +316,12 @@ async function readdir(path, options) {
options = getOptions(options, {});
path = getPathFromURL(path);
validatePath(path);
return binding.readdir(pathModule.toNamespacedPath(path),
options.encoding, kUsePromises);
const result = await binding.readdir(pathModule.toNamespacedPath(path),
options.encoding, !!options.withFileTypes,
kUsePromises);
return options.withFileTypes ?
getDirectoryEntriesPromise(path, result) :
result;
}
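
For comparison, a minimal sketch of how the promise-based variant might be used from application code, assuming a Node.js release in which `fs.promises.readdir()` honors `withFileTypes`. `countByType` is a hypothetical helper, not part of the patch:

```js
const fsPromises = require('fs').promises;

// Hypothetical helper: tallies the entries of `dir` by broad type.
async function countByType(dir) {
  // Resolves to an array of fs.Dirent objects when withFileTypes is true.
  const entries = await fsPromises.readdir(dir, { withFileTypes: true });
  const counts = { files: 0, directories: 0, other: 0 };
  for (const entry of entries) {
    if (entry.isFile()) {
      counts.files++;
    } else if (entry.isDirectory()) {
      counts.directories++;
    } else {
      counts.other++;
    }
  }
  return counts;
}
```

A caller could run `countByType('.').then(console.log)` to print the tally for the current directory.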

async function readlink(path, options) {
117 changes: 116 additions & 1 deletion lib/internal/fs/utils.js
@@ -12,6 +12,8 @@ const {
const { isUint8Array } = require('internal/util/types');
const pathModule = require('path');
const util = require('util');
const kType = Symbol('type');
const kStats = Symbol('stats');

const {
O_APPEND,
@@ -31,24 +33,135 @@ const {
S_IFREG,
S_IFSOCK,
UV_FS_SYMLINK_DIR,
UV_FS_SYMLINK_JUNCTION
UV_FS_SYMLINK_JUNCTION,
UV_DIRENT_UNKNOWN,
UV_DIRENT_FILE,
UV_DIRENT_DIR,
UV_DIRENT_LINK,
UV_DIRENT_FIFO,
UV_DIRENT_SOCKET,
UV_DIRENT_CHAR,
UV_DIRENT_BLOCK
} = process.binding('constants').fs;

const isWindows = process.platform === 'win32';

let fs;
function lazyLoadFs() {
if (!fs) {
fs = require('fs');
}
Member

@jdalton jdalton Aug 7, 2018

Can we verify that fs isn't already loaded before this? 👆

Member Author

internal/fs/utils is required by fs, so the dependency is circular. That's why lazily loading it is necessary.

return fs;
}

function assertEncoding(encoding) {
if (encoding && !Buffer.isEncoding(encoding)) {
throw new ERR_INVALID_OPT_VALUE_ENCODING(encoding);
}
}

class Dirent {
constructor(name, type) {
this.name = name;
this[kType] = type;
}

isDirectory() {
return this[kType] === UV_DIRENT_DIR;
}

isFile() {
return this[kType] === UV_DIRENT_FILE;
}

isBlockDevice() {
return this[kType] === UV_DIRENT_BLOCK;
}

isCharacterDevice() {
return this[kType] === UV_DIRENT_CHAR;
}

isSymbolicLink() {
return this[kType] === UV_DIRENT_LINK;
}

isFIFO() {
return this[kType] === UV_DIRENT_FIFO;
}

isSocket() {
return this[kType] === UV_DIRENT_SOCKET;
}
}

class DirentFromStats extends Dirent {
constructor(name, stats) {
super(name, null);
this[kStats] = stats;
}
}

for (const name of Reflect.ownKeys(Dirent.prototype)) {
if (name === 'constructor') {
continue;
}
DirentFromStats.prototype[name] = function() {
return this[kStats][name]();
};
}
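
The prototype loop above can be illustrated in isolation. The sketch below uses stand-in classes (`EntryBase` and `EntryFromStats` are hypothetical names, and only two type methods are included) to show how each method on the subclass is replaced with one that forwards to the stored `fs.Stats` object:

```js
const fs = require('fs');
const os = require('os');

const kStats = Symbol('stats');

// Stand-in for Dirent: the method bodies are placeholders because the
// subclass below overrides every one of them.
class EntryBase {
  isDirectory() { return false; }
  isFile() { return false; }
}

// Stand-in for DirentFromStats: keeps the fs.Stats object under a symbol key.
class EntryFromStats extends EntryBase {
  constructor(name, stats) {
    super();
    this.name = name;
    this[kStats] = stats;
  }
}

// Copy every method name except the constructor, delegating to fs.Stats,
// which exposes methods with the same names.
for (const method of Reflect.ownKeys(EntryBase.prototype)) {
  if (method === 'constructor') {
    continue;
  }
  EntryFromStats.prototype[method] = function() {
    return this[kStats][method]();
  };
}

const tmpEntry = new EntryFromStats('tmp', fs.statSync(os.tmpdir()));
console.log(tmpEntry.isDirectory(), tmpEntry.isFile());
```

The delegation works because the method names on `fs.Dirent` deliberately mirror those on `fs.Stats`.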

function copyObject(source) {
var target = {};
for (var key in source)
target[key] = source[key];
return target;
}

function getDirents(path, [names, types], callback) {
var i;
if (typeof callback == 'function') {
const len = names.length;
let toFinish = 0;
for (i = 0; i < len; i++) {
const type = types[i];
if (type === UV_DIRENT_UNKNOWN) {
const name = names[i];
const idx = i;
toFinish++;
lazyLoadFs().stat(pathModule.resolve(path, name), (err, stats) => {
Comment:

Wow, it’s a surprise for me.

If I understand correctly, this means that when the type of the current entry is unknown, fs.stat is called.

But what if I neither want nor expect this behavior? For example, what if I want to collect the unknown entries and handle them myself?

@mrmlnc mrmlnc Aug 9, 2018

Are we sure we want to have this behavior?

At a minimum, for me, DirEntry is associated with entry information, including information that the type is unknown for the current entry.

It's confusing.

Member Author

This behavior was specifically asked for by @addaleax in this comment: #22020 (comment)

isUnknown() will still return true for entries whose type information was retrieved via fs.stat, so for the use case you're describing, information is not lost.

Comment:

So, in this case, the isUnknown method is not described in the documentation (doc/api/fs.md Dirent class).

But, unfortunately, I still don't understand why the low-level tool has an implicit action.

For example, I intended to make additional logic for unknown-entries in the package:

Read the directory. Get a set of entries. If the user wants to see fs.Stats for entries, then we get it.

In the current situation we will receive the fs.Stats twice (no big deal) if the system doesn't give us types to entries. I understand that this is a degenerate example.

P.S.: That's interesting, can the same system provide a type for some entries, and for some not.

❤️ In any case, thank you for your work!

Member Author

So, in this case, the isUnknown method is not described in the documentation (doc/api/fs.md Dirent class).

That's a mistake on my part. Whoops! I'll add it in.

But, unfortunately, I still don't understand why the low-level tool has an implicit action.

The intent (I think, based on the comment by @addaleax) is to actually have directory entry type information in as many cases as possible, and not depend on the user code to perform an extra action to get the information they asked for in the first place in the event of an easily recoverable bad case (i.e. isUnknown). This seems reasonable to me.

In the current situation we will receive the fs.Stats twice (no big deal) if the system doesn't give us types to entries. I understand that this is a degenerate example.

Perhaps a future optimization PR could cache the results of the first stat call? (Maybe clearing that cache on the next tick?)

P.S.: That's interesting, can the same system provide a type for some entries, and for some not.

Excellent question! To be honest, I'm not really sure what the exhaustive list of situations is, in which isUnknown returns true. I do know that it seemed to show up reliably in CI on SmartOS and AIX for entries that I knew were ordinary files, and fs.stat works just fine on them. I'm basically just going with what libuv reports.

Member Author

@mrmlnc Sorry, I looked over my code again, and it seems that my intention was actually to remove the isUnknown method. It is now unnecessary, since the underlying fs.stat call makes the type always knowable. I think that's the best path to take here, so I'll go ahead and remove isUnknown from the implementation and the test file.

That being said, if you do find yourself in a situation where you want to know whether fs.stat has been called internally, note that the Dirent instances in that case will actually be instances of DirentFromStats, an undocumented subclass I used in order to make it easy to have the methods from Stats fill in the missing data. So you can always detect that if you really need that information.

Sorry for the confusion! It's getting late here (UTC-7), haha.

Comment:

In that case, I agree with the current implementation. Thanks!

if (err) {
callback(err);
return;
}
names[idx] = new DirentFromStats(name, stats);
if (--toFinish === 0) {
callback(null, names);
}
});
} else {
names[i] = new Dirent(names[i], types[i]);
}
}
if (toFinish === 0) {
callback(null, names);
}
} else {
const len = names.length;
for (i = 0; i < len; i++) {
const type = types[i];
if (type === UV_DIRENT_UNKNOWN) {
const name = names[i];
const stats = lazyLoadFs().statSync(pathModule.resolve(path, name));
Contributor

Maybe that was asked before (I didn't read through all comments): why does this use resolve instead of join? At this point name is not expected to be an absolute path.
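
To make the difference behind this question concrete, here is a small sketch with made-up path segments. `path.join()` only concatenates and normalizes its arguments, while `path.resolve()` additionally anchors a relative result to the current working directory:

```js
const path = require('path');

// Illustrative relative directory and entry name.
const joined = path.join('some/dir', 'entry.txt');
const resolved = path.resolve('some/dir', 'entry.txt');

console.log(path.isAbsolute(joined));   // false: still relative
console.log(path.isAbsolute(resolved)); // true: anchored to process.cwd()

// For a relative first argument, resolve() matches join() prefixed with cwd.
console.log(resolved === path.join(process.cwd(), joined)); // true
```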

Contributor

Why does this use statSync instead of lstatSync? This makes the result inconsistent: in one case a Dirent may be reported as a symbolic link, but when readdir gives type === UV_DIRENT_UNKNOWN, symlinks are resolved and the type of the link target is used instead.
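
The inconsistency described in this comment can be reproduced with a short sketch. The file and link names are illustrative, and creating symbolic links may require extra privileges on Windows:

```js
const fs = require('fs');
const os = require('os');
const path = require('path');

// Scratch directory with a regular file and a symbolic link pointing at it.
const linkDemoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'lstat-demo-'));
const targetPath = path.join(linkDemoDir, 'target.txt');
const linkPath = path.join(linkDemoDir, 'link');
fs.writeFileSync(targetPath, 'data');
fs.symlinkSync(targetPath, linkPath);

// stat() follows the link and describes the target file.
const followed = fs.statSync(linkPath);
// lstat() describes the link itself.
const notFollowed = fs.lstatSync(linkPath);

console.log(followed.isSymbolicLink(), followed.isFile());       // false true
console.log(notFollowed.isSymbolicLink(), notFollowed.isFile()); // true false
```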

names[i] = new DirentFromStats(name, stats);
} else {
names[i] = new Dirent(names[i], types[i]);
}
}
return names;
}
}
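
One detail of the asynchronous branch above: as written, each failed stat call appears to invoke `callback(err)` on its own, so a directory with several failing entries could trigger the callback more than once. The following is a minimal sketch of a once-only guard around the same pending-counter pattern. `statAll` is a hypothetical helper written for illustration and is not the patch's code:

```js
const fs = require('fs');

// Hypothetical helper: stats every path in `paths` concurrently and invokes
// `callback` exactly once, with the first error or with all results.
function statAll(paths, callback) {
  const results = new Array(paths.length);
  let pending = paths.length;
  let done = false;

  if (pending === 0) {
    // Stay asynchronous even when there is nothing to stat.
    process.nextTick(callback, null, results);
    return;
  }

  paths.forEach((p, idx) => {
    fs.stat(p, (err, stats) => {
      if (done) {
        return; // An earlier failure has already been reported.
      }
      if (err) {
        done = true;
        callback(err);
        return;
      }
      results[idx] = stats;
      if (--pending === 0) {
        done = true;
        callback(null, results);
      }
    });
  });
}
```

The `done` flag is the only addition relative to the counter pattern: it turns later completions into no-ops once an error has been delivered.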

function getOptions(options, defaultOptions) {
if (options === null || options === undefined ||
typeof options === 'function') {
@@ -342,6 +455,8 @@ function validatePath(path, propName = 'path') {
module.exports = {
assertEncoding,
copyObject,
Dirent,
getDirents,
getOptions,
nullCheck,
preprocessSymlinkDestination,