Skip to content

Commit

Permalink
Merge pull request #28 from nodejs/vfs-requirements
Browse files Browse the repository at this point in the history
Document the complete list of requirements for the VFS
  • Loading branch information
RaisinTen authored Sep 12, 2022
2 parents 49c5d2f + 2c7aec3 commit eac8775
Show file tree
Hide file tree
Showing 2 changed files with 156 additions and 0 deletions.
2 changes: 2 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,8 @@ Table of Contents

- [Existing SEA Solutions](./docs/existing-solutions.md)
- [Production Node.js CLIs](./docs/production-nodejs-clis.md)
- Requirements
- [Virtual File System](./docs/virtual-file-system-requirements.md)

Blog
----
Expand Down
154 changes: 154 additions & 0 deletions docs/virtual-file-system-requirements.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,154 @@
Virtual File System Requirements
================================

This document aims to list all the requirements of the Virtual File System.

# Supported

## Random access reads

The VFS must support random access reads just like any other real file system,
so that the read operations can be at least as fast as reading files from the
real file system.

## Symbolic links

This is critical for applications that want to use packages like [dugite][] that
attempt to download [Git executables][] that contain symlinks. Since
Electron's [ASAR][] does not support symlinks, including [dugite][] as a
dependency in an Electron app would expand every symlink into individual files,
thus significantly increase the package size which is not nice.

## Preserve the executable bit of the file permissions

It is important to preserve the executable bit of the file permissions, so that
it is possible for the single-executable to be able to execute only executable
files. Other than that, all the bundled files would be readable and none will be
writable.

## Preserve file-hierarchy information

A filesystem is incomplete without this because there's no way for the
single-executable to be able to access nested file paths.

## No interference with valid paths in the file system

If the bundled files in the VFS correspond to certain paths that already exist
in the real file system, that will break certain use-cases, so it should use
such paths that cannot be used by existing files.

Pkg uses [`/snapshot`](https://github.com/vercel/pkg#snapshot-filesystem) as the
prefix for all the embedded files. This is confusing if `/snapshot` is an
existing directory on the file system. Docker workflows routinely copy files to,
and run things at, the root of the filesystem, so following that approach too
would run into the same problem.

Boxednode allows users to enter a [namespace](https://github.com/mongodb-js/boxednode/blob/6326e3277469e8cfe593616a0ed152600a5f9045/README.md?plain=1#L69-L72)
and uses it like so:
```js
// Specify the entrypoint target name. If this is 'foo', then the resulting
// binary will be able to load the source file as 'require("foo/foo")'.
// This defaults to the basename of sourceFile, e.g. 'bar' for '/path/bar.js'.
namespace?: string;
```

A possible solution is to use the single executable path as the base path for
the files in the VFS, i.e., if the executable has `/a/b/sea` as the path and the
VFS contains a file named `file.txt`, it would be accessible by the application
using `/a/b/sea/file.txt`. This approach is similar to how Electron's [ASAR][]
works, i.e., if the application asar is placed in `/a/b/app.asar`, the
embedded `file.txt` file would use `/a/b/app.asar/file.txt` as the path.

## Globbing

`fs.statSync(process.execPath).isDirectory()` will return `true` and
`fs.statSync(process.execPath).isFile()` will return `false`. That way, if code
within the single-executable does naive globbing using an off-the-shelf glob
library, paths inside the VFS would also get picked up.

## Accept file paths in the VFS as arguments

If a single-executable formatter is run with an argument that is a path to a
file inside the VFS, it should be able to use the `fs` APIs to read, format and
print the formatted contents to `stdout`.

## Cross-platform tooling

The tooling required for archiving / extracting files into / from the VFS must
be available on all the [platforms supported by Node.js][].

## File path contents

Should not limit the size or the character contents of the file paths to stay as
close as possible to what a real file system provides.

## Case Sensitive

From Yarn's experience with zip, forcing case sensitivity within the archives
didn't break anything, improved consistency. By contrast, making the code case
insensitive would have increased the complexity, worsened the runtime
performance, increased the attack surface, for a use case that virtually no-one
cares about. Hence, the paths in the VFS will be case sensitive.

## Dynamic imports and requires

`require(require.resolve('./file.js'))` should work for files that are on the
real file system and the VFS.

## VFS path manipulation as strings and URL objects

If someone proposes that the VFS exist at a `vfs-file://` prefix, then this
might become an issue. `fs` APIs accept `URL` objects, but this means code in
(transitive) dependencies which assumes all native paths are strings may fail
when passed `URL` objects. Perhaps a (transitive) dependency uses
`require.resolve()`.

Using something like `vfs-file://` might be a potential solution for placing the
VFS contents somewhere that has no interference with valid paths in the file
system.

## Interaction with Native Addons

TODO: Still under discussion in https://github.com/nodejs/single-executable/discussions/29.

# Not supported

## No need for supporting write operations

Since the VFS is going to be embedded into the single-executable and also
protected by codesigning, making changes to the contents of the VFS should
invalidate the signature and crash the application if run again. Hence, no write
operation needs to be supported.

# Optionally support

## Increase locality of related files

For performance reasons.

## Format implementation in multiple languages

We want this format to already have implementation in *multiple* languages (not
just JS, since not all tools used in the JS ecosystem are written in JS), all
ideally production-grade and well-maintained.

## Consensus with third-party tools on building native integrations

We want this format to be consensual enough that third-party tools (VSCode,
emacs, ...) won't object to build native integrations with it (for instance,
Esbuild recently added zip support to integrate w/ Yarn's zip installs; it would
have been a much harder sell if Yarn had used a custom-made format).

## Optional data compression

As an application grows, bundling all the source code, dependencies and static
assets into a single file without compression would quickly reach the maximum
segment / file (depending on the embedding approach) size limit imposed by the
single executable file format / OS. A solution to this problem would be to
minify the JS source files but that might not be enough for other kinds of
files, so supporting data compression seems to be a better solution.

[ASAR]: https://github.com/electron/asar
[Git executables]: https://github.com/desktop/dugite-native/releases/
[dugite]: https://www.npmjs.com/package/dugite
[platforms supported by Node.js]: https://github.com/nodejs/node/blob/main/BUILDING.md#supported-platforms

0 comments on commit eac8775

Please sign in to comment.