
New File System Implementation #15041

Closed
ethanalee opened this issue Sep 14, 2021 · 70 comments

@ethanalee
Collaborator

ethanalee commented Sep 14, 2021

Objective: Write a high-performance multi-threaded file system layer that replaces the current implementation in library_fs.js.

Goals:

  • Faster file system operations
  • Multi-threaded without relying on proxying
  • Smaller code size footprint
  • Per-file persistence
  • Support multiple async and sync backends
  • Support for large files
  • POSIX compatibility via syscalls

Design Sketch Link

Click here to view/comment

@ethanalee ethanalee self-assigned this Sep 14, 2021
@ethanalee ethanalee changed the title File system Rewrite New File System Implementation Sep 14, 2021
@kripken
Member

kripken commented Sep 14, 2021

Thanks @ethanalee !

I'd add

(no need to actually edit the top comment).

@hcldan

hcldan commented Sep 14, 2021

I'd like to help however I can in moving this forward.

@kripken
Member

kripken commented Sep 14, 2021

@hcldan Help would certainly be appreciated here!

For now, it's still early for writing code, but we've been reading library_fs.js and related files in order to understand the current capabilities, so that we know what not to regress. If you or others want to help later, reading that code now might end up useful.

Another issue is testing. Reviewing the current tests and seeing if there is any lack of coverage would be good, though that might not be easy to do. One possible way to combine this with the previous point is to find interesting/important areas in library_fs.js and then to see if intentionally breaking them (adding a throw, or returning a wrong value, etc.) causes a test to break - if no test breaks, that's missing coverage. (For doing that, testing ./tests/runner.py wasm0 other should be enough.)
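
To make the fault-injection idea concrete, here is a rough sketch of a hypothetical helper (not an existing test utility) that monkey-patches one FS method at runtime; if the suite still passes with the fault in place, that method lacks coverage:

// Hypothetical coverage probe: force an FS method to always fail, run the
// tests, then restore the original. A green suite means missing coverage.
function injectFault(FS, method) {
  const original = FS[method];
  FS[method] = function () {
    throw new FS.ErrnoError(5); // EIO: deliberately broken
  };
  return () => { FS[method] = original; }; // call to undo the fault
}

// e.g. const restore = injectFault(FS, 'write');
// ...run ./tests/runner.py wasm0 other...
// restore();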

@curiousdannii
Contributor

Emscripten's FS shouldn't require multithreading though.

@kripken
Member

kripken commented Sep 15, 2021

@curiousdannii

Yes, definitely!

The singlethreaded case should continue to behave as it has been. Perhaps it will be smaller, but that's about it.

The multithreaded case should get a lot faster (by avoiding constantly proxying to the main thread).

@hcldan

hcldan commented Sep 16, 2021

I've been looking at library_idbfs.js as it shows a way to do async storage operations... does anyone have any experience with this that I could chat with? I'd like to understand better what's going on here.

It looks like FS is a MEMFS that lives in the main browser thread, and we rely on these syncfs calls to keep the MEMFS data current with what's stored in IndexedDB. But that seems to take care of persistence only, and not actually offload the storage to disk and out of memory... am I correct here?
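
For reference, the flow being described is the standard IDBFS pattern, roughly:

// IDBFS usage today: file operations happen in MEMFS (in memory); FS.syncfs
// copies data between MEMFS and IndexedDB for persistence.
FS.mkdir('/data');
FS.mount(IDBFS, {}, '/data');
FS.syncfs(true, (err) => {                // true: populate from IndexedDB
  if (err) throw err;
  FS.writeFile('/data/hello.txt', 'hi');  // lives in memory until synced
  FS.syncfs(false, (err2) => {            // false: persist to IndexedDB
    if (err2) throw err2;
  });
});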

Right now the system calls in Emscripten proxy to the main UI thread (with pthreads)... that means these calls are already async... Where are they async? Can we push that async interface down to the FS layer... and then perhaps shim in the same async adaptation that's being done for proxying, in the case where it's a single-threaded app?

I started all this because the SF APIs don't have sync counterparts in the main browser thread, and I can plug the async SF APIs into library_idbfs.js and put data in FS and metadata in idbfs... but that won't help us if all the file data stays in memory.

sidebar: I'm also looking for help understanding this module architecture... any good place to start looking?
The idbfs implementation looks entirely different in surface area from the SFfs module... how are things hooked up? It's very confusing.

@hcldan

hcldan commented Sep 16, 2021

I think I found some of the answers to my question here: https://github.com/emscripten-core/emscripten/blob/main/system/lib/pthread/library_pthread.c#L485

@hcldan

hcldan commented Sep 16, 2021

Would always proxying FS calls deadlock a single-threaded program? I don't know how this wait works.
I think if we could turn the FS proxy always on, and make FS naturally async, it would give us more options. Perhaps the proxy layer for FS could be moved into the FS module itself.

I'm having a hard time grokking the pthread code that looks at __proxy. It's a lot of code writing code.
I can't tell if it's general-purpose or custom-built to work only with syscalls.

@hcldan

hcldan commented Sep 16, 2021

As far as requirements go:

  • Do not keep FS data in memory, delegate all IO to mounted filesystems. (eliminate syncfs)
    • FS metadata might be fine to keep in memory, as I think SF doesn't have a good way to implement file locking, read-only/read-write modes, or opening a file more than once.

@hcldan

hcldan commented Sep 16, 2021

I actually think it might be a nice IDBFS enhancement to leave all file metadata in IDBFS and put file content in SF (if available). But we would need to eliminate syncfs and the MEMFS copy of FS data and that would mean needing an async FS layer.

@kripken
Member

kripken commented Sep 16, 2021

Right now the system calls in emscripten proxy to the main ui thread (with pthreads)... that means these calls are already async...

I think that's not accurate. The filesystem operations are proxied synchronously to the main thread. We wait until they finish, so that they complete (write to memory, etc.), and also we need the return value from them.

Any proxying is bad, but synchronous proxying is especially bad, and proxying to the main thread is worse still. The new version should avoid all that in most cases. Most of the code and metadata (directory structure) could live in wasm, so it's accessible from all threads. Actual file data might be present in JS on a particular thread, and proxied to, but we can cache that in memory as well.
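
For anyone following along, the synchronous proxying boils down to a pattern like this sketch (illustrative names, built on SharedArrayBuffer and Atomics):

// A worker blocks on shared memory while another thread performs the I/O
// and posts the result back. Note that Atomics.wait is not allowed on the
// main browser thread, which is part of why proxying there is so costly.
const shared = new Int32Array(new SharedArrayBuffer(8));

function proxiedReadSync(ioThread, request) {
  Atomics.store(shared, 0, 0);               // 0 = pending
  ioThread.postMessage({ request, shared }); // the I/O thread writes the
                                             // result to shared[1], sets
                                             // shared[0] = 1, and notifies
  Atomics.wait(shared, 0, 0);                // block until notified
  return Atomics.load(shared, 1);            // e.g. byte count or errno
}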

@hcldan

hcldan commented Sep 16, 2021

We wait until they finish

I understand the distinction, but I call this async because we could simply wait for a callback instead of waiting for it to finish (the callback sets the result)... in any case, I agree... we don't want to be proxying if we can avoid it, and we don't want to have to run this stuff on the main thread.

idbfs, even in a worker, is going to need some sort of async api, right?
Maybe the fs_next layer should be natively async, and "we wait until they finish" for everything. It would make the code much more portable, no?

@ethanalee
Collaborator Author

As far as requirements go:

  • Do not keep FS data in memory, delegate all IO to mounted filesystems. (eliminate syncfs)

    • FS metadata, might be fine, as I think SF doesn't have a good way to implement file locking, readonly/readwrite, and opening a file more than once.

I actually think that syncfs will need to remain in some form. This is for persistence usage. Think of this as writing to disk in a traditional file system. Persisting data would be on-demand, as requested by the user-level application code.

@kripken
Member

kripken commented Sep 16, 2021

@hcldan

idbfs, even in a worker, is going to need some sort of async api, right?

Oh, I see, sorry for misunderstanding you. Yes, an async operation can happen in the thread that is proxied to. (It won't always be async, if the API it calls is sync, but in general async is necessary to support.)

@hcldan

hcldan commented Sep 16, 2021

This is for persistence usage.

Why, when you can write directly to a file with StorageFoundation? This really only seems necessary because FS is sync and not all mount drivers can be.

Think of this as writing to disk in a traditional file system. Persisting data would be on-demand, as requested by the user-level application code.

I think writing to the filesystem in a traditional file system is a good indication that the data should be persisted. Keeping a copy of the FS in memory is absolutely horrible for our use cases. It's a non-starter.

@ethanalee
Collaborator Author

We wait until they finish

I understand the distinction, but I call this async because we could simply wait for a callback instead of waiting for it to finish (cb sets the result).. in any case, I agree... we don't want to be proxying if we can avoid it, and we don't want to have to run this stuff on the main thread.

idbfs, even in a worker, is going to need some sort of async api, right?
Maybe the fs_next layer should be natively async, and "we wait until they finish" for everything. It would make the code much more portable, no?

FS operations for an async backend would need the data to be present so it can be synchronously read. For example, you want all the data for video game graphics available to be read synchronously instead of waiting asynchronously. However, persisting to a file system could be async, since it does not matter to the user when this action completes.

@hcldan

hcldan commented Sep 16, 2021

FS operations for an async backend would need data to be present to be synchronously read.

Not necessarily; we just discussed an approach currently in use with PTHREAD support to proxy (async) a sync API and wait for the result.

@hcldan

hcldan commented Sep 16, 2021

it does not matter to the user when this action will complete.

Google's guidance for the PWA experience is pretty clear about what they expect (the browser could go away at any time).
I would hope that there isn't any magic behind when data is persisted and when the write call returns.

@hcldan

hcldan commented Sep 17, 2021

@ethanalee

For example, you want all the data for video game graphics available to be read synchronously instead of waiting asynchronously.

Agreed. But that is a problem in the domain of that application, not the FS, I would argue.
I don't object to this technique, I just object to it being implemented here. It causes some rather large problems for applications that deal with a lot of stored data.

@sbc100
Collaborator

sbc100 commented Sep 17, 2021

@ethanalee

For example, you want all the data for video game graphics available to be read synchronously instead of waiting asynchronously.

Agreed. But that is a problem in the domain of that application, not the FS, I would argue.

It's not the application that requires synchronous reads... it's the low-level POSIX API that we are implementing: read and fread are inherently synchronous. Inventing a low-level async file API is not useful for the majority of software in the wild. So IMHO we have a fairly hard requirement for a synchronous, blocking, works-on-the-main-thread API, just like the one we have today.

We do have ASYNCIFY that can come to the rescue, but I'm not sure we can ask all filesystem users to use ASYNCIFY.
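
For context, with ASYNCIFY a JS library function can pause the whole program while awaiting an async API. A minimal sketch (my_fetch_size is a hypothetical name; requires building with -sASYNCIFY and -sASYNCIFY_IMPORTS=my_fetch_size):

// JS library code: a synchronous C caller blocks inside my_fetch_size()
// while the fetch completes, then resumes with the return value.
mergeInto(LibraryManager.library, {
  my_fetch_size: function (pathPtr) {
    return Asyncify.handleAsync(async () => {
      const response = await fetch(UTF8ToString(pathPtr));
      const buf = await response.arrayBuffer();
      return buf.byteLength; // delivered to the blocked C caller
    });
  },
});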

@hcldan

hcldan commented Sep 17, 2021

@sbc100 I understand that it's the low-level POSIX API. I meant that if a game application needs FS data in memory, that's up to the game application (which would likely be multithreaded anyway).

I just want to avoid holding this data in memory... and I'd like to support the existing APIs that have no sync options (idbfs), but if everyone here is unwilling to make them use Asyncify for it, I guess we are at an impasse. Hopefully, with the FS in a worker thread instead, we can proxy calls synchronously and make good use of the SF APIs; that would have much better performance characteristics and leave the idbfs driver seldom used. If we decide to keep syncfs, I really hope we can relegate it to only the FS APIs that are async.

BTW, the IndexedDB spec does say the sync APIs could be brought back if needed. This sounds like it might qualify as a good reason to bring them back to workers.

@tlively
Member

tlively commented Sep 18, 2021

Sync/Async

This is the matrix of the possible sync/async combinations:

  • sync backend + sync user API: make sync backend calls, then return
  • sync backend + async user API: make sync backend calls and immediately schedule the user callback
  • async backend + sync user API: * (see below)
  • async backend + async user API: make async backend calls and schedule the user callback for when they resolve

The only really interesting one is when the user API is synchronous and the backend is asynchronous. Here are the available options in that situation:

  1. Use Asyncify to emulate a synchronous call from the calling code's point of view.
  2. Proxy the asynchronous backend work to another thread and block until it finishes.
  3. Fake it as well as possible on a single thread:
    • For reads: eagerly have all data available in memory to be synchronously read.
    • For writes/flushes: Make the asynchronous backend call and speculatively return success before the operation actually completes.

Am I missing any useful options?

Each of these options has a downside: (1) is too slow for many applications, (2) doesn't work on the main browser thread, and (3) may lead to surprising data loss.
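
To make the data-loss risk in option (3) concrete, its write path would look something like this sketch (hypothetical backend interface):

// Option (3) for writes: start the async backend write but report success
// to the caller immediately. If the page goes away before the promise
// settles, the data is silently lost.
function speculativeWriteSync(backend, path, bytes) {
  backend.writeAsync(path, bytes).catch((err) => {
    console.error('deferred write to ' + path + ' failed:', err);
  });
  return bytes.length; // claims success before the write actually completed
}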

So IMO the best solution is to offer an asynchronous API for users who can rewrite their applications to use it, and also support all these other options where possible so that users can pick the one that works best for their use case otherwise.

Memory residency

Except in the case of option (3) above, I agree that we should make it possible for users to configure the system to not keep entire files in memory by default, but rather have a configurable in-memory (either Wasm memory or JS heap) cache of hot data backed by one of the persistent Web storage backends (whether or not the data is actually persisted across page loads).
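
As a sketch of what such a hot-data cache could look like (illustrative only, not a proposed API):

// Fixed-size LRU cache of file chunks. Misses fall through to the
// persistent backend; evicted chunks are simply dropped, since the backend
// remains the source of truth.
class ChunkCache {
  constructor(maxChunks) {
    this.max = maxChunks;
    this.map = new Map(); // insertion order doubles as recency order
  }
  get(key) {
    if (!this.map.has(key)) return undefined;
    const chunk = this.map.get(key);
    this.map.delete(key);
    this.map.set(key, chunk); // re-insert as most recently used
    return chunk;
  }
  put(key, chunk) {
    if (this.map.has(key)) this.map.delete(key);
    else if (this.map.size >= this.max) {
      this.map.delete(this.map.keys().next().value); // evict least recent
    }
    this.map.set(key, chunk);
  }
}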

@curiousdannii
Contributor

Would making it easier to feed async data into stdin be in scope for the FS rewrite?

@ethanalee
Collaborator Author

ethanalee commented Sep 18, 2021

Except in the case of option (3) above, I agree that we should make it possible for users to configure the system to not keep entire files in memory by default, but rather have a configurable in-memory (either Wasm memory or JS heap) cache of hot data backed by one of the persistent Web storage backends (whether or not the data is actually persisted across page loads).

I think this is an important distinction to make. syncfs is only used when we are persisting to cold storage, which will 1) be optional for users and 2) happen relatively infrequently (for example, when a user wants to save a game level). I also agree that we should have a backing store, which could be in-memory. This would not invoke syncfs and would instead be "automatic" in the sense that segments of files could be lazily loaded when required.

@ethanalee
Collaborator Author

Would making it easier to feed async data into stdin be in scope for the FS rewrite?

I think this requirement is already covered by the above specifications, no?

@curiousdannii
Contributor

curiousdannii commented Sep 18, 2021

Just checking that stdio is included, not just other FSs.
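
For contrast, today's hook is synchronous: Module.stdin is polled one character at a time and has no way to wait for data to arrive, which is what makes async stdin awkward:

// The current synchronous stdin hook: return one char code per call, or
// null for EOF. There is no way to say "wait until input is available".
Module.stdin = (function () {
  const input = 'hello\n';
  let i = 0;
  return function () {
    return i < input.length ? input.charCodeAt(i++) : null;
  };
})();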

@jozefchutka

Hi everyone, just wanted to check on the current status...

According to https://emscripten.org/docs/api_reference/Filesystem-API.html#new-file-system-wasmfs this still seems to be "Work in Progress". Is that the case? Or can it already be used with some limitations? Are there any instructions on how to start, or is it too early?

@kripken
Member

kripken commented Feb 25, 2022

Yes, still a work in progress. You can track PRs as they land (they have [WasmFS] in their titles), and there is a project. Overall, many basic things work, but a lot remains to do, like miscellaneous syscalls (a PR for pipes is open as we speak) and backends (a Node backend is being built right now).

If you don't need specific backends that don't exist yet (most of them, except for Memory files and JS File), and use only common syscalls, then things might work for you. Building with -sWASMFS would be the way to test. But I wouldn't recommend it for production yet.

One use case that might already work well enough, though, is simple file usage with pthreads. That will avoid most of the old FS's proxying overhead. Testing and finding bugs there would be helpful. And in general, contributions are welcome as always - the project linked to before has open issues for things.

@jozefchutka

Hi @kripken, I am about to invest some time into testing this in the next few days. My use case is as follows:

  1. user picks directory with window.showDirectoryPicker()
  2. this FileSystemDirectoryHandle reference is passed into a worker (hopefully it can be passed) where module is initialized
  3. module.FS mounts WASMFS
  4. main is called (ffmpeg in my case) where there are various read/write operations executed directly on host filesystem.

Sounds achievable?

Before the related docs exist, how do I do the actual mounting? Previously I used WORKERFS:

const files:File[] = ...
module.FS.mount(module.WORKERFS, {files}, "mydir");

Now with WASMFS, what's the actual API? Do I just pass my FileSystemDirectoryHandle reference as the second argument, or is it something completely different?

module.FS.mount(module.WASMFS, handle, "mydir");

@tlively
Member

tlively commented Mar 10, 2022

@jozefchutka, I would take a look at the WasmFS node backend to get a sense for how a new WasmFS backend would be structured. Note that the backend interface is not stable and is actively changing, so you'll have to keep up with that for now. Eventually we will have a stable interface and proper documentation on bringing up a new backend as well as utilities to make it easier to do so.

Relevant files:

@curiousdannii
Contributor

curiousdannii commented Mar 10, 2022

I wonder if mapping between external file paths and internal-to-Emscripten file paths could be a common concern?

Previously I mounted one FS folder to a custom FS, but I didn't mount the whole FS as I still wanted to use MemFS for /tmp (and also I don't know if mounting the whole FS works or not.) But the custom FS could be given any arbitrary path, so I had to add my own filename mapping system so that any external path could be converted into a single-level filename within the mounted folder. This was a little bit hacky, and I'm not certain my system worked perfectly. Perhaps it would have been better to just append the external path to the mounted folder? But would that involve creating all the intermediate folders on the fly? I really don't know what the best solution is.
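
For illustration, the kind of reversible flattening I mean looks roughly like this (my own scheme, not something Emscripten provides):

// Flatten an arbitrary external path into a single-level filename inside
// the mounted folder. Escaping '%' first makes the mapping reversible.
function externalToMounted(externalPath) {
  return externalPath.replace(/%/g, '%25').replace(/\//g, '%2F');
}
function mountedToExternal(name) {
  return name.replace(/%2F/g, '/').replace(/%25/g, '%');
}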

If anyone else had the same need, and if Emscripten could offer its own utility, then I'd definitely want to switch to it.

@tlively
Member

tlively commented Mar 10, 2022

I wonder if mapping between external file paths and internal-to-Emscripten file paths could be a common concern?

Yep, I definitely expect that to be common. I'm currently building out the Node backend (linked above) that follows this pattern, but once that works I expect to extract the structure into a common utility that could be used for accessing remote files via other APIs like WASI or XHR.

@goldwaving
Contributor

I'm working on an audio editing app that frequently works with 50MB+ files, but I have been hitting all kinds of memory limits on mobile devices. If I understand WasmFS correctly, if I implement an idbfs_backend module and link with -sWASMFS, that will replace the in-memory file system and C/C++ functions will use the new file system (avoiding memory). Is that correct?

@tlively
Member

tlively commented Mar 23, 2022

Yes, that's the idea. However I expect that you wouldn't need to write the backend yourself - we'll implement an indexed db backend as we bring WasmFS up to feature parity with the current file system.

@goldwaving
Contributor

goldwaving commented Mar 29, 2022

I've created an IDBFS backend and am running tests, but noticed some odd behaviour. Given the following test code:

backend_t backend = wasmfs_create_idbfs_backend();

// Create a new backend file under root.
int fd = wasmfs_create_file("/testfile", 0777, backend);

This creates an IDBFS file, but appears to use the memory backend to insertChild instead of the IDBFS backend. The same problem occurs with wasmfs_create_directory. The design always seems to assume that the root (parent) backend is the memory backend and so new files and directories are not inserted in the correct backend. I have to create a '/first' directory and then a 'first/second' directory for the correct backend to be used. However the IDBFS backend only has the 'second' directory and not the 'first' directory.

@kripken
Member

kripken commented Mar 29, 2022

We should add an option to change the root directory, which would allow setting its backend - just no one has gotten around to it, I think. I'm also not sure offhand what that API should look like. (A PR would be welcome.)

This creates an IDBFS file, but appears to use the memory backend to insertChild instead of the IDBFS backend.

I'm not sure I understand, but that sounds like the expected behavior? The root is a MemoryBackend (until we add an API to allow other stuff) so we call insertChild on that. But that file can have a different backend, and file operations on that file will use its backend. Maybe I missed something in your example though?

@goldwaving
Contributor

If IDBFS children are inserted into the memory backend file system, then when the app closes and the memory file system is gone, all the IDBFS children are lost because they were never inserted in a permanent storage directory. There is no way to recover the files (their filenames, inodes, blocks, etc.). This also causes lost inodes and blocks in permanent storage.

I'm not sure I understand the overall vision of the new file system. Are all the backends supposed to exist off of the memory backend as directories, such as /node, /idbfs, etc.? Or are they intended to be completely independent? Or both?

Inserting a child from one backend into a different backend probably should not be allowed. Imagine inserting a memory backend file into an IDBFS backend and restarting the app. The file data will not exist, but the directory entry will. Backends may have to be treated like separate devices and files copied between them.

@tlively
Member

tlively commented Mar 29, 2022

Right now the expectation is that arbitrary backends will be able to be mounted under the root in-memory backend, but otherwise backends should not be mounted on each other. It is also expected that applications will mount their persistent backends on startup to "discover" the previously-written data rather than having them be mounted automatically.

I am contemplating a change that would allow backends to be arbitrarily mounted onto each other, although information about those mount points would be in-memory only and would not be visible to the backend implementations. That avoids the dangling directory entry problem you mentioned, but means that applications would be responsible for re-mounting all backends on every run.

Given that capability, we could probably provide a weak function definition for creating the root backend. Users would be able to provide alternative definitions that return a different backend to be used as the root backend. That would be simpler and more flexible than providing e.g. new command line options for choosing the root backend.

@tlively
Member

tlively commented Nov 23, 2022

Closing this in favor of tracking progress on https://github.com/orgs/emscripten-core/projects/1.

@tlively tlively closed this as completed Nov 23, 2022
@westurner

FWIW, from isomorphic-git:

If you're using isomorphic-git in the browser, you'll need something that emulates the fs API. The easiest to setup and most performant library is LightningFS which is written and maintained by the same author and is part of the isomorphic-git suite. If LightningFS doesn't meet your requirements, isomorphic-git should also work with BrowserFS and Filer. Instead of isomorphic-git/http/node this time import isomorphic-git/http/web


@patrickcorrigan

Hi @kripken, I am about to invest some time into testing this in the next few days. My use case is as follows:

  1. user picks directory with window.showDirectoryPicker()
  2. this FileSystemDirectoryHandle reference is passed into a worker (hopefully it can be passed) where module is initialized
  3. module.FS mounts WASMFS
  4. main is called (ffmpeg in my case) where there are various read/write operations executed directly on host filesystem.

Sounds achievable?

Before the related docs exist, how do I do the actual mounting? Previously I used WORKERFS:

const files:File[] = ...
module.FS.mount(module.WORKERFS, {files}, "mydir");

Now with WASMFS, what's the actual API? Do I just pass my FileSystemDirectoryHandle reference as the second argument, or is it something completely different?

module.FS.mount(module.WASMFS, handle, "mydir");

Did you find out a WASMFS solution for this? :)

@Jaifroid

@patrickcorrigan, the JS API for WasmFS is still marked as "todo", so I'm not sure it's usable yet. See #15976. However, over at Kiwix PWA, we use WORKERFS in combination with the File System API, and it works very well even with files of 97GB. Note, though, that we don't pass a whole directory to the FS; we only pass one file at a time, which seems different from your use case.

@patrickcorrigan

patrickcorrigan commented Oct 26, 2023

Thanks @Jaifroid. One file is all I need too :) I have been using MEMFS so far and it works great, but I run into issues with files of about 300-400 MB on Safari iOS. I was looking into reading directly from the OPFS and thought I'd try WASMFS. I will use WORKERFS instead. Thank you for letting me know, I really appreciate it :) I spent about an hour yesterday messing around with WASMFS and exploring the FS object it exposes but could not figure out a way to do it.

@Jaifroid

Yes, MEMFS simulates a file system in memory, so you'll hit memory issues as soon as you go over the device's memory allocation. Since we work routinely with ZIM archives larger than 1GB, and often much larger, that was a non-starter for us.

You have to be able to receive messages, and the file as a transferable object, via postMessage in the Worker JS that loads the WASM, which means you probably have to build your WASM with that code, using prejs_ and postjs_. See ours here (in the source repo for our ASM/WASM): https://github.com/openzim/javascript-libzim/blob/main/prejs_file_api.js. It's a slight extra complication to use a Worker, but it also means you don't have to worry about async file operations.
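
In outline, the handoff looks like this (illustrative names; a File is structured-cloneable, so postMessage can hand it to the worker):

// Main thread: pick a file and send it to the worker that runs the WASM.
const worker = new Worker('my-wasm-worker.js'); // hypothetical worker script
fileInput.addEventListener('change', (event) => {
  worker.postMessage({ action: 'init', file: event.target.files[0] });
});

// Worker side (e.g. in the pre-js): receive the File, mount with WORKERFS.
onmessage = (msg) => {
  if (msg.data.action === 'init') {
    FS.mkdir('/work');
    FS.mount(WORKERFS, { files: [msg.data.file] }, '/work');
  }
};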

@jozefchutka

@patrickcorrigan for the time being I am still using WORKERFS and sending Blobs and Files to and from the Worker (using stdout to build the output Blob). The benefit is that file size is effectively unlimited; the downside is that it's read-only.

@patrickcorrigan

patrickcorrigan commented Oct 26, 2023

@jozefchutka Read-only is all I need. The only problem is that my app draws to a canvas, and I can't get it to run with proxy_to_worker or proxy_to_pthread. So I think I'm stuck for the moment.
