File Priority hints on read #56

Closed
noell opened this issue May 21, 2019 · 19 comments

@noell

noell commented May 21, 2019

An app might send requests for lots of files, each of which requires resolving the file handle (or underlying URL) to get its fileEntry, ...

Some files are more important than others. How does this new file API propose to deal with that? For example, if I were using fetch(), I could add headers as a priority hint. How would one do this with the new file system API?

@pwnall
Collaborator

pwnall commented May 24, 2019

Thanks for the suggestion!

This API's mission is to facilitate exchanging data with native applications. I think that performance should be a secondary concern. For high-performance I/O, I recommend using sandboxed APIs like Cache Storage or IndexedDB. The data stored using these APIs isn't observable by other native applications, so it's easier to produce high-performance implementations.

@noell
Author

noell commented May 24, 2019

Yes, we use IndexedDB already, but that's not the problem. In our app, the ChromeOS file manager, we send FileSystem requests for file metadata for the contents of the user's current directory, so we can display it (the file name, type, last modified time, etc., for each file).

While those requests are happening, the user is free to decide to navigate to some new directory, one that might contain files for which we need to fetch the entire file content and transform that content for display in the app UI.

For example, the new directory might contain RAW image files, and we now need to issue new FileSystem requests to fetch the content of the files, create thumbnail images of the RAW "image/tiff" files, cache the results in IndexedDB (for the obvious reasons), and present the thumbnails in the displayed UI of the user's new directory.

Whatever work has been requested via the FileSystem API prior to this point is now far less important than fetching the RAW image content and transforming it for display, because that work is both costly in terms of time and UI-affecting. If existing requests get serviced before these UI-affecting requests, and delay them, well oh dear - the file system results we want right now get delayed behind pre-existing, lower-priority work.

Result: the thumbnails get drawn with noticeable lag in the UI [1]. Looks like a tragedy of the commons, as all FileSystem requests appear to have equal priority.

At least with HTTP fetch(), I can add priority hints via headers. But I cannot do the same (provide a hint) with the FileSystem API. Maybe we should just use fetch(), but it's a lot slower (in Chrome) than FileSystem fetching for the same content [2]. I note that fetch() is much smoother - it provides more consistent fetch times - perhaps because of HTTP/2. FileSystem has much faster fetch times in comparison; however, they are also less smooth and less consistent (aka "jittery").

The new FileSystem API is implemented in terms of async/await, rather than Promises. If it were Promises, I might be able to maintain a queue of outstanding FileSystem request Promises, and call Promise.reject() on the lower-priority ones, to make way for the high-priority ones. Not sure that's possible with await [edit: oh I see it is possible - don't call await foo() on an async foo(); instead call foo() and enqueue the returned Promise].

[1] https://bugs.chromium.org/p/chromium/issues/detail?id=904630#c21
[2] https://bugs.chromium.org/p/chromium/issues/detail?id=904630#c16
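
The queueing idea in the comment above could be sketched roughly as follows. This is only an illustration, not part of the proposed API: `PriorityTaskQueue`, `enqueue` and `dropBelow` are hypothetical names, and the sketch assumes that work which has not started yet can simply be rejected, while work already in flight is left to finish.

```javascript
// Hypothetical helper (not part of any file system API): runs queued async
// tasks with bounded concurrency, always starting the highest priority first.
class PriorityTaskQueue {
  constructor(concurrency = 1) {
    this.concurrency = concurrency;
    this.running = 0;
    this.pending = []; // entries: { task, priority, resolve, reject }
  }

  // `task` is a function returning a Promise; a larger `priority` runs sooner.
  enqueue(task, priority = 0) {
    return new Promise((resolve, reject) => {
      this.pending.push({ task, priority, resolve, reject });
      // Keep the queue sorted so the highest priority sits at the front.
      this.pending.sort((a, b) => b.priority - a.priority);
      this._drain();
    });
  }

  // Reject every task that has not started yet and is below `minPriority`.
  dropBelow(minPriority) {
    const kept = [];
    for (const entry of this.pending) {
      if (entry.priority < minPriority) {
        entry.reject(new Error('dropped'));
      } else {
        kept.push(entry);
      }
    }
    this.pending = kept;
  }

  // Start pending tasks while there is spare capacity.
  _drain() {
    while (this.running < this.concurrency && this.pending.length > 0) {
      const { task, resolve, reject } = this.pending.shift();
      this.running++;
      Promise.resolve()
        .then(task)
        .then(resolve, reject)
        .finally(() => {
          this.running--;
          this._drain();
        });
    }
  }
}
```

With something like this, metadata requests could be enqueued at a low priority and thumbnail reads at a higher one, with `dropBelow()` called when the user navigates away. Whether that helps in practice depends on how the browser schedules the underlying I/O, which this sketch does not control.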

@pwnall
Collaborator

pwnall commented May 24, 2019

@noell Thank you for explaining your use case! I now see how these concerns can play out in any IDE that has to handle large projects.

@taralx

taralx commented May 26, 2019

That raises the question for me: should this API support AbortSignals like fetch does?

@noell
Author

noell commented May 27, 2019

Indeed, interesting question. The old FileSystem API was modelled on XHR -- that's all there was at the time. Maybe the new FileSystem API is modelled that way too (dunno), but perhaps it should be looking to be a more fetch-like API. And that makes me wonder why it's not just an addendum to the fetch specs.

@pwnall
Collaborator

pwnall commented May 27, 2019

Can you please elaborate on how you'd see this integrated with fetch?

When I read your comment above, I assumed you'd use fetch with file system URLs. However, we think that file system URLs aren't a good idea for the new API, and won't be bringing them over from the old API. The reason behind our thinking is that URLs don't mesh with the new permission model. This new permission model is bound to be more complex, because the new API mediates access to the user's data, whereas the old API only granted access to a per-origin sandbox.

@taralx

taralx commented May 27, 2019

I feel like the nature of the resource locator isn't really germane -- fetch API uses URLs, but it could easily support opaque FileEntry objects or something like that. But the mechanics of "read this data" are not different between network fetch and local fetch -- you still want prioritization, cancellation, and streaming.

@mkruisselbrink
Contributor

You can read blobs using streams (https://w3c.github.io/FileAPI/#stream-method-algo), and streams provide back pressure and cancellation, so it seems like you should be able to implement your own prioritization on top of that? It's not clear to me what more functionality you would need.
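
As a rough sketch of what that could look like, the function below reads a Blob through its stream and cancels the stream once an AbortSignal fires. The function name `readUntilAborted` and the `onChunk` callback are made up for illustration; only `Blob.stream()`, `getReader()`, `read()`, `cancel()` and `releaseLock()` are standard calls.

```javascript
// Reads `blob` chunk by chunk until it is exhausted or `signal` aborts.
// Resolves with the number of bytes read before stopping.
async function readUntilAborted(blob, signal, onChunk) {
  const reader = blob.stream().getReader();
  let bytesRead = 0;
  try {
    while (true) {
      if (signal.aborted) {
        // Tell the stream that the remaining data is no longer wanted.
        await reader.cancel('superseded by higher-priority work');
        break;
      }
      const { done, value } = await reader.read();
      if (done) break;
      bytesRead += value.byteLength;
      await onChunk(value);
    }
  } finally {
    reader.releaseLock();
  }
  return bytesRead;
}
```

Because the signal is checked between chunks, a caller can abort a low-priority read when more urgent work arrives. Chunk sizes are chosen by the implementation, so how quickly a cancellation takes effect is not something this sketch can guarantee.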

@pwnall
Collaborator

pwnall commented May 31, 2019

Maybe it'd make sense to make directory listing take in an AbortSignal.

@mkruisselbrink
Contributor

Directory listing as currently spec'ed uses an async iterator, and async iterators do have a "return" method you can use to abort the iteration (or if you're doing a for await (foo of bar) you can just break to cancel the iteration). So not sure what benefit having an AbortSignal would give you?
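
A small illustration of that behaviour, using a stand-in async generator rather than a real directory handle (`listEntries` and `firstMatching` are hypothetical names; a real directory handle would supply its own async iterator):

```javascript
// Stand-in for a directory listing that yields entry names asynchronously.
async function* listEntries(names, log) {
  try {
    for (const name of names) {
      await null; // simulate asynchronous work per entry
      yield name;
    }
  } finally {
    // Runs whenever iteration ends, including when the consumer leaves the
    // loop early, because break/return call the iterator's return() method.
    log.push('listing stopped');
  }
}

// Returns the first name accepted by `predicate`, abandoning the rest.
async function firstMatching(names, predicate, log) {
  for await (const name of listEntries(names, log)) {
    if (predicate(name)) return name; // leaves the loop and ends the iteration
  }
  return undefined;
}
```

Leaving the loop early means the remaining entries are never requested, which is the cancellation point the comment describes.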

@guest271314

@pwnall

> When I read your comment above, I assumed you'd use fetch with file system URLs. However, we think that file system URLs aren't a good idea for the new API, and won't be bringing them over from the old API.

Note that it is not possible to use the "filesystem" URL scheme with fetch(); see web-platform-tests/wpt#15525.

@guest271314

Using .map() and Promise.all() without async/await will process requests in parallel and result in the output array being in the same order as input array, even if the entry at input[input.length -1] is completed first.
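
A minimal sketch of that ordering property, with timers standing in for file reads (`delay` and `readAll` are hypothetical helpers, not part of any file API):

```javascript
// Resolves with `value` after `ms` milliseconds.
const delay = (ms, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

// Starts every "read" at once; records the order in which they complete.
function readAll(inputs, completionLog) {
  return Promise.all(
    inputs.map(({ name, ms }) =>
      delay(ms, name.toUpperCase()).then((value) => {
        completionLog.push(name);
        return value;
      })
    )
  );
}
```

Given a slow first input and a fast second one, `completionLog` records the fast one first, while the array that `readAll` resolves with still follows the input order.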

To process the entries in a specific order, starting a procedure whose goal is input -> output order while keeping the ability to return, break, throw, restart, etc., you can compose the file entry/directory reading code as an async iterable which can be suspended or "interrupted" and restarted at any time.

E.g. (https://gist.github.com/guest271314/74d3a2aa765330163eecc46e16a52acd),

while (!done) {
  // await next(readFile)
}

or 

await next(readFile, readDirectory /* "recursively", or a non-terminating procedure that happens to refer to itself */, readFile);
await next(readDirectory, ...readDirectories)

done = true;

await done();

@guest271314

@pwnall XMLHttpRequest() can still be utilized to fetch a resource using "filesystem" URL scheme (this fact should not be interpreted as a PR to "fix" that functionality).

@pwnall
Collaborator

pwnall commented Jun 2, 2019

filesystem: URLs are generated by the File Directories and System API which is separate from this proposal. Let's keep the issues in this repository focused on the Native File System API.

As a Chrome developer, I welcome your feedback on Chrome's implementation of the File Directories and System API in the Chrome bug tracker. Our long-term plans there are to deprecate and remove the API. We are not currently pursuing that deprecation due to competing priorities. I don't know if we can remove XHR support for filesystem: URLs without breaking the Web, but we're definitely not opposed to the idea.

@guest271314

@pwnall Was only replying because you mentioned filesystem URL scheme and fetch() where AFAIK it has not ever been possible to successfully fetch a filesystem URL using fetch().

> I don't know if we can remove XHR support for filesystem: URLs without breaking the Web, but we're definitely not opposed to the idea.

Do not remove support for fetching a resource using filesystem URL scheme and XMLHttpRequest(). That would be going backwards where at this proposed API the goal is unsandboxed access to the user filesystem. Why remove support for one and add support for the same functionality? The idea of deprecating Filesystem API solves nothing. Let it be. If you can get this API functioning, and functioning securely and in some way differently than using webkitRequestFilesystem, hooray!

@pwnall
Collaborator

pwnall commented Jun 4, 2019

@guest271314 Thank you for the kind words! Let's keep this repository focused on the Native File System API, and keep this issue focused on @noell's use case.

@mkruisselbrink
Contributor

I'm not sure there is much to do here. As said, directory iteration is already abortable (in the spec). For reading actual file data you can just stop reading data and/or explicitly cancel the ReadableStream (if using streams).

I guess one thing that might be problematic is if you're trying to "stat" a large number of files by calling filehandle.getFile() on them all at the same time, and decide before all of those resolve that you're no longer interested in them. You could of course work around that by limiting the number of outstanding promises yourself, but it might be hard to decide how many to leave outstanding to still have the overall operation complete as soon as possible?

So while there does seem to be some potential improvement in developer ergonomics in being able to pass an AbortSignal to filehandle.getFile(), that also seems like something that isn't blocking and can be added in the future if there is a clear need for it.
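
The workaround of limiting outstanding promises could look roughly like the sketch below. `mapWithLimit` is a hypothetical helper, and the AbortSignal here only stops new calls from being started; it cannot cancel a `getFile()` call that is already in flight, which is the gap the comment describes.

```javascript
// Hypothetical helper: maps `items` through async `fn`, keeping at most
// `limit` calls in flight. Once `signal` aborts, no further calls are
// started; calls that have already started still run to completion.
async function mapWithLimit(items, limit, fn, signal) {
  const results = new Array(items.length);
  let next = 0;

  // Each worker repeatedly claims the next unprocessed index.
  async function worker() {
    while (next < items.length && !signal.aborted) {
      const index = next++;
      results[index] = await fn(items[index], index);
    }
  }

  const workers = Array.from({ length: Math.min(limit, items.length) }, worker);
  await Promise.all(workers);
  return results; // entries never started remain undefined
}
```

A call such as `mapWithLimit(handles, 8, (handle) => handle.getFile(), controller.signal)` would then keep eight requests outstanding; the value 8 is arbitrary, and picking a good limit is exactly the difficulty raised above.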

@mkruisselbrink mkruisselbrink added this to the Future milestone Apr 16, 2020
@jimmywarting

jimmywarting commented May 28, 2022

I just learned today that calling webkitRequestFileSystem(TEMPORARY, ...args) will give you the same bucket as navigator.storage.getDirectory() does... 😳

As a result, you will be able to use filesystem:http://localhost/temporary/:path URLs.
You can add these to img, video, audio, and worker URLs and have them work flawlessly with files you have added via navigator.storage.getDirectory.

You can even make XHR requests to these URLs:

url = 'filesystem:http://localhost:4444/temporary/arkiv.zip'
xhr = new XMLHttpRequest()
xhr.responseType = 'blob'
xhr.open('GET', url)
xhr.send()

Weirdly, you can't use the fetch API... 😞


Is it possible to also get into the persistent directory as well?
Like: navigator.storage.getDirectory({ persistent: true }) would give the same bucket as webkitRequestFileSystem(PERSISTENT, ...args)

@a-sully
Collaborator

a-sully commented May 31, 2022

> Is it possible to also get into the persistent directory as well?

We're in the process of deprecating the PERSISTENT keyword. I would recommend staying away from it. See this doc

@dslee414 dslee414 closed this as not planned Jan 18, 2023