File Priority hints on read #56

noell · 2019-05-21T06:09:56Z

An app might send requests for lots of files, which requires resolving the file handles (or underlying URLs) to get their fileEntries, ...

Some files are more important than others. How does this new file API propose to deal with that? For example, if I were using fetch(), I could add headers as a priority hint. How would one do this with the new file system API?

pwnall · 2019-05-24T10:34:11Z

Thanks for the suggestion!

This API's mission is to facilitate exchanging data with native applications. I think that performance should be a secondary concern. For high-performance I/O, I recommend using sandboxed APIs like Cache Storage or IndexedDB. The data stored using these APIs isn't observable by other native applications, so it's easier to produce high-performance implementations.

noell · 2019-05-24T13:14:30Z

Yes, we use IndexedDB already, but that's not the problem. In our app, the ChromeOS file manager, we send FileSystem requests for file metadata for the contents of the user's current directory, so we can display it (the file name, type, last modified time, etc., for each file).

While those requests are happening, the user is free to decide to navigate to some new directory, one that might contain files for which we need to fetch the entire file content and transform that content for display in the app UI.

For example, the new directory might contain RAW image files, and we now need to issue new FileSystem requests to fetch the content of the files, create thumbnail images of the RAW "image/tiff" files, cache the results in IndexedDB (for the obvious reasons), and present the thumbnails in the displayed UI of the user's new directory.

Whatever work has been requested via the FileSystem API prior to this point is now far less important than fetching the RAW image content and transforming it for display, because that work is both costly in terms of time and UI-affecting. If existing requests get serviced before these UI-affecting requests, and delay them, well oh dear: the file system results we want right now get stuck behind pre-existing, lower-priority work.

Result: the thumbnails get drawn with noticeable lag in the UI [1]. It looks like a tragedy of the commons, as all FileSystem requests appear to have equal priority.

At least with HTTP fetch(), I can add priority hints via headers. But I cannot do the same (provide a hint) with the FileSystem API. Maybe we should just use fetch(), but it's a lot slower (in Chrome) than FileSystem fetching for the same content [2]. I note that fetch() is much smoother, with more consistent fetch times, perhaps because of HTTP/2. FileSystem has much faster fetch times in comparison; however, they are also less smooth and less consistent (aka "jittery").

The new FileSystem API is implemented in terms of async/await, rather than Promises. If it were Promises, I might be able to maintain a queue of outstanding FileSystem request Promises, and call Promise.reject() on the lower-priority ones, to make way for the high-priority ones. Not sure that's possible with await [edit: oh I see it is possible - don't await the call to an async function foo(); instead call foo() and enqueue the returned Promise].
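
A minimal sketch of the queueing idea above, assuming a hypothetical `PriorityRequestQueue` class that holds requests until it starts them; the class name, the `limit` parameter and the numeric priorities are illustrative and not part of any file system API. Note that `Promise.reject()` creates a new rejected Promise and cannot reject one that is already outstanding, so this sketch instead delays starting low-priority work so that a later high-priority request can jump ahead of it.

```javascript
// Hypothetical helper: runs at most `limit` tasks at once and always starts
// the highest-priority queued task next. Not part of any file system API.
class PriorityRequestQueue {
  constructor(limit = 2) {
    this.limit = limit;
    this.active = 0;
    this.pending = []; // entries: { priority, task, resolve, reject }
  }

  // `task` is a function returning a Promise, e.g. () => handle.getFile().
  enqueue(task, priority = 0) {
    return new Promise((resolve, reject) => {
      this.pending.push({ priority, task, resolve, reject });
      this.drain();
    });
  }

  drain() {
    while (this.active < this.limit && this.pending.length > 0) {
      // Pick the entry with the highest priority (earliest wins on ties).
      let best = 0;
      for (let i = 1; i < this.pending.length; i++) {
        if (this.pending[i].priority > this.pending[best].priority) best = i;
      }
      const [entry] = this.pending.splice(best, 1);
      this.active++;
      Promise.resolve()
        .then(() => entry.task())
        .then(entry.resolve, entry.reject)
        .finally(() => {
          this.active--;
          this.drain();
        });
    }
  }
}
```

Work that has already started is not interrupted here; only the order in which queued requests start is affected.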

[1] https://bugs.chromium.org/p/chromium/issues/detail?id=904630#c21
[2] https://bugs.chromium.org/p/chromium/issues/detail?id=904630#c16

pwnall · 2019-05-24T18:59:48Z

@noell Thank you for explaining your use case! I now see how these concerns can play out in any IDE that has to handle large projects.

taralx · 2019-05-26T21:13:25Z

That raises the question for me: should this API support AbortSignals like fetch does?

noell · 2019-05-27T03:26:59Z

Indeed, interesting question. The old FileSystem API was modelled on XHR -- that's all there was at the time. Maybe the new FileSystem API is modelled that way too (dunno), but perhaps it should be looking to be a more fetch-like API. And that makes me wonder why it's not just an addendum to the fetch specs.

pwnall · 2019-05-27T21:00:41Z

Can you please elaborate on how you'd see this integrated with fetch?

When I read your comment above, I assumed you'd use fetch with file system URLs. However, we think that file system URLs aren't a good idea for the new API, and won't be bringing them over from the old API. The reason behind our thinking is that URLs don't mesh with the new permission model. This new permission model is bound to be more complex, because the new API mediates access to the user's data, whereas the old API only granted access to a per-origin sandbox.

taralx · 2019-05-27T21:24:32Z

I feel like the nature of the resource locator isn't really germane -- fetch API uses URLs, but it could easily support opaque FileEntry objects or something like that. But the mechanics of "read this data" are not different between network fetch and local fetch -- you still want prioritization, cancellation, and streaming.

mkruisselbrink · 2019-05-28T16:46:44Z

You can read blobs using streams (https://w3c.github.io/FileAPI/#stream-method-algo), and streams provide back pressure and cancellation, so it seems like you should be able to implement your own prioritization on top of that? It's not clear to me what more functionality you would need.
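
As a rough illustration of the stream-based approach, here is a sketch that reads a Blob chunk by chunk and cancels the stream once the caller loses interest. The helper name `readUntilStopped` and the `shouldStop` callback are assumptions made for this example; only `Blob.stream()`, `getReader()`, `read()`, `cancel()` and `releaseLock()` are standard calls.

```javascript
// Reads a Blob (a File is a Blob) chunk by chunk and stops early when
// `shouldStop()` returns true. `readUntilStopped` is a hypothetical helper.
async function readUntilStopped(blob, shouldStop) {
  const reader = blob.stream().getReader();
  const chunks = [];
  try {
    while (true) {
      if (shouldStop()) {
        // Tell the underlying source that no more data is wanted.
        await reader.cancel('superseded by higher-priority work');
        return { chunks, cancelled: true };
      }
      const { done, value } = await reader.read();
      if (done) return { chunks, cancelled: false };
      chunks.push(value); // value is a Uint8Array
    }
  } finally {
    reader.releaseLock();
  }
}
```

An application could flip the condition behind `shouldStop` when the user navigates away, so that a low-priority read stops pulling data.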

pwnall · 2019-05-31T08:06:52Z

Maybe it'd make sense to make directory listing take in an AbortSignal.

mkruisselbrink · 2019-05-31T16:48:11Z

Directory listing as currently spec'ed uses an async iterator, and async iterators do have a "return" method you can use to abort the iteration (or if you're doing a for await (foo of bar) you can just break to cancel the iteration). So not sure what benefit having an AbortSignal would give you?
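
To make the point about async iterators concrete, here is a small sketch in which an async generator stands in for a directory listing. `fakeEntries`, `firstMatching` and the `log` array are made up for illustration; a real directory handle would supply its own async iterator.

```javascript
// Stand-in for a directory listing; a real handle would supply the iterator.
// The `log` array only records what happened, for illustration.
async function* fakeEntries(names, log) {
  try {
    for (const name of names) {
      log.push(name);
      yield name;
    }
  } finally {
    // Runs whenever iteration ends, including when the consumer leaves the
    // loop early, because that calls the iterator's return() method.
    log.push('iteration closed');
  }
}

async function firstMatching(entries, predicate) {
  for await (const name of entries) {
    if (predicate(name)) return name; // leaving the loop closes the iterator
  }
  return undefined;
}
```

Calling `firstMatching(fakeEntries(['a.txt', 'b.raw', 'c.txt'], log), (n) => n.endsWith('.raw'))` stops after `b.raw`, and `c.txt` is never produced.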

guest271314 · 2019-06-02T18:19:12Z

@pwnall

When I read your comment above, I assumed you'd use fetch with file system URLs. However, we think that file system URLs aren't a good idea for the new API, and won't be bringing them over from the old API.

Note, it is not possible to use a "filesystem" URL scheme with fetch(); see web-platform-tests/wpt#15525.

guest271314 · 2019-06-02T18:32:11Z

Using .map() and Promise.all() without async/await will process requests in parallel and result in the output array being in the same order as input array, even if the entry at input[input.length -1] is completed first.
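
A short sketch of that ordering behaviour, assuming a made-up `readOne` function that stands in for any Promise-returning read; the delays are arbitrary and chosen so that the last input finishes first.

```javascript
// `readOne` stands in for any Promise-returning read; the delays are made up
// so that the last input finishes first.
function readOne(name, delayMs, completionLog) {
  return new Promise((resolve) => {
    setTimeout(() => {
      completionLog.push(name); // records the order in which reads finish
      resolve(`contents of ${name}`);
    }, delayMs);
  });
}

async function readAll(inputs, completionLog) {
  // All reads start immediately; Promise.all keeps results in input order.
  return Promise.all(
    inputs.map(({ name, delayMs }) => readOne(name, delayMs, completionLog))
  );
}
```

The returned array follows the order of `inputs`, while `completionLog` shows the order in which the reads actually completed.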

To process the entries in a specific order, starting a procedure whose goal is to process in input -> output order while keeping the ability to return, break, throw, restart, etc., you can compose the file entry/directory reading code as an async iterable which can be suspended or "interrupted" and restarted at any time.

E.g. (https://gist.github.com/guest271314/74d3a2aa765330163eecc46e16a52acd),

while (!done) {
  // await next(readFile)
}

or 

await next(readFile, readDirectory /* "recursively", or a non-terminating procedure that happens to refer to itself */, readFile);
await next(readDirectory, ...readDirectories)

done = true;

await done();

guest271314 · 2019-06-02T20:19:07Z

@pwnall XMLHttpRequest() can still be utilized to fetch a resource using "filesystem" URL scheme (this fact should not be interpreted as a PR to "fix" that functionality).

pwnall · 2019-06-02T21:19:32Z

filesystem: URLs are generated by the File Directories and System API which is separate from this proposal. Let's keep the issues in this repository focused on the Native File System API.

As a Chrome developer, I welcome your feedback on Chrome's implementation of the File Directories and System API in the Chrome bug tracker. Our long-term plans there are to deprecate and remove the API. We are not currently pursuing that deprecation due to competing priorities. I don't know if we can remove XHR support for filesystem: URLs without breaking the Web, but we're definitely not opposed to the idea.

guest271314 · 2019-06-03T04:15:41Z

@pwnall Was only replying because you mentioned filesystem URL scheme and fetch() where AFAIK it has not ever been possible to successfully fetch a filesystem URL using fetch().

I don't know if we can remove XHR support for filesystem: URLs without breaking the Web, but we're definitely not opposed to the idea.

Do not remove support for fetching a resource using the filesystem URL scheme and XMLHttpRequest(). That would be going backwards when the goal of this proposed API is unsandboxed access to the user filesystem. Why remove support for one and add support for the same functionality? The idea of deprecating the Filesystem API solves nothing. Let it be. If you can get this API functioning, and functioning securely and in some way differently than using webkitRequestFileSystem, hooray!

pwnall · 2019-06-04T07:27:19Z

@guest271314 Thank you for the kind words! Let's keep this repository focused on the Native File System API, and keep this issue focused on @noell's use case.

mkruisselbrink · 2020-04-16T21:58:38Z

I'm not sure there is much to do here. As said, directory iteration is already abortable (in the spec). For reading actual file data you can just stop reading data and/or explicitly cancel the ReadableStream (if using streams).

I guess one thing that might be problematic is if you're trying to "stat" a large number of files by calling filehandle.getFile() on them all at the same time, and decide before all of those resolve that you're no longer interested in them. You could of course work around that by limiting the number of outstanding promises yourself, but it might be hard to decide how many to leave outstanding to still have the overall operation complete as soon as possible?
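
The workaround of limiting the number of outstanding promises could look roughly like the sketch below. `mapWithLimit` is a hypothetical helper, not an existing API; it uses an `AbortSignal` only to decide whether to start more work, and calls that are already in flight still run to completion.

```javascript
// Hypothetical helper: calls `worker(item)` for every item, with at most
// `limit` calls outstanding, and stops starting new ones once `signal` aborts.
async function mapWithLimit(items, limit, worker, signal) {
  const results = new Array(items.length);
  let next = 0; // index of the next item that has not been started yet

  async function runLane() {
    while (next < items.length && !(signal && signal.aborted)) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const lanes = [];
  for (let i = 0; i < Math.min(limit, items.length); i++) lanes.push(runLane());
  await Promise.all(lanes);
  return { results, started: next };
}
```

A call might look like `mapWithLimit(handles, 8, (h) => h.getFile(), controller.signal)`, where `handles`, `controller` and the limit of 8 are placeholders; as the comment above points out, choosing a good limit is the hard part.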

So while there does seem to be some potential improvement in developer ergonomics in being able to pass an AbortSignal to filehandle.getFile(), that also seems like something that isn't blocking and can be added in the future if there is a clear need for it.

jimmywarting · 2022-05-28T21:37:57Z

I just learned today that calling webkitRequestFileSystem(TEMPORARY, ...args) will give you the same bucket as navigator.storage.getDirectory() does... 😳

As a result, you will be able to use filesystem:http://localhost/temporary/:path URLs.
You can use these as URLs for img, video, audio, and workers, and they work flawlessly with files you have added via navigator.storage.getDirectory().

You can even make XHR requests to these URLs:

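const url = 'filesystem:http://localhost:4444/temporary/arkiv.zip'
const xhr = new XMLHttpRequest()
xhr.open('GET', url)
xhr.responseType = 'blob' // xhr.response will be a Blob once loaded
xhr.onload = () => console.log(xhr.response)
xhr.send()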

Weirdly, you can't use the fetch API... 😞

Is it possible to also get into the persistent directory as well?
Like: navigator.storage.getDirectory({ persistent: true }) would give the same bucket as webkitRequestFileSystem(PERSISTENT, ...args)

a-sully · 2022-05-31T22:44:30Z

Is it possible to also get into the persistent directory as well?

We're in the process of deprecating the PERSISTENT keyword. I would recommend staying away from it. See this doc

This was referenced Jun 1, 2019

High-performance read operations (especially for large numbers of files) #57

Closed

fetch() a local file denoland/deno#2150

Closed

mkruisselbrink mentioned this issue Apr 10, 2020

getEntries() iteration order in the face of changes? #127

Closed

mkruisselbrink added this to the Future milestone Apr 16, 2020

dslee414 closed this as not planned Jan 18, 2023
