Ability to get a ReadStream from a FileSystemFileHandle #157

Closed
srgraham opened this issue Feb 25, 2020 · 13 comments

@srgraham

I want to open a stream to a file handle, so that as more content shows up in the file, the new content streams to the browser. Currently it looks like the only way to do this is to call FileSystemFileHandle.getFile() in a loop, check whether the size changed, and if so read everything in full again. I don't think this will be adequate for big files.

The use case for this is a tail -f look-a-like for the browser with various processing and display of the log contents as it comes in.

@jimmywarting

jimmywarting commented Mar 10, 2020

Feels like a duplex read/write stream is in order?
I have been thinking about it too. Just like the seek method on the writable stream, I wish something like it existed on a readable stream as well,

for cases where you have to read part of the file again and again. But then I think BYOB mode has to be implemented in browsers first...

One problem, though, is the temporary file while you are reading and writing at the same time. Maybe that is something inPlace=true was meant to solve? #67

@mkruisselbrink
Contributor

You don't have to read everything in full after the file changes (since File supports random-access reading by slicing etc.). But the polling solution is still not ideal: if the file changes between the call to getFile() and the actual read, the read operation fails. So if the file updates frequently you might never be able to read from it.
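
A minimal sketch of that slice-based polling, to show that only the appended bytes need to be read. The helper name `readAppended` and its return shape are made up for illustration; it assumes the handle behaves like a FileSystemFileHandle (a `getFile()` that resolves to a File/Blob snapshot) and that appends end on character boundaries, as log lines usually do. The read can still reject if the file changes mid-read, which is the limitation described above.

```javascript
// Hypothetical helper (not part of any API): read only what was appended
// since `offset`, using a fresh snapshot from handle.getFile().
async function readAppended(handle, offset) {
  const file = await handle.getFile();       // new snapshot of the file
  if (file.size < offset) offset = 0;        // file was truncated: start over
  if (file.size === offset) {
    return { text: '', offset };             // nothing new
  }
  // Blob.slice() gives random access, so earlier bytes are not re-read.
  // This call may reject if the file changes after getFile() resolved.
  const text = await file.slice(offset).text();
  return { text, offset: file.size };
}
```

A caller would keep the returned `offset` and pass it back in on the next poll, retrying on rejection.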

So yeah, perhaps we should have an alternative way of reading from files that doesn't go through File/Blob objects, and thus doesn't share their limitation of effectively representing snapshots of the file. If tail -f is the only use case for such an API, exposing it as a never-ending ReadableStream probably does make sense. There won't be any seeking in that case, but that is probably fine for that particular use case?
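
To make the shape of such a never-ending stream concrete, here is a rough userland approximation built on polling. It is not the proposed API: `tailStream` and `intervalMs` are hypothetical names, and the handle is assumed to behave like a FileSystemFileHandle. Failed reads (for example when the snapshot is invalidated) are simply retried on the next poll, which a real implementation would probably want to bound.

```javascript
// Hypothetical sketch: a ReadableStream of Uint8Array chunks that never
// closes on its own and emits whatever is appended to the file.
function tailStream(handle, { intervalMs = 1000 } = {}) {
  const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
  let offset = 0;
  let cancelled = false;

  return new ReadableStream({
    // pull() is only called when the consumer wants more data, so polls
    // never overlap and a slow reader applies back-pressure.
    async pull(controller) {
      while (!cancelled) {
        try {
          const file = await handle.getFile();
          if (file.size < offset) offset = 0;          // truncated: restart
          if (file.size > offset) {
            const buffer = await file.slice(offset).arrayBuffer();
            offset += buffer.byteLength;
            controller.enqueue(new Uint8Array(buffer));
            return;
          }
        } catch (err) {
          // Assumption: treat every failure as transient and poll again.
        }
        await sleep(intervalMs);
      }
    },
    cancel() {
      cancelled = true;                                // stops the poll loop
    },
  });
}
```

The consumer would pipe this through a TextDecoderStream, or read it with `getReader()`, and call `cancel()` to stop tailing.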

Are there other use cases where the snapshotting/invalidating-on-change behavior of Blob/File objects is problematic?

mkruisselbrink added this to the V2 milestone Apr 9, 2020

@tinchoz49

I wish we could have something like basic random access operations over a file without snapshots, like fs.read and fs.write in Node.js. Having to refetch the file after every write causes us some performance issues in https://github.com/random-access-storage/random-access-chrome-file

@tomayac
Contributor

tomayac commented Aug 21, 2020

> I wish we could have something like basic random access operations over a file without snapshots, like fs.read and fs.write in Node.js. Having to refetch the file after every write causes us some performance issues in https://github.com/random-access-storage/random-access-chrome-file

@tinchoz49 I looked into this, and it turns out you are not actually using the new Native File System API, but its predecessor. You want to look at navigator.storage.getDirectory() instead, which in Chrome 86.0.4239.0 is currently still implemented as self.getOriginPrivateDirectory() (see #217 for the change). The switch to the new API should be painless. Please let us know if this solves the issue.

@tinchoz49

Hi @tomayac! Yes, to be honest I thought that the new API was using the same internal base as the old one, but that was an assumption. I'm going to work on switching to the new API to see how it goes. Thank you for the quick response on this!

@ddumont

ddumont commented Aug 21, 2020

I can confirm the new filesystem api is still really slow

@xeonfusion

@mkruisselbrink posting this as suggested by @tomayac:

I have tried using either the FileReader API or fetch (with a blob URL created by createObjectURL()) to read a .csv file from local storage that is being updated frequently by another process (a vital signs text stream appended to the CSV file at a set time interval). The FileHandle was obtained from the standard file picker.

Both methods fail in a way similar to the issue above: the file cannot be read, with either net::ERR_UPLOAD_FILE_CHANGED or a "file has been modified" error.

Reference to these reports: https://stackoverflow.com/questions/61916331/re-uploading-a-file-with-ajax-after-it-was-changed-causes-neterr-upload-file-c, https://bugs.chromium.org/p/chromium/issues/detail?id=1084880&q=ERR_UPLOAD_FILE_CHANGED&can=2 etc

My code is at https://github.com/xeonfusion/VSChart/blob/master/src/anaesthchart.jsx#L376

Using the File.slice().arrayBuffer() Promise doesn't work either; it can't access the file while it is being modified.

There should be a way to read file updates efficiently rather than only getting snapshots of the blob, for example for web apps that read log files being updated in real time. As it stands, the blob can be read only if the file picker is used to select the file again after the other process releases the file. Do let us know of any workarounds or alternative approaches.

@jimmywarting

jimmywarting commented Feb 26, 2021

@xeonfusion There is one other way to get a file handle from which you can retrieve a new File and call .arrayBuffer(), without relying on Chrome's experimental File System Access API and its requestPermission calls, and it works in other browsers right now. But <input type=file> is not one of them.

I'm talking about drag and drop.
The idea is to get a file entry first, on which you can later call the .file() function:

ondrop = evt => evt.dataTransfer.items[0].webkitGetAsEntry().file(file => console.log(file))

function drop(event) {
  event.stopPropagation();
  event.preventDefault();

  // get the file as a fileEntry (aka file handle)
  const fileEntry = event.dataTransfer.items[0].webkitGetAsEntry()
  let lastModificationTime = new Date(0)

  async function read (file) {
    // use the new async read method on blobs.
    console.log(await file.text())
  }

  // re-read the file only when its modification time has advanced
  function compare (meta) {
    if (meta.modificationTime > lastModificationTime) {
      lastModificationTime = meta.modificationTime
      fileEntry.file(read)
    }
  }

  // poll the entry's metadata once per second
  setInterval(fileEntry.getMetadata.bind(fileEntry, compare), 1000)
}

more Demo/Code/explanation on my SO answer about monitoring file changes

@xeonfusion

xeonfusion commented Feb 27, 2021

@jimmywarting @tomayac I can confirm that the drag and drop method of obtaining a file handle does work with a file that is being updated continuously by another process, without the "file has been modified" error (or the need to select the file again)!

The fileEntry is obtained similar to the drop event implementation described by you above:

  event.stopPropagation();
  event.preventDefault();
    
  // get the file as a fileEntry (aka file handle)
  const fileEntry = event.dataTransfer.items[0].webkitGetAsEntry()

Then a callback function can be used to read the data from the File object that FileSystemFileEntry.file() provides, as described here: https://developer.mozilla.org/en-US/docs/Web/API/FileSystemFileEntry/file

function ReadFile(fileEntry, successCallback, errorCallback) {
  fileEntry.file(function(file) {
    let reader = new FileReader();

    reader.onload = function() {
      successCallback(reader.result);
    };

    reader.onerror = function() {
      errorCallback(reader.error);
    };

    reader.readAsText(file);
  }, errorCallback);
}

If drag and drop can work, then the input file dialog FileHandle behavior should be fixed to achieve the same functionality. Suggested implementation from input file event:

// get the file as a fileEntry (aka file handle)
  const fileEntry = event.target.files[0].webkitGetAsEntry();

@jimmywarting

> If drag and drop can work, then the input file dialog FileHandle behavior should be fixed to achieve the same functionality.

Now we have gotten a little bit off track from the main topic and this is starting to become irrelevant.

But I just want to say that there is one other behavior change you can make to <input type=file>, and that is selecting a directory. It has some performance disadvantages, though, since it walks over every folder and appends each file to input.files[] as a flat list, which makes the browser hang for a bit. And you don't get any folders, only the prefixed relative path.
Unfortunately this isn't a file or directory Entry either.

I do agree with you that it would have been nice to get all files and directories as Entries instead. But I guess it is not as easy as just adding support for entries to <input type=file>. It plays another role: it also has to work together with the serialization of FormData, xhr, fetch, <form>... This may also mean that it has to be sync, since it also has to add a content-length header? Supporting Entries on <input type=file> also means we would have to update other parts of the web that use file inputs as well.

What would you get if the file input was async and you could get the files as an Entry instead?

<form>
  <input type="file" name="upload" onchange="foo(event)" x-as-entry>
  <script>
    function foo(evt) {
      // what would this be: a File, an Entry, ???
      const value = new FormData(evt.target.form).get('upload')
    }
  </script>
</form>

It would be a cool idea, though, if FormData could have some async behavior added to it so it could upload a FileSystemDirectoryHandle to the server. But for now the best way to upload them is either to zip everything and upload one file, or to manually craft a multipart FormData ReadableStream with streamable uploads:

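const boundary = 'x-boundary' // must match the boundary written into the stream

const rs = new ReadableStream({
  pull(ctrl) {
    // enqueue one entry/handle
  }
})

const req = new Request(url, {
  body: rs,
  method: 'post',
  duplex: 'half', // required by the Fetch spec when the body is a stream
  headers: { 'content-type': `multipart/form-data; boundary=${boundary}` }
})

fetch(req)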

But not many browsers support streamable uploads today :(
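
For illustration, the pull() step of such a hand-crafted multipart stream could look roughly like the sketch below. The function name `multipartStream` and the `{ field, name, blob }` item shape are hypothetical. Each file is buffered with arrayBuffer() before being enqueued, so this streams per file rather than per chunk, and field and file names are assumed not to contain quotes or line breaks (no escaping is done).

```javascript
// Hypothetical sketch: build a multipart/form-data body as a ReadableStream,
// emitting one part per pull() call and the closing boundary at the end.
function multipartStream(items, boundary) {
  const encoder = new TextEncoder();
  const iterator = items[Symbol.iterator]();

  return new ReadableStream({
    async pull(controller) {
      const { done, value } = iterator.next();
      if (done) {
        controller.enqueue(encoder.encode(`--${boundary}--\r\n`));
        controller.close();
        return;
      }
      const { field, name, blob } = value;
      // Part headers, then a blank line, then the raw file bytes.
      controller.enqueue(encoder.encode(
        `--${boundary}\r\n` +
        `Content-Disposition: form-data; name="${field}"; filename="${name}"\r\n` +
        `Content-Type: ${blob.type || 'application/octet-stream'}\r\n\r\n`
      ));
      controller.enqueue(new Uint8Array(await blob.arrayBuffer()));
      controller.enqueue(encoder.encode('\r\n'));
    },
  });
}
```

The same boundary string then has to appear in the request's content-type header, e.g. `multipart/form-data; boundary=...`.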

Maybe the file system access should invent a new form element that can map to showDirectoryPicker or showOpenFilePicker?

@ftreesmilo

similar to #72

@mkruisselbrink
Contributor

Not sure how related this is to #72. Some of the work on access handles (#310) might be able to address some of these use cases, although that is currently planned to be limited to files in the origin private file system. Not sure if we can come up with a solution where we could use a similar API for read-only access to arbitrary files, but there might be something we can come up with...

@a-sully
Collaborator

a-sully commented Feb 11, 2022

Closing this issue since this seems mostly addressed by the new Access Handle surface within the Origin Private File System #344. If you have a more specific feature request which isn't addressed by Access Handles, feel free to open a new issue :)
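
As a rough illustration of how access handles cover the random-access part of this thread, the sketch below appends to a file and reads a byte range back without going through File/Blob snapshots. The helper names `appendLine` and `readRange` are hypothetical; they only assume an object with the `getSize()`, `read()`, `write()` and `flush()` methods of a sync access handle, which is available only in dedicated workers and only for files in the origin private file system. Early implementations exposed some of these methods as async, so treat the synchronous calls as an assumption about the current shape of the API.

```javascript
// Hypothetical helpers written against the sync access handle method names.

// Append one line at the current end of the file; returns the new size.
function appendLine(access, line) {
  const bytes = new TextEncoder().encode(line + '\n');
  const end = access.getSize();
  access.write(bytes, { at: end });   // positional write, no snapshot needed
  access.flush();                     // persist the change
  return end + bytes.byteLength;
}

// Read up to `length` bytes starting at byte offset `start`, as text.
function readRange(access, start, length) {
  const buffer = new Uint8Array(length);
  const bytesRead = access.read(buffer, { at: start });
  return new TextDecoder().decode(buffer.subarray(0, bytesRead));
}

// In a dedicated worker, an access handle would be obtained roughly like this:
//   const root = await navigator.storage.getDirectory();
//   const handle = await root.getFileHandle('log.txt', { create: true });
//   const access = await handle.createSyncAccessHandle();
//   ...
//   access.close();
```

This does not help with tailing arbitrary user-picked files, since the handle is limited to the origin private file system.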

9 participants