Streaming data to/from IndexedDB #419

Open
dumbmatter opened this issue Apr 19, 2024 · 1 comment
Labels
TPAC2024 Topic for discussion at TPAC 2024

Comments

@dumbmatter

Now that the Streams API is widely supported, would it make sense to have some built-in IndexedDB API for streaming data to/from IndexedDB?

The problem right now is that it's somewhat difficult and inefficient to write such functionality on your own. For example, if you want to create a ReadableStream that outputs all of the data in a giant object store, you can't just naively iterate over a cursor in ReadableStream.pull, because the transaction will automatically close at some point. So you wind up fighting two mechanisms at once: the stream, which only wants to read part of the data into memory at a time, and IndexedDB, which closes a transaction as soon as it's no longer active. You end up with something like this:

// Note: this assumes a promise-based IndexedDB wrapper (e.g. the idb library),
// which is why openCursor() and cursor.continue() can be awaited.
const makeReadableStream = (db, store) => {
  let prevKey;

  return new ReadableStream({
    async pull(controller) {
      // Resume after the last key seen in the previous batch.
      const range = prevKey !== undefined
        ? IDBKeyRange.lowerBound(prevKey, true)
        : undefined;

      const MIN_BATCH_SIZE = 100;
      let batchCount = 0;

      // A new transaction per batch, because the previous one auto-closed
      // while the stream was waiting for more demand.
      let cursor = await db.transaction(store).store.openCursor(range);
      while (cursor) {
        controller.enqueue(`${JSON.stringify(cursor.value)}\n`);
        prevKey = cursor.key;
        batchCount += 1;

        if (controller.desiredSize > 0 || batchCount < MIN_BATCH_SIZE) {
          cursor = await cursor.continue();
        } else {
          break;
        }
      }

      console.log(`Done batch of ${batchCount} objects`);

      if (!cursor) {
        // Actually done with this store, not just paused
        console.log("Completely done");
        controller.close();
      }
    },
  }, {
    highWaterMark: 100,
  });
};
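
For context, here's roughly how a stream like that might be consumed, e.g. to export a store to a file. This is just a sketch assuming the idb library and the File System Access API; the database name "my-db", the store name "players", and the file name are placeholders.

// Sketch: export an object store to a newline-delimited JSON file.
import { openDB } from "idb";

const exportStore = async () => {
  const db = await openDB("my-db");
  const stream = makeReadableStream(db, "players");

  // File System Access API: let the user pick a destination file.
  const handle = await window.showSaveFilePicker({ suggestedName: "export.ndjson" });
  const writable = await handle.createWritable();

  // pipeTo applies backpressure, so pull() only runs another batch
  // when the file sink is ready for more data.
  await stream.pipeTo(writable);
};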

In addition to that code being a little complicated to write, it's also probably slower than it needs to be due to creating many transactions over the course of a large stream.

I wrote a blog post about this a few years ago, and even now I can't find anyone else talking about doing this kind of thing. But a couple of people find that article through Google every day, and every now and then someone emails me about it, so I'm not literally the only person interested in this, although I admit it's probably a niche use case. Hundreds of users export large amounts of data from IndexedDB in my video games every day, using code similar to what I wrote in that blog post.

What would be better is maybe an API equivalent to getAll - a method on IDBObjectStore and IDBIndex that takes an IDBKeyRange and returns a stream of all matching records. And then maybe also an equivalent API for writing data to an object store.
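
Purely for illustration, here is a usage sketch of what such an API could look like. None of these methods exist today; the names openReadableStream() and openWritableStream() are made up.

// Hypothetical sketch only: neither method exists in IndexedDB today.
const tx = db.transaction(["players", "playersBackup"], "readwrite");

// Read: a ReadableStream of every record matching the key range.
const readable = tx.objectStore("players").openReadableStream(
  IDBKeyRange.lowerBound(0),
);

// Write: a WritableStream where each chunk written becomes a put() on the store.
const writable = tx.objectStore("playersBackup").openWritableStream();

// Copy one store into another without materializing it all in memory.
// Presumably the transaction would need to stay alive for the duration,
// which is where explicit transaction lifetime control comes in.
await readable.pipeTo(writable);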

@asutherland
Collaborator

xref #34 on explicit transaction lifetime control.

@SteveBeckerMSFT SteveBeckerMSFT added the TPAC2024 Topic for discussion at TPAC 2024 label Sep 9, 2024