
Add TextDecoderStream and TextEncoderStream #5648

Closed
jimmywarting opened this issue Sep 17, 2023 · 40 comments · Fixed by #13115 or #13214
Labels: enhancement (New feature or request), web:encoding, wintercg (Web-interoperable Runtimes Community Group compatibility)

Comments

@jimmywarting

jimmywarting commented Sep 17, 2023

TextDecoderStream and TextEncoderStream are missing

@danny-avila

danny-avila commented Oct 15, 2023

const reader = response.body.pipeThrough(new TextDecoderStream()).getReader();
                                                     ^
ReferenceError: Can't find variable: TextDecoderStream

bun version 1.0.6+969da088f5db3258a803ec186012e30f992829b4

@SukkaW
Contributor

SukkaW commented Nov 23, 2023

Workaround: copy the following ponyfill into a .ts file:

// Copyright 2016 Google Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// Polyfill for TextEncoderStream and TextDecoderStream

// Modified by Sukka (https://skk.moe) to increase compatibility and performance with Bun.

export class PolyfillTextDecoderStream extends TransformStream<Uint8Array, string> {
  readonly encoding: string;
  readonly fatal: boolean;
  readonly ignoreBOM: boolean;

  constructor(
    encoding: string = 'utf-8',
    {
      fatal = false,
      ignoreBOM = false,
    }: ConstructorParameters<typeof TextDecoder>[1] = {},
  ) {
    const decoder = new TextDecoder(encoding, { fatal, ignoreBOM });
    super({
      transform(chunk: Uint8Array, controller: TransformStreamDefaultController<string>) {
        const decoded = decoder.decode(chunk, { stream: true });
        if (decoded.length > 0) {
          controller.enqueue(decoded);
        }
      },
      flush(controller: TransformStreamDefaultController<string>) {
        // If {fatal: false} is in options (the default), then the final call to
        // decode() can produce extra output (usually the unicode replacement
        // character 0xFFFD). When fatal is true, this call is just used for its
        // side-effect of throwing a TypeError exception if the input is
        // incomplete.
        const output = decoder.decode();
        if (output.length > 0) {
          controller.enqueue(output);
        }
      }
    });

    this.encoding = encoding;
    this.fatal = fatal;
    this.ignoreBOM = ignoreBOM;
  }
}

Then import { PolyfillTextDecoderStream } from 'path/to/where/you/save/the/polyfill.ts'.
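
A quick usage sketch (the import path and URL here are placeholders):

import { PolyfillTextDecoderStream } from './polyfill-text-decoder-stream';

const response = await fetch('https://example.com/stream');
const reader = response.body!
  .pipeThrough(new PolyfillTextDecoderStream())
  .getReader();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  console.log(value); // each chunk arrives as an already-decoded string
}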

@hgezim

hgezim commented May 31, 2024

Bummer. Can't use @google/generative-ai due to this.

@zaiste

zaiste commented Jun 1, 2024

It's not possible to use the Vercel AI SDK because of this issue.

@octet-stream

octet-stream commented Jun 1, 2024

Here are polyfills you can use, based on the Node.js implementation, while you wait for Bun to catch up:

/**
 * TextEncoderStream polyfill based on Node.js' implementation https://github.com/nodejs/node/blob/3f3226c8e363a5f06c1e6a37abd59b6b8c1923f1/lib/internal/webstreams/encoding.js#L38-L119 (MIT License)
 */
export class TextEncoderStream {
  #pendingHighSurrogate: string | null = null

  #handle = new TextEncoder()

  #transform = new TransformStream<string, Uint8Array>({
    transform: (chunk, controller) => {
      // https://encoding.spec.whatwg.org/#encode-and-enqueue-a-chunk
      chunk = String(chunk)

      let finalChunk = ""
      for (let i = 0; i < chunk.length; i++) {
        const item = chunk[i]
        const codeUnit = item.charCodeAt(0)
        if (this.#pendingHighSurrogate !== null) {
          const highSurrogate = this.#pendingHighSurrogate

          this.#pendingHighSurrogate = null
          if (0xdc00 <= codeUnit && codeUnit <= 0xdfff) {
            finalChunk += highSurrogate + item
            continue
          }

          finalChunk += "\uFFFD"
        }

        if (0xd800 <= codeUnit && codeUnit <= 0xdbff) {
          this.#pendingHighSurrogate = item
          continue
        }

        if (0xdc00 <= codeUnit && codeUnit <= 0xdfff) {
          finalChunk += "\uFFFD"
          continue
        }

        finalChunk += item
      }

      if (finalChunk) {
        controller.enqueue(this.#handle.encode(finalChunk))
      }
    },

    flush: (controller) => {
      // https://encoding.spec.whatwg.org/#encode-and-flush
      if (this.#pendingHighSurrogate !== null) {
        controller.enqueue(new Uint8Array([0xef, 0xbf, 0xbd]))
      }
    },
  });

  get encoding() {
    return this.#handle.encoding
  }

  get readable() {
    return this.#transform.readable
  }

  get writable() {
    return this.#transform.writable
  }

  get [Symbol.toStringTag]() {
    return 'TextEncoderStream'
  }
}

/**
 * TextDecoderStream polyfill based on Node.js' implementation https://github.com/nodejs/node/blob/3f3226c8e363a5f06c1e6a37abd59b6b8c1923f1/lib/internal/webstreams/encoding.js#L121-L200 (MIT License)
 */
export class TextDecoderStream {
  #handle: TextDecoder

  #transform = new TransformStream({
    transform: (chunk, controller) => {
      const value = this.#handle.decode(chunk, {stream: true})

      if (value) {
        controller.enqueue(value)
      }
    },
    flush: controller => {
      const value = this.#handle.decode()
      if (value) {
        controller.enqueue(value)
      }

      controller.terminate()
    }
  })

  constructor(encoding = "utf-8", options: TextDecoderOptions = {}) {
    this.#handle = new TextDecoder(encoding, options)
  }

  get encoding() {
    return this.#handle.encoding
  }

  get fatal() {
    return this.#handle.fatal
  }

  get ignoreBOM() {
    return this.#handle.ignoreBOM
  }

  get readable() {
    return this.#transform.readable
  }

  get writable() {
    return this.#transform.writable
  }

  get [Symbol.toStringTag]() {
    return "TextDecoderStream"
  }
}

Both are basically just TS ports that I use in my projects with Bun.
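
For example, a round-trip sketch (assuming the classes above are saved as ./encoding-streams.ts):

import { TextDecoderStream, TextEncoderStream } from './encoding-streams';

// Decode bytes to text, transform the text, then re-encode it to bytes.
const upper = new TransformStream<string, string>({
  transform: (chunk, controller) => controller.enqueue(chunk.toUpperCase()),
});

const bytes = new Response('hello, bun').body!
  .pipeThrough(new TextDecoderStream())
  .pipeThrough(upper)
  .pipeThrough(new TextEncoderStream());

console.log(await new Response(bytes).text()); // "HELLO, BUN"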

@hgezim

hgezim commented Jun 1, 2024 via email

@octet-stream

I mean, you can use it in your app, can't you? It probably relies on globalThis, so:

// Add those polyfills to globalThis before you import `@google/generative-ai`
globalThis.TextEncoderStream ||= TextEncoderStream
globalThis.TextDecoderStream ||= TextDecoderStream

But for a library this might not be a good option.
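
For an application, the wiring could look like this sketch (the file names and import path are assumptions), as long as it runs before the SDK is imported:

// install-text-streams.ts
import { TextEncoderStream, TextDecoderStream } from './encoding-streams';

// Only install the polyfills when the runtime doesn't already provide them.
// @ts-ignore -- the polyfill classes are structurally compatible with the built-ins
globalThis.TextEncoderStream ||= TextEncoderStream;
// @ts-ignore
globalThis.TextDecoderStream ||= TextDecoderStream;

// index.ts -- make sure the polyfill module is imported first
import './install-text-streams';
import '@google/generative-ai'; // or whichever library needs the streams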

@SukkaW
Contributor

SukkaW commented Jun 2, 2024

Here are polyfills you can use, based on the Node.js implementation, while you wait for Bun to catch up:

You can simplify the implementation by using class PolyfillTextDecoderStream extends TransformStream<Uint8Array, string> and calling super() (same for TextEncoderStream). See #5648 (comment)

@octet-stream

If you read the spec, you'll see TextDecoderStream and TextEncoderStream are not subclasses of TransformStream.

@SukkaW
Contributor

SukkaW commented Jun 2, 2024

If you read the spec, you'll see TextDecoderStream and TextEncoderStream are not subclasses of TransformStream.

IMO, the spec only defines what a correct implementation should look like; it demonstrates that "correct TextEncoderStream and TextDecoderStream could be implemented like this", but the runtime doesn't have to implement them in exactly the same way. That is to say, the spec doesn't dictate how you must implement them.

IMHO, as long as the implementation exposes the required APIs and fields, and the behaviors are the same, then the implementation is spec-compliant.

Also, take a look at the spec:

[Screenshots of the spec IDL for TextEncoderStream and TextDecoderStream, both of which include GenericTransformStream]

The spec requires TextEncoderStream and TextDecoderStream to have all the fields from GenericTransformStream, so extends just works.
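
For the encoder side, a minimal sketch of that extends-based approach could look like the following (not spec-exact: it relies on TextEncoder replacing lone surrogates with U+FFFD instead of buffering a trailing high surrogate across chunk boundaries, unlike the Node.js-based polyfill above):

export class PolyfillTextEncoderStream extends TransformStream<string, Uint8Array> {
  readonly encoding = 'utf-8';

  constructor() {
    const encoder = new TextEncoder();
    super({
      transform(chunk, controller) {
        // TextEncoder.encode() converts lone surrogates to U+FFFD on its own.
        const encoded = encoder.encode(String(chunk));
        if (encoded.byteLength > 0) {
          controller.enqueue(encoded);
        }
      },
    });
  }
}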

@SukkaW
Contributor

SukkaW commented Jun 2, 2024

If you read the spec, you'll see TextDecoderStream and TextEncoderStream are not subclasses of TransformStream.

Also, my implementation is based on Google, Inc.'s work.

And it's not only Google that does this; QuickJS also uses extends: https://github.com/rsenn/qjs-modules/blob/7e83ad1402fed2681ce189d4cfc2b55386a5bcc5/lib/streams.js#L211

@ivan-kleshnin

I tried to use the polyfill proposed by @octet-stream above with the tRPC client.

Getting this error:

error: Invalid response or stream interrupted
      at new StreamInterruptedError (/Users/username/Sandboxes/mytrpc/node_modules/@trpc/server/dist/unstable-core-do-not-import/stream/stream.mjs:233:9)
      at closeOrAbort (/Users/username/Sandboxes/mytrpc/node_modules/@trpc/server/dist/unstable-core-do-not-import/stream/stream.mjs:421:23)
      at promiseInvokeOrNoopMethodNoCatch (:1:21)
      at promiseInvokeOrNoopMethod (:1:21)
      at writableStreamDefaultControllerProcessClose (:1:21)
      at writableStreamDefaultControllerAdvanceQueueIfNeeded (:1:21)
      at writableStreamDefaultControllerClose (:1:21)
      at writableStreamClose (:1:21)

@darklight9811

@Jarred-Sumner what is the priority on this? Just to get an idea of when to expect this to be released.

@lobomfz

lobomfz commented Jul 10, 2024

@trpc/server 11.0.0-rc.361 onwards is broken on bun because of this. This makes bun unusable for trpc backends with react-query, since it relies on trpc 11+. Workaround is using version 11.0.0-rc.359

EDIT: Only with httpBatchStreamLink

@OreQr

OreQr commented Jul 12, 2024

@trpc/server 11.0.0-rc.361 onwards is broken on bun because of this. This makes bun unusable for trpc backends with react-query, since it relies on trpc 11+. Workaround is using version 11.0.0-rc.359

EDIT: Only with httpBatchStreamLink

You can use the polyfill from the comment above to make it work.

@patsimok

patsimok commented Jul 13, 2024

https://chng.it/n848nhZ89w

petition for adding the streams

@dtinth

dtinth commented Jul 21, 2024

The @google/generative-ai npm package also uses TextDecoderStream for its streaming generation. Right now it doesn’t work in Bun, producing the following output:

637 |  * GenerateContentResponse.
638 |  *
639 |  * @param response - Response from a fetch call
640 |  */
641 | function processStream(response) {
642 |     const inputStream = response.body.pipeThrough(new TextDecoderStream("utf8", { fatal: true }));
                                                            ^
ReferenceError: Can't find variable: TextDecoderStream
      at processStream (node_modules/@google/generative-ai/dist/index.mjs:642:55)

Bun v1.1.12 (macOS arm64)

(As a workaround I used tsx to run the script in Node.js instead of Bun for now.)

@PatrickJS

I signed the petition.

@jamespacileo

jamespacileo commented Jul 25, 2024

If you hit this issue in tests, you need to create a setup.ts/js file:

// polyfills here (import or define a TextDecoderStream polyfill; see the sketch below)...


if (typeof globalThis.TextDecoderStream === 'undefined') {
  // @ts-ignore
  globalThis.TextDecoderStream = TextDecoderStream;
}

// Ensure the polyfill is applied
const ensureTextDecoderStream = () => {
  if (typeof globalThis.TextDecoderStream === 'undefined') {
    throw new Error('TextDecoderStream is not defined after polyfill');
  }
};

ensureTextDecoderStream();

export { };

bun.config.json

{
  "test": {
    "preload": [
      "./setup.ts"
    ],
    "include": [
      "src/**/*.test.ts"
    ]
  }
}
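
If the polyfill from earlier in this thread lives in its own file, the "polyfills here" part could be as small as the following sketch (the import path is an assumption):

import { TextDecoderStream as TextDecoderStreamPolyfill } from './encoding-streams';

if (typeof globalThis.TextDecoderStream === 'undefined') {
  // @ts-ignore -- the polyfill is structurally compatible with the built-in
  globalThis.TextDecoderStream = TextDecoderStreamPolyfill;
}

Bun can also preload test setup files via the [test] preload setting in bunfig.toml, which may be worth trying if a bun.config.json file isn't picked up.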

@ananay

ananay commented Aug 1, 2024

Jarred has confirmed that if the petition gets 100 signatures, they'll implement this.
https://x.com/jarredsumner/status/1818739728914722989

Please sign!!! Here is the link:
https://www.change.org/p/urge-jarred-sumner-to-implement-textencoderstream-and-textdecoderstream-in-bun

@HironTez

HironTez commented Aug 1, 2024

We reached 100 signatures 🎉

@danny-avila

Will @Jarred-Sumner deliver?

@Jarred-Sumner
Collaborator

Will @Jarred-Sumner deliver?

no, but @dylan-conway will :)

#13115

@n2k3

n2k3 commented Aug 6, 2024

Thanks to @dylan-conway for implementing this!

Am I correct that, once #13115 is merged, bun can be used to run the Next.js dev server (using this command: bun --bun run dev)? If so, that means the callout in the bun guide for Next.js can be removed. If not, what other Node APIs that Next.js relies on is bun not (fully) supporting yet?

@notKamui

notKamui commented Aug 6, 2024

Does anyone know how to get the polyfill to work with Next.js inside a Bun Docker build? I've imported the polyfill and patched it onto globalThis, but I still get the error: Attempt to export a nullable value for "TextDecoderStream".

I am having the same issue

@birkskyum
Collaborator

For anyone wondering why this issue was reopened, see #13151.

@farezv

farezv commented Sep 27, 2024

@dylan-conway @Jarred-Sumner I'm still seeing TextDecoderStream issues in my Next.js app when running it via a nixpacks-created image. When I run my Next.js app outside of the container, everything is fine.

Nix even added bun 1.1.29 support, so I'm wondering if others have run into this:
https://github.com/NixOS/nixpkgs/blob/8b085394e9121d66fcb31db141681d23b5490cc3/pkgs/development/web/bun/default.nix#L15

$ next start
  ▲ Next.js 14.2.7
  - Local:        http://localhost:3000

 ✓ Starting...
 ✓ Ready in 472ms
# ...
{var __webpack_modules__=
# ...
# deleted logs for brevity but wanted to show webpack_modules reference.

error: Attempt to export a nullable value for "TextDecoderStream"
      at defineProperties (/app/node_modules/next/dist/compiled/edge-runtime/index.js:1:711500)
      at addPrimitives (/app/node_modules/next/dist/compiled/edge-runtime/index.js:1:710245)
      at extend (/app/node_modules/next/dist/compiled/edge-runtime/index.js:1:705028)
      at new VM (/app/node_modules/next/dist/compiled/edge-runtime/index.js:1:712369)
      at new EdgeVM (/app/node_modules/next/dist/compiled/edge-runtime/index.js:1:704958)
      at /app/node_modules/next/dist/server/web/sandbox/context.js:223:21

@wottpal

wottpal commented Oct 23, 2024

Same issue as @farezv. Did you find a fix or is there another issue here tracking this?
