Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance issue of toString() function #30

Closed
twoeths opened this issue Jul 29, 2022 · 7 comments
Closed

Performance issue of toString() function #30

twoeths opened this issue Jul 29, 2022 · 7 comments

Comments

@twoeths
Copy link

twoeths commented Jul 29, 2022

Motivation
This causes performance issue in protons, see ipfs/protons#51 (comment)

Description
Benchmark shows that toString() takes > 10x times to convert to a utf8 string, compared to protobuf

✔ string decode                                                    1.831166e+7 ops/s    54.61000 ns/op        -    5138306 runs   30.1 s
✔ string decode using protobuf                                     2.288330e+8 ops/s    4.370000 ns/op        -   41186303 runs   30.1 s

reference implementation: https://github.com/protobufjs/protobuf.js/blob/6f0806ddef408647fbaed049dbd8929ad9f5c10c/lib/utf8/index.js#L40-L62

@twoeths
Copy link
Author

twoeths commented Jul 29, 2022

the code I run test is actually not the above, but an old version of it which does not use string concatenation

/**
 * Reads UTF8 bytes as a string.
 * @param {Uint8Array} buffer Source buffer
 * @param {number} start Source start
 * @param {number} end Source end
 * @returns {string} String read
 */
utf8.read = function utf8_read(buffer, start, end) {
    var len = end - start;
    if (len < 1)
        return "";
    var parts = null,
        chunk = [],
        i = 0, // char offset
        t;     // temporary
    while (start < end) {
        t = buffer[start++];
        if (t < 128)
            chunk[i++] = t;
        else if (t > 191 && t < 224)
            chunk[i++] = (t & 31) << 6 | buffer[start++] & 63;
        else if (t > 239 && t < 365) {
            t = ((t & 7) << 18 | (buffer[start++] & 63) << 12 | (buffer[start++] & 63) << 6 | buffer[start++] & 63) - 0x10000;
            chunk[i++] = 0xD800 + (t >> 10);
            chunk[i++] = 0xDC00 + (t & 1023);
        } else
            chunk[i++] = (t & 15) << 12 | (buffer[start++] & 63) << 6 | buffer[start++] & 63;
        if (i > 8191) {
            (parts || (parts = [])).push(String.fromCharCode.apply(String, chunk));
            i = 0;
        }
    }
    if (parts) {
        if (i)
            parts.push(String.fromCharCode.apply(String, chunk.slice(0, i)));
        return parts.join("");
    }
    return String.fromCharCode.apply(String, chunk.slice(0, i));
};

@achingbrain
Copy link
Owner

achingbrain commented Aug 2, 2022

Internally this uses plain-js TextDecoder. Possibly related: nodejs/node#39879

Another approach may be to create a node Buffer as a view on the buffer memory and call .toString('utf8') on it? As I understand it, performance of TextDecoder is good in browsers but poor in node (at the moment) so we may only need to improve performance in node.

Can you please share your benchmark as runnable code (a gist will do) so we can experiment?

@achingbrain
Copy link
Owner

I've added some benchmarks here, the results are.. eye opening: #32

@achingbrain
Copy link
Owner

Please can you try again with uint8arrays@3.1.0, toString/fromString should now be faster in node.

If you're still seeing a big discrepancy can you please share your benchmark code?

@achingbrain
Copy link
Owner

Closing as the performance issue has been resolved, please reopen if you are still seeing it.

This module is now only a little bit slower than using node Buffers, likely because of the compatibility requirements in #39

% node benchmarks/from-string.js
Uint8Arrays.fromString x 2,011,648 ops/sec ±0.54% (86 runs sampled)
TextEncoder x 1,368,021 ops/sec ±2.22% (81 runs sampled)
Buffer.from x 2,200,960 ops/sec ±4.02% (88 runs sampled)
Fastest is Buffer.from
% node benchmarks/to-string.js
Uint8Arrays.toString x 3,138,022 ops/sec ±0.52% (93 runs sampled)
TextDecoder x 425,713 ops/sec ±3.29% (79 runs sampled)
utf8ReadFromCharCode x 2,065,098 ops/sec ±1.19% (92 runs sampled)
utf8ReadConcat x 1,162,056 ops/sec ±0.68% (92 runs sampled)
Buffer.toString x 3,509,065 ops/sec ±0.79% (87 runs sampled)
Fastest is Buffer.toString

@ThaUnknown
Copy link

ThaUnknown commented Dec 7, 2022

hey @achingbrain I frankenstein'ed this code from a bunch of different places and it's actually just barely slower than Buffer.toString [on node] and over x2 faster than TextDecoder

// https://github.com/feross/buffer/blob/master/index.js#L1035
const MAX_ARGUMENTS_LENGTH = 0x10000

function decodeCodePointsArray (codePoints) {
  const len = codePoints.length
  if (len <= MAX_ARGUMENTS_LENGTH) {
    return String.fromCharCode(...codePoints)
  }

  let res = ''
  let i = 0
  while (i < len) {
    res += String.fromCharCode(...codePoints.slice(i, i += MAX_ARGUMENTS_LENGTH))
  }
  return res
}
// https://github.com/achingbrain/uint8arrays/issues/30#issuecomment-1199120924
// https://github.com/anonrig/undici/blob/494d27782ad2df68e2e61603914aa0a2dce0f5c8/lib/fetch/util.js#L872

function utf8Spread (buffer, start = 0, end = buffer.byteLength) {
  if (end - start < 1) {
    return ''
  }
  let arr = new Array(end - start)
  let j = 0
  for (let i = start; i < end;) {
    const t = buffer[i++]
    if (t <= 0x7F) {
      arr[j++] = t
    } else if (t >= 0xC0 && t < 0xE0) {
      arr[j++] = (t & 0x1F) << 6 | buffer[i++] & 0x3F
    } else if (t >= 0xE0 && t < 0xF0) {
      arr[j++] = (t & 0xF) << 12 | (buffer[i++] & 0x3F) << 6 | buffer[i++] & 0x3F
    } else if (t >= 0xF0) {
      const t2 = ((t & 7) << 18 | (buffer[i++] & 0x3F) << 12 | (buffer[i++] & 0x3F) << 6 | buffer[i++] & 0x3F) - 0x10000
      arr[j++] = 0xD800 + (t2 >> 10)
      arr[j++] = 0xDC00 + (t2 & 0x3FF)
    }
  }

  arr = arr.slice(0, j + 1)

  return decodeCodePointsArray(arr)
}

tested on chromium, seems like recently it has gotten some optimisations to textdecoder, anyways:

OldTextDecoder x 402,921 ops/sec ±0.65% (93 runs sampled)
NewTextDecoder x 1,326,990 ops/sec ±3.74% (59 runs sampled)
utf8Spread x 2,983,076 ops/sec ±0.92% (90 runs sampled)

EDIT: i used your benchmark code which constructed a new text decoder every time, which was the bottleneck

notably on node 18+ async is stupidly fast on big strings:

await new Response(DATA).text()

@jimmywarting
Copy link

@achingbrain Maybe you also want to re-using the same TextDecoder in your testcase?

const res = new TextDecoder().decode(DATA)

- const res = new TextDecoder().decode(DATA) 
+ const res = textDecoder.decode(DATA) 

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants