buffer: improve fill & normalizeEncoding performance #18790

BridgeAR · 2018-02-15T03:13:32Z

This improves the performance of Buffer#fill and of normalizeEncoding. The latter focuses on the common cases as can be seen in the benchmarks.

I made the Buffer.isEncoding() implementation stricter again after it was loosened in #7207. It no longer returns true for an empty string.
normalizeEncoding is now also stricter: it returns undefined for false, NaN and 0.
undefined, null and '' are still "valid" utf8 encodings.
Buffer#fill now also throws an out-of-bounds (OOB) error if end is a negative value. This makes it consistent with start and helps to identify issues, since a negative end was previously just ignored.
If the OOB condition is detected in C++, Buffer#fill now throws the error from JS, so that it contains the proper error code.
Buffer#fill now also accepts null as a valid utf8 encoding when a string is provided. That was not the case before, but we accept it in other places, so this is more consistent.
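For illustration, the behavior changes listed above can be exercised with a short script. This is a minimal sketch that assumes a Node.js release containing this change; the variable names are made up for the example, and the exact error code may differ between releases, so the script only records it rather than expecting a specific value.

```javascript
'use strict';

// null is accepted as the encoding for a string value and treated as utf8.
const filled = Buffer.alloc(4).fill('a', 0, 4, null).toString();

// A negative end is rejected instead of being silently ignored.
let negativeEndError;
try {
  Buffer.alloc(4).fill('a', 0, -1);
} catch (err) {
  negativeEndError = err;
}

// The empty string is no longer reported as a valid encoding.
const emptyIsEncoding = Buffer.isEncoding('');

console.log('filled with null encoding:', filled);
console.log('negative end error code:', negativeEndError && negativeEndError.code);
console.log("Buffer.isEncoding(''):", emptyIsEncoding);
```

On releases without this change, the first call is expected to throw and the negative end to be ignored, so the script doubles as a quick check of which behavior a given binary has.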

Buffer#fill performance

                                                                          confidence improvement accuracy (*)   (**)   (***)
 buffers/buffer-fill.js n=20000 size=10 type='fill("")'                          ***     16.25 %       ±5.16% ±6.89%  ±9.01%
 buffers/buffer-fill.js n=20000 size=10 type='fill("t", "utf8")'                 ***     17.49 %       ±3.59% ±4.78%  ±6.23%
 buffers/buffer-fill.js n=20000 size=10 type='fill("t", 0, "utf8")'              ***     14.82 %       ±4.17% ±5.55%  ±7.23%
 buffers/buffer-fill.js n=20000 size=10 type='fill("t", 0)'                      ***     21.59 %       ±4.88% ±6.50%  ±8.47%
 buffers/buffer-fill.js n=20000 size=10 type='fill("t")'                         ***     23.16 %       ±3.88% ±5.17%  ±6.73%
 buffers/buffer-fill.js n=20000 size=10 type='fill("test")'                      ***     12.85 %       ±3.91% ±5.24%  ±6.91%
 buffers/buffer-fill.js n=20000 size=10 type='fill(0)'                           ***     22.39 %       ±2.48% ±3.30%  ±4.30%
 buffers/buffer-fill.js n=20000 size=10 type='fill(100)'                         ***     24.41 %       ±4.52% ±6.02%  ±7.85%
 buffers/buffer-fill.js n=20000 size=10 type='fill(400)'                         ***     22.00 %       ±2.27% ±3.03%  ±3.95%
 buffers/buffer-fill.js n=20000 size=10 type='fill(Buffer.alloc(1), 0)'          ***      8.60 %       ±3.85% ±5.17%  ±6.84%
 buffers/buffer-fill.js n=20000 size=5000 type='fill("")'                        ***     22.84 %       ±6.16% ±8.20% ±10.68%
 buffers/buffer-fill.js n=20000 size=5000 type='fill("t", "utf8")'               ***      8.74 %       ±4.89% ±6.57%  ±8.67%
 buffers/buffer-fill.js n=20000 size=5000 type='fill("t", 0, "utf8")'            ***     10.69 %       ±4.94% ±6.60%  ±8.63%
 buffers/buffer-fill.js n=20000 size=5000 type='fill("t", 0)'                    ***     14.17 %       ±4.13% ±5.51%  ±7.20%
 buffers/buffer-fill.js n=20000 size=5000 type='fill("t")'                       ***     21.25 %       ±3.83% ±5.10%  ±6.65%
 buffers/buffer-fill.js n=20000 size=5000 type='fill("test")'                    ***     16.50 %       ±2.03% ±2.70%  ±3.52%
 buffers/buffer-fill.js n=20000 size=5000 type='fill(0)'                         ***     29.73 %       ±4.40% ±5.89%  ±7.72%
 buffers/buffer-fill.js n=20000 size=5000 type='fill(100)'                       ***     15.48 %       ±2.54% ±3.38%  ±4.40%
 buffers/buffer-fill.js n=20000 size=5000 type='fill(400)'                       ***     16.19 %       ±2.62% ±3.49%  ±4.55%
 buffers/buffer-fill.js n=20000 size=5000 type='fill(Buffer.alloc(1), 0)'         **      4.36 %       ±2.79% ±3.74%  ±4.90%

Normalize encoding performance

                                                                    confidence improvement accuracy (*)    (**)   (***)
 buffers/buffer-normalize-encoding.js n=1000000 encoding='ascii'           ***     43.44 %       ±5.20% ±7.00%  ±9.28%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='ASCII'           ***    136.89 %       ±1.27% ±1.69%  ±2.20%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='base64'          ***     66.18 %       ±2.95% ±3.93%  ±5.13%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='BASE64'          ***    150.70 %       ±2.20% ±2.94%  ±3.88%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='binary'           **      4.80 %       ±3.38% ±4.50%  ±5.86%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='BINARY'          ***     81.52 %       ±2.15% ±2.87%  ±3.73%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='hex'             ***     57.24 %       ±6.14% ±8.23% ±10.84%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='HEX'             ***    209.63 %       ±3.27% ±4.35%  ±5.67%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='latin1'          ***      7.58 %       ±1.59% ±2.12%  ±2.80%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='LATIN1'          ***     99.93 %       ±2.00% ±2.69%  ±3.55%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='ucs-2'           ***    -13.48 %       ±1.73% ±2.32%  ±3.06%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='UCS-2'           ***     70.29 %       ±1.09% ±1.46%  ±1.91%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='ucs2'            ***    -21.51 %       ±3.98% ±5.30%  ±6.92%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='UCS2'            ***    101.15 %       ±4.36% ±5.84%  ±7.68%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='utf-16le'        ***     12.98 %       ±4.21% ±5.60%  ±7.30%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='UTF-16LE'        ***    159.48 %       ±2.72% ±3.65%  ±4.79%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='utf-8'           ***      7.24 %       ±3.28% ±4.41%  ±5.84%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='UTF-8'           ***    102.20 %       ±2.58% ±3.45%  ±4.53%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='utf16le'         ***     10.11 %       ±1.42% ±1.90%  ±2.48%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='UTF16LE'         ***    148.52 %       ±3.42% ±4.57%  ±6.01%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='utf8'            ***     14.78 %       ±3.00% ±3.99%  ±5.21%
 buffers/buffer-normalize-encoding.js n=1000000 encoding='UTF8'            ***    153.65 %       ±3.99% ±5.31%  ±6.91%
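The general idea of prioritizing the common cases and then dispatching on the string length can be sketched as follows. This is not the code from lib/internal/util.js: the function names normalizeEncodingSketch and slowCasesSketch are hypothetical, and the sketch simply lowercases every slow-path input, which an optimized version would likely avoid for exact matches.

```javascript
'use strict';

// Hypothetical sketch: fast path for a missing encoding and the most
// common spellings, everything else goes through the slow path.
function normalizeEncodingSketch(enc) {
  if (enc == null || enc === 'utf8' || enc === 'utf-8') return 'utf8';
  return slowCasesSketch(enc);
}

function slowCasesSketch(enc) {
  // Non-strings such as false, NaN and 0 are not valid encodings.
  if (typeof enc !== 'string') return undefined;
  // The empty string is still treated as utf8.
  if (enc === '') return 'utf8';
  const lower = enc.toLowerCase();
  // Switching on the length keeps the number of string comparisons small.
  switch (enc.length) {
    case 3:
      if (lower === 'hex') return 'hex';
      break;
    case 4:
      if (lower === 'utf8') return 'utf8';
      if (lower === 'ucs2') return 'utf16le';
      break;
    case 5:
      if (lower === 'utf-8') return 'utf8';
      if (lower === 'ascii') return 'ascii';
      if (lower === 'ucs-2') return 'utf16le';
      break;
    case 6:
      if (lower === 'base64') return 'base64';
      if (lower === 'latin1' || lower === 'binary') return 'latin1';
      break;
    case 7:
      if (lower === 'utf16le') return 'utf16le';
      break;
    case 8:
      if (lower === 'utf-16le') return 'utf16le';
      break;
  }
  return undefined;
}

console.log(normalizeEncodingSketch('UTF-8'), normalizeEncodingSketch(0));
```

Ordering the fast path by expected frequency is a trade-off: spellings checked later, such as the lowercase ucs2 variants in the table above, can end up slower than before.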

Checklist

make -j4 test (UNIX), or vcbuild test (Windows) passes
tests and/or benchmarks are included
documentation is changed or added
commit message follows commit guidelines

Affected core subsystem(s)

buffer

jasnell

LGTM with a good CITGM run.

vsemozhetbyt · 2018-02-15T06:17:37Z

doc/api/buffer.md

@@ -1228,6 +1228,9 @@ console.log(buf1.equals(buf3));
 <!-- YAML
 added: v0.5.0
 changes:
+  - version: REPLACEME
+    pr-url: https://github.com/nodejs/node/pull/REPLACEME
+    description: Negative `end` values throw an `ERR_INDEX_OF_OUT_BOUNDS` error.


ERR_INDEX_OUT_OF_RANGE?

benjamingr

Nice work!

I think it would be interesting to add fill with larger values to the benchmark.

Actual changes LGTM

benjamingr · 2018-02-15T14:03:05Z

lib/internal/util.js

+}
+
+function slowCases(enc) {
+  switch (enc.length) {


I'm surprised this improves performance noticeably.

BridgeAR · 2018-02-15T14:18:33Z

@benjamingr the performance gain for Buffer#fill shrinks as the buffer gets bigger. The break-even point should be at about 25 KB. Above that, the filling itself will be the main time consumer.

benjamingr · 2018-02-15T14:22:26Z

@BridgeAR right, but since we're adding a benchmark that will run when this code changes for a while, I think there is value in adding a larger buffer for the test case - I suspect allocating large'ish buffers is a pretty common use case. Even if there won't be a big difference here.

BridgeAR · 2018-02-15T14:28:25Z

                                                                           confidence improvement accuracy (*)    (**)   (***)
 buffers/buffer-fill.js n=20000 size=16384 type='fill("")'                          *      7.97 %       ±7.19%  ±9.57% ±12.45%
 buffers/buffer-fill.js n=20000 size=16384 type='fill("t", "utf8")'                **     11.49 %       ±7.88% ±10.49% ±13.67%
 buffers/buffer-fill.js n=20000 size=16384 type='fill("t", 0, "utf8")'            ***     12.83 %       ±6.09%  ±8.15% ±10.70%
 buffers/buffer-fill.js n=20000 size=16384 type='fill("t", 0)'                    ***     16.12 %       ±6.62%  ±8.87% ±11.66%
 buffers/buffer-fill.js n=20000 size=16384 type='fill("t")'                       ***     13.09 %       ±5.81%  ±7.76% ±10.17%
 buffers/buffer-fill.js n=20000 size=16384 type='fill("test")'                     **      7.51 %       ±4.88%  ±6.50%  ±8.47%
 buffers/buffer-fill.js n=20000 size=16384 type='fill(0)'                         ***     16.09 %       ±7.95% ±10.60% ±13.86%
 buffers/buffer-fill.js n=20000 size=16384 type='fill(100)'                       ***     18.08 %       ±8.67% ±11.56% ±15.07%
 buffers/buffer-fill.js n=20000 size=16384 type='fill(400)'                       ***     16.70 %       ±7.64% ±10.17% ±13.25%
 buffers/buffer-fill.js n=20000 size=16384 type='fill(Buffer.alloc(1), 0)'                 1.10 %       ±7.26%  ±9.65% ±12.57%
 buffers/buffer-fill.js n=20000 size=32768 type='fill("")'                                 3.66 %       ±4.81% ±6.45% ±8.48%
 buffers/buffer-fill.js n=20000 size=32768 type='fill("t", "utf8")'                **      8.65 %       ±5.08% ±6.76% ±8.81%
 buffers/buffer-fill.js n=20000 size=32768 type='fill("t", 0, "utf8")'             **      4.75 %       ±3.31% ±4.41% ±5.76%
 buffers/buffer-fill.js n=20000 size=32768 type='fill("t", 0)'                             2.65 %       ±4.13% ±5.50% ±7.17%
 buffers/buffer-fill.js n=20000 size=32768 type='fill("t")'                         *      5.92 %       ±5.14% ±6.90% ±9.10%
 buffers/buffer-fill.js n=20000 size=32768 type='fill("test")'                             3.32 %       ±4.13% ±5.50% ±7.18%
 buffers/buffer-fill.js n=20000 size=32768 type='fill(0)'                                  3.52 %       ±4.73% ±6.29% ±8.19%
 buffers/buffer-fill.js n=20000 size=32768 type='fill(100)'                               -0.48 %       ±3.06% ±4.07% ±5.31%
 buffers/buffer-fill.js n=20000 size=32768 type='fill(400)'                         *      4.06 %       ±4.01% ±5.34% ±6.95%
 buffers/buffer-fill.js n=20000 size=32768 type='fill(Buffer.alloc(1), 0)'                -1.73 %       ±3.65% ±4.88% ±6.38%
 buffers/buffer-fill.js n=20000 size=65536 type='fill("")'                          *      1.77 %       ±1.33% ±1.78% ±2.33%
 buffers/buffer-fill.js n=20000 size=65536 type='fill("t", "utf8")'                        0.82 %       ±1.31% ±1.75% ±2.28%
 buffers/buffer-fill.js n=20000 size=65536 type='fill("t", 0, "utf8")'                    -0.97 %       ±1.56% ±2.08% ±2.72%
 buffers/buffer-fill.js n=20000 size=65536 type='fill("t", 0)'                             0.72 %       ±1.69% ±2.25% ±2.95%
 buffers/buffer-fill.js n=20000 size=65536 type='fill("t")'                               -0.19 %       ±1.64% ±2.19% ±2.87%
 buffers/buffer-fill.js n=20000 size=65536 type='fill("test")'                            -0.81 %       ±2.34% ±3.14% ±4.14%
 buffers/buffer-fill.js n=20000 size=65536 type='fill(0)'                                  1.10 %       ±2.34% ±3.12% ±4.06%
 buffers/buffer-fill.js n=20000 size=65536 type='fill(100)'                                0.40 %       ±0.87% ±1.16% ±1.51%
 buffers/buffer-fill.js n=20000 size=65536 type='fill(400)'                         *      1.41 %       ±1.39% ±1.85% ±2.41%
 buffers/buffer-fill.js n=20000 size=65536 type='fill(Buffer.alloc(1), 0)'                -0.09 %       ±1.94% ±2.58% ±3.36%
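A rough way to observe the same size dependence outside the benchmark suite is a standalone timing loop. This is only a sketch: timeFill is a hypothetical helper, not part of benchmark/buffers/buffer-fill.js, and a single unrepeated run like this gives no confidence intervals, so the numbers are indicative at best.

```javascript
'use strict';

// Hypothetical helper: times n calls of buf.fill(value) on a buffer of the
// given size and returns the average duration per call in nanoseconds.
function timeFill(size, value, n = 20000) {
  const buf = Buffer.alloc(size);
  const start = process.hrtime.bigint();
  for (let i = 0; i < n; i++) {
    buf.fill(value);
  }
  const elapsed = process.hrtime.bigint() - start;
  return Number(elapsed) / n;
}

// Same sizes as in the tables of this thread.
for (const size of [10, 5000, 16384, 32768, 65536]) {
  const nsPerCall = timeFill(size, 0);
  console.log(`size=${size} fill(0): ${nsPerCall.toFixed(1)} ns/call`);
}
```

As the buffer grows, the per-call time should be dominated by the fill itself, which is why the argument handling optimized in this PR matters less for large sizes.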

vsemozhetbyt · 2018-02-15T16:28:20Z

doc/api/buffer.md

@@ -1228,6 +1228,9 @@ console.log(buf1.equals(buf3));
 <!-- YAML
 added: v0.5.0
 changes:
+  - version: REPLACEME
+    pr-url: https://github.com/nodejs/node/pull/REPLACEME
+    description: Negative `end` values throw an `ERR_INDEX_OF_OUT_RANGE` error.


OF_OUT -> OUT_OF :)

BridgeAR · 2018-02-16T14:36:34Z

CI https://ci.nodejs.org/job/node-test-pull-request/13201/
~~CITGM https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/1294/~~
CITGM https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/1303/

mcollina

LGTM

1) This improves the performance for Buffer#fill by using shortcuts. 2) It also ports throwing errors to JS. That way they contain the proper error code. 3) Using negative `end` values will from now on result in an error instead of just doing nothing. 4) Passing in `null` as encoding is from now on accepted as 'utf8'.

This focuses on the common case by making sure they are prioritized. It also changes some typeof checks to test for undefined since that is faster and it adds a benchmark.

Due to a consolidation the isEncoding function got less strict in version 5.x.x. This commit makes sure we do not return `true` for empty strings.

BridgeAR · 2018-02-22T14:50:38Z

Rebased due to conflicts.

CI https://ci.nodejs.org/job/node-test-pull-request/13331/

BridgeAR · 2018-03-02T02:09:58Z

Landed in d3af120...452eed9

1) This improves the performance for Buffer#fill by using shortcuts. 2) It also ports throwing errors to JS. That way they contain the proper error code. 3) Using negative `end` values will from now on result in an error instead of just doing nothing. 4) Passing in `null` as encoding is from now on accepted as 'utf8'. PR-URL: nodejs#18790 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com> Reviewed-By: Matteo Collina <matteo.collina@gmail.com>

PR-URL: nodejs#18790 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com> Reviewed-By: Matteo Collina <matteo.collina@gmail.com>

This focuses on the common case by making sure they are prioritized. It also changes some typeof checks to test for undefined since that is faster and it adds a benchmark. PR-URL: nodejs#18790 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com> Reviewed-By: Matteo Collina <matteo.collina@gmail.com>

Due to code consolidation in nodejs#7207 the isEncoding function got less strict. This commit makes sure isEncoding returns false for empty strings as before the consolidation. PR-URL: nodejs#18790 Refs: nodejs#7207 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com> Reviewed-By: Matteo Collina <matteo.collina@gmail.com>

ChALkeR · 2018-08-23T21:33:18Z

@nodejs/security thoughts on preventing such changes in the future?
That even had a guard comment which also got removed by this.

My opinion is that a testcase should have been introduced at the same time when the comment was. Or, perhaps, even instead of the comment.

Upd: filed #22492.

BridgeAR added the semver-major label Feb 15, 2018

nodejs-github-bot added the c++ and lib / src labels Feb 15, 2018

BridgeAR force-pushed the buffer-fill branch 3 times, most recently from b2f5427 to f082114 on February 15, 2018 03:40

jasnell approved these changes Feb 15, 2018

vsemozhetbyt reviewed Feb 15, 2018

vsemozhetbyt added the performance label Feb 15, 2018

benjamingr approved these changes Feb 15, 2018

benjamingr reviewed Feb 15, 2018

vsemozhetbyt reviewed Feb 15, 2018

BridgeAR added the author ready label Feb 16, 2018

BridgeAR requested a review from a team February 16, 2018 14:36

mcollina approved these changes Feb 16, 2018

BridgeAR added 5 commits February 22, 2018 14:49

benchmark: add buffer fill benchmark

0b416c2

benchmark: rename file

acab8bd

lib: improve normalize encoding performance

f8249bb

This focuses on the common case by making sure they are prioritized. It also changes some typeof checks to test for undefined since that is faster and it adds a benchmark.

buffer: stricter isEncoding

c5aa244

Due to a consolidation the isEncoding function got less strict in version 5.x.x. This commit makes sure we do not return `true` for empty strings.

BridgeAR force-pushed the buffer-fill branch from 332ac6e to c5aa244 on February 22, 2018 14:50

BridgeAR closed this Mar 2, 2018

This was referenced Aug 23, 2018

buffer: add Buffer.from(), Buffer.alloc() and Buffer.allocUnsafe(), soft-deprecate Buffer(num) #4682

Closed

Add testcases for all documented safeguards #22492

Open

BridgeAR deleted the buffer-fill branch April 1, 2019 23:38

BridgeAR mentioned this pull request Dec 20, 2019

for ... of replacing for(;;) in library networking code? #31024

Closed
