
v.2.0-proposal #3

Merged: 38 commits into master on May 28, 2016

Conversation

@BridgeAR (Member) commented May 8, 2016

This is a complete rewrite of the JS parser done by @Salakar with additional work by me.

As the hiredis parser is now likely slower in every case, I went ahead, deprecated it, and removed all hints about it from the documentation. It is still usable via the undocumented name option, but the parser will print a deprecation warning in that case.
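As an illustration only (createParser and the warning text below are hypothetical, not the actual library source), the deprecation path described above behaves roughly like this:

```javascript
// Illustrative sketch, not the real redis-parser code: the undocumented
// `name` option still selects the hiredis parser for backwards
// compatibility, but doing so prints a deprecation warning.
function createParser(options) {
  var name = options.name || 'javascript';
  if (name === 'hiredis') {
    console.warn('The hiredis parser is deprecated; the JS parser is ' +
      'now likely faster in every case.');
  }
  return { name: name };
}

var parser = createParser({ name: 'hiredis' }); // warns, but still works
```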

All errors returned by the parser are now ReplyError instances with a limited stack trace, as the full stack is not meaningful here and capturing it costs computation time.
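A minimal sketch of the stack-limiting technique, assuming V8's Error.stackTraceLimit (the class body and the limit of 2 are illustrative, not the actual ReplyError source):

```javascript
// Sketch: Error.stackTraceLimit is lowered only while the error is
// constructed, so V8 captures just a couple of frames instead of walking
// the whole stack. The limit value 2 is an assumption for illustration.
class ReplyError extends Error {
  constructor(message) {
    var limit = Error.stackTraceLimit;
    Error.stackTraceLimit = 2;      // make the stack capture cheap
    super(message);
    Error.stackTraceLimit = limit;  // restore the global default
    this.name = 'ReplyError';
  }
}

var err = new ReplyError('ERR wrong number of arguments');
// err.stack now holds the message line plus at most two frames.
```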

OLD CODE: multiple chunks in a bulk string x 57,774 ops/sec ±1.07% (91 runs sampled)
HIREDIS: multiple chunks in a bulk string x 899,793 ops/sec ±0.75% (90 runs sampled)
NEW CODE: multiple chunks in a bulk string x 1,618,283 ops/sec ±0.87% (89 runs sampled)
NEW BUF: multiple chunks in a bulk string x 850,057 ops/sec ±0.86% (93 runs sampled)

OLD CODE: multiple chunks in a string x 125,927 ops/sec ±0.91% (93 runs sampled)
HIREDIS: multiple chunks in a string x 1,730,827 ops/sec ±0.46% (95 runs sampled)
NEW CODE: multiple chunks in a string x 1,581,035 ops/sec ±0.59% (91 runs sampled)
NEW BUF: multiple chunks in a string x 1,366,036 ops/sec ±0.61% (93 runs sampled)

OLD CODE: 4mb bulk string x 248 ops/sec ±1.46% (79 runs sampled)
HIREDIS: 4mb bulk string x 362 ops/sec ±1.70% (75 runs sampled)
NEW CODE: 4mb bulk string x 897 ops/sec ±1.52% (77 runs sampled)
NEW BUF: 4mb bulk string x 1,586 ops/sec ±1.77% (90 runs sampled)

OLD CODE: + simple string x 2,546,973 ops/sec ±0.67% (90 runs sampled)
HIREDIS: + simple string x 2,312,779 ops/sec ±0.56% (94 runs sampled)
NEW CODE: + simple string x 4,537,204 ops/sec ±0.83% (91 runs sampled)
NEW BUF: + simple string x 3,016,757 ops/sec ±0.73% (92 runs sampled)

OLD CODE: + integer x 2,728,872 ops/sec ±0.64% (92 runs sampled)
HIREDIS: + integer x 2,191,210 ops/sec ±0.51% (93 runs sampled)
NEW CODE: + integer x 15,273,412 ops/sec ±0.90% (89 runs sampled)
NEW STR: + integer x 6,837,573 ops/sec ±1.06% (87 runs sampled)

OLD CODE: + big integer x 1,786,652 ops/sec ±0.60% (94 runs sampled)
HIREDIS: + big integer x 2,041,817 ops/sec ±0.45% (95 runs sampled)
NEW CODE: + big integer x 7,073,548 ops/sec ±0.99% (89 runs sampled)
NEW STR: + big integer x 3,094,097 ops/sec ±0.99% (88 runs sampled)

OLD CODE: * array x 1,294,499 ops/sec ±0.63% (94 runs sampled)
HIREDIS: * array x 941,297 ops/sec ±0.52% (96 runs sampled)
NEW CODE: * array x 4,037,489 ops/sec ±0.81% (86 runs sampled)
NEW BUF: * array x 2,635,257 ops/sec ±0.68% (90 runs sampled)

OLD CODE: * bigArray x 35,453 ops/sec ±0.74% (96 runs sampled)
HIREDIS: * bigArray x 38,992 ops/sec ±0.68% (91 runs sampled)
NEW CODE: * bigArray x 67,108 ops/sec ±0.65% (95 runs sampled)
NEW BUF: * bigArray x 39,251 ops/sec ±0.81% (92 runs sampled)

OLD CODE: * error x 170,299 ops/sec ±1.11% (85 runs sampled)
HIREDIS: * error x 88,664 ops/sec ±0.83% (92 runs sampled)
NEW CODE: * error x 179,716 ops/sec ±0.72% (92 runs sampled)

The benchmark was run on a Lenovo T450s with an i7, on Node.js 6.1.

Supersedes and closes #2

Edit: I updated the benchmark numbers and the comment about removing hiredis, as it is currently only deprecated to keep backwards compatibility.

@Salakar (Member) commented May 8, 2016

LGTM

@luin (Member) commented May 9, 2016

I ran the new parser with ioredis, and the benchmark result is impressive. Awesome work!

However, when parsing a big array, the new parser seems to be slower than hiredis:

// Note: `lorem` is a sample text string defined elsewhere in the benchmark.
// Build a RESP array reply of 100 bulk strings with random 1-10 byte values.
var bigArraySize = 100;
var bigArray = '*' + bigArraySize + '\r\n';
for (var i = 0; i < bigArraySize; i++) {
  bigArray += '$';
  var size = (Math.random() * 10 | 0) + 1;
  bigArray += size + '\r\n' + lorem.slice(0, size) + '\r\n';
}

// Split the payload into one Buffer per protocol line (re-appending the
// delimiter), so every parser receives the reply as many small chunks.
var bigArrayList = bigArray.split('\r\n').map(function (str) {
  return new Buffer(str + '\r\n');
}).slice(0, -1);

// BIG ARRAYS

suite.add('\nOLD CODE: * bigArray', function () {
  for (var i = 0; i < bigArrayList.length; i++) {
    parserOld.execute(bigArrayList[i]);
  }
})

suite.add('HIREDIS: * bigArray', function () {
  for (var i = 0; i < bigArrayList.length; i++) {
    parserHiRedis.execute(bigArrayList[i]);
  }
})

suite.add('NEW CODE: * bigArray', function () {
  for (var i = 0; i < bigArrayList.length; i++) {
    parser.execute(bigArrayList[i]);
  }
})

The result:

OLD CODE: * bigArray x 163 ops/sec ±2.73% (71 runs sampled)
HIREDIS: * bigArray x 13,199 ops/sec ±0.96% (88 runs sampled)
NEW CODE: * bigArray x 392 ops/sec ±2.97% (82 runs sampled)

@luin (Member) commented May 9, 2016

I double-checked my code above and found a mistake in it. I've updated the code and the result in the comment above. It turns out the new parser is faster than the old one, but slower than hiredis.

@BridgeAR (Member, Author) commented May 9, 2016

The array example is not a good benchmark: splitting removes all the \r\n delimiters, and it also mixes parsing the array itself with first collecting lots of small chunks.

Edit: :D
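To illustrate the chunking point, a minimal sketch (the payload contents are made up for the example):

```javascript
// Why splitting on '\r\n' skews the benchmark: re-appending the delimiter
// yields one tiny Buffer per protocol line, so the parser has to collect
// many small chunks before it can emit a single logical array reply.
var payload = '*2\r\n$3\r\nfoo\r\n$3\r\nbar\r\n'; // array of two bulk strings
var chunks = payload.split('\r\n').slice(0, -1).map(function (str) {
  return Buffer.from(str + '\r\n');
});
console.log(chunks.length); // 5 chunks for one logical reply
```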

@BridgeAR (Member, Author)

I updated the benchmark numbers and they look very promising to me 😃

I'm still wondering why NEW CODE: + simple string has dropped for me from 5 million to 4.5 million ops/sec. That is the only regression from my further improvements (which mainly target multiple chunks).

Currently I think it's difficult to get much better results than the ones we reached. @Salakar @luin do you mind doing another code review? I can move the new commits to another branch if you wish.

var bufferPool = new Buffer(64 * 1024)
var interval = null
@Salakar (Member) reviewed May 17, 2016:

Wouldn't interval and bufferPool cause inconsistencies when running more than one parser, since they are module-scoped? Perhaps move them into the constructor?

Edit: nvm, not necessary
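For context, a sketch of why a module-scoped pool can stay consistent across parser instances as long as bytes are copied out before the pool is reused (names, sizes, and the copyOut helper are assumptions, not the actual parser code):

```javascript
// The copy-out pattern that makes a shared, module-scoped pool safe: the
// pool is only scratch space, and every value handed to a caller is an
// independent copy, so two parser instances can never observe each
// other's writes through a returned buffer.
var bufferPool = Buffer.allocUnsafe(64 * 1024);

function copyOut(source, start, end) {
  var length = end - start;
  source.copy(bufferPool, 0, start, end);          // stage bytes in the pool
  return Buffer.from(bufferPool.slice(0, length)); // return a real copy
}

var copied = copyOut(Buffer.from('hello'), 1, 4);
// `copied` stays 'ell' even if the pool is overwritten afterwards.
```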

@BridgeAR (Member, Author)

PING:          30220 ->  35188 ops/sec (∆ +   4968 +  16.44%)
PING:         560116 -> 678209 ops/sec (∆ + 118093 +  21.08%)
SET 4B str:    26703 ->  29798 ops/sec (∆ +   3095 +  11.59%)
SET 4B str:   399240 -> 439404 ops/sec (∆ +  40164 +  10.06%)
SET 4B buf:    15639 ->  16471 ops/sec (∆ +    832 +   5.32%)
SET 4B buf:   168772 -> 186605 ops/sec (∆ +  17833 +  10.57%)
GET 4B str:    30062 ->  31359 ops/sec (∆ +   1297 +   4.31%)
GET 4B str:   466553 -> 551120 ops/sec (∆ +  84567 +  18.13%)
GET 4B buf:    30314 ->  31256 ops/sec (∆ +    942 +   3.11%)
GET 4B buf:   462195 -> 554638 ops/sec (∆ +  92443 +  20.00%)
SET 4KiB str:  25856 ->  26534 ops/sec (∆ +    678 +   2.62%)
SET 4KiB str: 138105 -> 140484 ops/sec (∆ +   2379 +   1.72%)
SET 4KiB buf:  17436 ->  16626 ops/sec (∆ -    810 -   4.65%)
SET 4KiB buf: 122551 -> 129688 ops/sec (∆ +   7137 +   5.82%)
GET 4KiB str:  27888 ->  29539 ops/sec (∆ +   1651 +   5.92%)
GET 4KiB str: 136985 -> 203019 ops/sec (∆ +  66034 +  48.21%)
GET 4KiB buf:  27845 ->  29819 ops/sec (∆ +   1974 +   7.09%)
GET 4KiB buf: 138705 -> 204418 ops/sec (∆ +  65713 +  47.38%)
INCR:          30662 ->  31878 ops/sec (∆ +   1216 +   3.97%)
INCR:         454738 -> 572971 ops/sec (∆ + 118233 +  26.00%)
LPUSH:         29340 ->  30960 ops/sec (∆ +   1620 +   5.52%)
LPUSH:        388105 -> 504258 ops/sec (∆ + 116153 +  29.93%)
LRANGE 10:     27032 ->  27960 ops/sec (∆ +    928 +   3.43%)
LRANGE 10:    147841 -> 218313 ops/sec (∆ +  70472 +  47.67%)
LRANGE 100:    13812 ->  18485 ops/sec (∆ +   4673 +  33.83%)
LRANGE 100:    20144 ->  31267 ops/sec (∆ +  11123 +  55.22%)
SET 4MiB str:    151 ->    167 ops/sec (∆ +     16 +  10.60%)
SET 4MiB str:    164 ->    214 ops/sec (∆ +     50 +  30.49%)
SET 4MiB buf:    342 ->    436 ops/sec (∆ +     94 +  27.49%)
SET 4MiB buf:    368 ->    457 ops/sec (∆ +     89 +  24.18%)
GET 4MiB str:    158 ->    374 ops/sec (∆ +    216 + 136.71%)
GET 4MiB str:    154 ->    188 ops/sec (∆ +     34 +  22.08%)
GET 4MiB buf:    152 ->    383 ops/sec (∆ +    231 + 151.97%)
GET 4MiB buf:    156 ->    185 ops/sec (∆ +     29 +  18.59%)

@Salakar (Member) commented May 17, 2016

As per our call, this is all good, so LGTM from me; let's release, unless @luin has any issues?

Further perf tweaks can be done in v2 minor versions if needed.

@luin (Member) commented May 17, 2016

I just ran the benchmark and it looks good to me. 👍

@Salakar (Member) commented May 17, 2016

@luin what are the perf increases like on ioredis?

@luin (Member) commented May 17, 2016

@Salakar I didn't run it with ioredis. I just ran npm run benchmark of the parser. Here's my result:

OLD CODE: multiple chunks in a bulk string x 64,813 ops/sec ±0.78% (90 runs sampled)
HIREDIS: multiple chunks in a bulk string x 853,105 ops/sec ±0.64% (92 runs sampled)
NEW CODE: multiple chunks in a bulk string x 1,294,377 ops/sec ±0.80% (93 runs sampled)
NEW BUF: multiple chunks in a bulk string x 557,270 ops/sec ±2.72% (84 runs sampled)

OLD CODE: multiple chunks in a string x 100,941 ops/sec ±3.04% (82 runs sampled)
HIREDIS: multiple chunks in a string x 1,649,325 ops/sec ±1.64% (90 runs sampled)
NEW CODE: multiple chunks in a string x 843,755 ops/sec ±0.89% (92 runs sampled)
NEW BUF: multiple chunks in a string x 485,696 ops/sec ±7.20% (64 runs sampled)

OLD CODE: 4mb bulk string x 139 ops/sec ±3.91% (71 runs sampled)
HIREDIS: 4mb bulk string x 126 ops/sec ±6.80% (61 runs sampled)
NEW CODE: 4mb bulk string x 1,115 ops/sec ±3.81% (79 runs sampled)
NEW BUF: 4mb bulk string x 2,259 ops/sec ±3.14% (85 runs sampled)

OLD CODE: + simple string x 264,635 ops/sec ±6.05% (76 runs sampled)
HIREDIS: + simple string x 2,124,393 ops/sec ±2.77% (84 runs sampled)
NEW CODE: + simple string x 549,101 ops/sec ±2.07% (90 runs sampled)
NEW BUF: + simple string x 465,949 ops/sec ±2.78% (89 runs sampled)

OLD CODE: + integer x 682,709 ops/sec ±1.79% (86 runs sampled)
HIREDIS: + integer x 2,077,508 ops/sec ±3.23% (91 runs sampled)
NEW CODE: + integer x 1,325,230 ops/sec ±0.39% (93 runs sampled)
NEW STR: + integer x 4,502,721 ops/sec ±0.82% (92 runs sampled)

OLD CODE: + big integer x 324,049 ops/sec ±0.63% (91 runs sampled)
HIREDIS: + big integer x 1,991,291 ops/sec ±2.05% (91 runs sampled)
NEW CODE: + big integer x 613,294 ops/sec ±1.74% (91 runs sampled)
NEW STR: + big integer x 2,618,784 ops/sec ±2.24% (93 runs sampled)

OLD CODE: * array x 566,616 ops/sec ±2.53% (86 runs sampled)
HIREDIS: * array x 752,228 ops/sec ±5.04% (73 runs sampled)
NEW CODE: * array x 803,021 ops/sec ±6.25% (84 runs sampled)
NEW BUF: * array x 688,835 ops/sec ±2.95% (87 runs sampled)

OLD CODE: * bigArray x 16,244 ops/sec ±0.79% (92 runs sampled)
HIREDIS: * bigArray x 43,471 ops/sec ±0.80% (94 runs sampled)
NEW CODE: * bigArray x 20,995 ops/sec ±0.62% (93 runs sampled)
NEW BUF: * bigArray x 13,931 ops/sec ±2.18% (91 runs sampled)

OLD CODE: * error x 123,363 ops/sec ±2.25% (86 runs sampled)
HIREDIS: * error x 94,347 ops/sec ±2.51% (88 runs sampled)
NEW CODE: * error x 147,233 ops/sec ±3.63% (80 runs sampled)

I think the new parser is fast enough to replace hiredis.

@Salakar (Member) commented May 22, 2016

Are we ready to go with this? 😸

Any objections to me merging and publishing a new version?

@luin, might this be a good time to fit it into ioredis v2 before its release?

@BridgeAR (Member, Author)

Not yet: there are regressions left when concatenating bulk strings. I addressed them locally, but I want to rewrite that part a bit more, and I'm currently occupied by hack 4 humanity.

@luin (Member) commented May 26, 2016

@Salakar I'm going to do more tests this weekend. If the new parser is fast enough that we don't need to introduce dropBufferSupport, I'll definitely wait for it. 😆

@BridgeAR BridgeAR merged commit 4418ac6 into master May 28, 2016