use TextEncoder to encode string if available #68

Merged on Jul 8, 2019 (1 commit)

Conversation

gfx (Member) commented Jun 16, 2019

This does not affect the benchmark results because the benchmark dataset does not contain large strings, but it is much more efficient than pure JS or WASM when the input string is large.
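As a rough illustration of the idea in the title (use TextEncoder when available, otherwise fall back to pure JS), here is a minimal sketch. It is not the code from this PR: `utf8EncodeSketch`, `utf8EncodeJsSketch`, and the `Utf8Encoder` interface are hypothetical names made up for the example, and the fallback is a plain reference encoder rather than the library's optimized one.

```typescript
// Minimal structural type for the part of TextEncoder this sketch uses (hypothetical name).
interface Utf8Encoder {
  encode(input: string): Uint8Array;
}

// Feature-detect TextEncoder without assuming it is declared in this environment.
const TextEncoderCtor = (globalThis as unknown as { TextEncoder?: new () => Utf8Encoder }).TextEncoder;
const sharedEncoder: Utf8Encoder | null = TextEncoderCtor ? new TextEncoderCtor() : null;

// Hypothetical pure-JS fallback: encodes a string to UTF-8 bytes.
function utf8EncodeJsSketch(str: string): Uint8Array {
  const bytes: number[] = [];
  for (let i = 0; i < str.length; i++) {
    let cp = str.charCodeAt(i);
    if (cp >= 0xd800 && cp <= 0xdfff) {
      const lo = i + 1 < str.length ? str.charCodeAt(i + 1) : 0;
      if (cp <= 0xdbff && lo >= 0xdc00 && lo <= 0xdfff) {
        // Valid surrogate pair: combine into one code point and skip the low half.
        cp = 0x10000 + ((cp - 0xd800) << 10) + (lo - 0xdc00);
        i++;
      } else {
        // Lone surrogate: emit U+FFFD, matching what TextEncoder does.
        cp = 0xfffd;
      }
    }
    if (cp < 0x80) {
      bytes.push(cp);
    } else if (cp < 0x800) {
      bytes.push(0xc0 | (cp >> 6), 0x80 | (cp & 0x3f));
    } else if (cp < 0x10000) {
      bytes.push(0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f));
    } else {
      bytes.push(0xf0 | (cp >> 18), 0x80 | ((cp >> 12) & 0x3f), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f));
    }
  }
  return new Uint8Array(bytes);
}

// Use the native encoder when it exists, otherwise the pure-JS path.
function utf8EncodeSketch(str: string): Uint8Array {
  return sharedEncoder ? sharedEncoder.encode(str) : utf8EncodeJsSketch(str);
}
```

A single shared encoder instance is reused here so that the cost of constructing it is not paid on every call; whether the PR does the same is an assumption.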

@gfx gfx requested a review from sergeyzenchenko June 16, 2019 13:23
sergeyzenchenko (Collaborator) commented

Have you tested with different sizes?

sergeyzenchenko (Collaborator) commented

@gfx

gfx (Member, Author) commented Jun 17, 2019

Yep, as benchmark/encode-string.ts shows:

$ npx ts-node benchmark/encode-string.ts

## string "A" x 10 (byteLength=10)

utf8EncodeJs x 14,402,025 ops/sec ±8.20% (68 runs sampled)
utf8DecodeTE x 843,323 ops/sec ±19.87% (54 runs sampled)

## string "A" x 100 (byteLength=100)

utf8EncodeJs x 1,520,583 ops/sec ±2.55% (86 runs sampled)
utf8DecodeTE x 1,201,906 ops/sec ±4.48% (69 runs sampled)

## string "A" x 200 (byteLength=200)

utf8EncodeJs x 774,931 ops/sec ±1.76% (85 runs sampled)
utf8DecodeTE x 1,071,303 ops/sec ±5.44% (66 runs sampled)

## string "A" x 1000 (byteLength=1000)

utf8EncodeJs x 142,148 ops/sec ±8.10% (72 runs sampled)
utf8DecodeTE x 457,927 ops/sec ±8.05% (44 runs sampled)

## string "A" x 10000 (byteLength=10000)

utf8EncodeJs x 15,303 ops/sec ±3.46% (78 runs sampled)
utf8DecodeTE x 70,942 ops/sec ±9.21% (38 runs sampled)

## string "A" x 100000 (byteLength=100000)

utf8EncodeJs x 1,704 ops/sec ±2.56% (86 runs sampled)
utf8DecodeTE x 8,498 ops/sec ±5.34% (61 runs sampled)

## string "あ" x 10 (byteLength=30)

utf8EncodeJs x 12,520,756 ops/sec ±4.06% (85 runs sampled)
utf8DecodeTE x 1,255,161 ops/sec ±3.06% (70 runs sampled)

## string "あ" x 100 (byteLength=300)

utf8EncodeJs x 940,380 ops/sec ±7.21% (72 runs sampled)
utf8DecodeTE x 698,070 ops/sec ±4.93% (76 runs sampled)

## string "あ" x 200 (byteLength=600)

utf8EncodeJs x 570,138 ops/sec ±3.57% (88 runs sampled)
utf8DecodeTE x 152,060 ops/sec ±25.34% (29 runs sampled)

## string "あ" x 1000 (byteLength=3000)

utf8EncodeJs x 111,823 ops/sec ±8.33% (81 runs sampled)
utf8DecodeTE x 100,644 ops/sec ±10.49% (59 runs sampled)

## string "あ" x 10000 (byteLength=30000)

utf8EncodeJs x 12,831 ops/sec ±2.17% (92 runs sampled)
utf8DecodeTE x 9,405 ops/sec ±14.25% (50 runs sampled)

## string "あ" x 100000 (byteLength=300000)

utf8EncodeJs x 933 ops/sec ±14.97% (70 runs sampled)
utf8DecodeTE x 801 ops/sec ±14.61% (51 runs sampled)

## string "🌏" x 20 (byteLength=40)

utf8EncodeJs x 4,361,021 ops/sec ±23.57% (51 runs sampled)
utf8DecodeTE x 958,067 ops/sec ±8.69% (63 runs sampled)

## string "🌏" x 200 (byteLength=400)

utf8EncodeJs x 612,881 ops/sec ±6.40% (81 runs sampled)
utf8DecodeTE x 339,414 ops/sec ±12.43% (58 runs sampled)

## string "🌏" x 400 (byteLength=800)

utf8EncodeJs x 279,219 ops/sec ±10.39% (71 runs sampled)
utf8DecodeTE x 173,350 ops/sec ±14.61% (55 runs sampled)

## string "🌏" x 2000 (byteLength=4000)

utf8EncodeJs x 27,919 ops/sec ±29.91% (37 runs sampled)
utf8DecodeTE x 41,550 ops/sec ±23.79% (58 runs sampled)

## string "🌏" x 20000 (byteLength=40000)

utf8EncodeJs x 3,842 ops/sec ±22.42% (54 runs sampled)
utf8DecodeTE x 5,670 ops/sec ±4.68% (74 runs sampled)

## string "🌏" x 200000 (byteLength=400000)

utf8EncodeJs x 726 ops/sec ±1.37% (90 runs sampled)
utf8DecodeTE x 640 ops/sec ±3.54% (78 runs sampled)
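To make the ASCII results above easier to compare, the following sketch computes the ratio of TextEncoder throughput to pure-JS throughput for each size. The figures are copied from the "A" runs in the output; `BenchRow`, `asciiRows`, and `speedup` are hypothetical names used only for this illustration.

```typescript
// One row of the benchmark output: ops/sec for the pure-JS encoder and for TextEncoder.
interface BenchRow {
  label: string;
  jsOpsPerSec: number;
  teOpsPerSec: number;
}

// Figures copied from the ASCII ("A") runs in the benchmark output.
const asciiRows: BenchRow[] = [
  { label: "byteLength=10", jsOpsPerSec: 14402025, teOpsPerSec: 843323 },
  { label: "byteLength=100", jsOpsPerSec: 1520583, teOpsPerSec: 1201906 },
  { label: "byteLength=200", jsOpsPerSec: 774931, teOpsPerSec: 1071303 },
  { label: "byteLength=1000", jsOpsPerSec: 142148, teOpsPerSec: 457927 },
  { label: "byteLength=10000", jsOpsPerSec: 15303, teOpsPerSec: 70942 },
  { label: "byteLength=100000", jsOpsPerSec: 1704, teOpsPerSec: 8498 },
];

// Ratio above 1 means TextEncoder was faster for that input size.
function speedup(row: BenchRow): number {
  return row.teOpsPerSec / row.jsOpsPerSec;
}

const report = asciiRows.map((row) => `${row.label}: TextEncoder/JS = ${speedup(row).toFixed(2)}x`);
console.log(report.join("\n"));
```

For these ASCII inputs the ratio is below 1 at 10 and 100 bytes and above 1 from 200 bytes upward. The "あ" and "🌏" runs are more mixed and have wider error margins, so the same calculation there should be read with more caution.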

@gfx gfx merged commit e4cb3ce into master Jul 8, 2019
@gfx gfx deleted the text_encoder branch July 8, 2019 00:43