buffer: optimize byteLength for short strings #54345

Merged on Aug 14, 2024 (1 commit)

Conversation

ronag (Member) commented Aug 12, 2024

Benchmark results on M2:

                                                                                          confidence improvement accuracy (*)    (**)    (***)
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='utf8' type='four_bytes'                  5.08 %      ±27.63% ±43.80%  ±75.96%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='utf8' type='one_byte'                    9.92 %      ±43.25% ±67.27% ±112.98%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='utf8' type='three_bytes'                12.47 %      ±31.61% ±49.92%  ±86.02%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='utf8' type='two_bytes'                  34.02 %      ±52.38% ±76.67% ±116.30%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='utf8' type='four_bytes'          **     82.80 %      ±52.56% ±76.65% ±115.59%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='utf8' type='one_byte'                    5.31 %      ±48.31% ±70.43% ±106.17%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='utf8' type='three_bytes'         **     60.17 %      ±34.55% ±50.81%  ±77.68%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='utf8' type='two_bytes'           **     68.52 %      ±32.32% ±48.53%  ±76.85%

ronag (Member Author) commented Aug 12, 2024

@lemire

@ronag ronag requested a review from lemire August 12, 2024 21:10
@nodejs-github-bot nodejs-github-bot added buffer Issues and PRs related to the buffer subsystem. c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. labels Aug 12, 2024
@ronag ronag force-pushed the bytelength-int64 branch 5 times, most recently from 55227cf to 6506c2a Compare August 12, 2024 21:21
ronag (Member Author) commented Aug 12, 2024

@lemire couldn't utf8_length_from_latin1 do this?

codecov bot commented Aug 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.09%. Comparing base (298ff4f) to head (e204e9e).
Report is 436 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #54345      +/-   ##
==========================================
- Coverage   87.09%   87.09%   -0.01%     
==========================================
  Files         647      648       +1     
  Lines      181816   182234     +418     
  Branches    34884    34964      +80     
==========================================
+ Hits       158360   158717     +357     
- Misses      16764    16791      +27     
- Partials     6692     6726      +34     
Files with missing lines Coverage Δ
src/node_buffer.cc 75.55% <100.00%> (+0.36%) ⬆️

... and 53 files with indirect coverage changes

lemire (Member) commented Aug 13, 2024

@ronag

> couldn't utf8_length_from_latin1 do this?

Fair question.

Please see simdutf/simdutf#499

lemire (Member) commented Aug 13, 2024

Long term, it could pay to make simdutf faster in these cases, because we have the runtime dispatching in place. But this PR, as is, should be fine.
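For readers following the `utf8_length_from_latin1` question, here is a minimal scalar sketch of the quantity being computed. It is not the simdutf or Node.js implementation; the function name `Utf8LengthFromLatin1Scalar` is hypothetical. It relies only on the fact that Latin-1 code points 0x00 to 0x7F encode to one UTF-8 byte and 0x80 to 0xFF encode to two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper for illustration; not taken from simdutf or Node.js.
// The UTF-8 length of a Latin-1 string is its byte length plus the number
// of bytes with the high bit set, since each of those needs two UTF-8 bytes.
inline std::size_t Utf8LengthFromLatin1Scalar(const char* input,
                                              std::size_t length) {
  assert(input != nullptr || length == 0);
  const std::uint8_t* data = reinterpret_cast<const std::uint8_t*>(input);
  std::size_t answer = length;
  for (std::size_t i = 0; i < length; i++) {
    answer += data[i] >> 7;  // adds 1 exactly when the byte is >= 0x80
  }
  return answer;
}
```

For a short input the whole cost is this one loop, which is presumably why a call that goes through runtime dispatching can lose to an inlined version on short strings.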

@ronag ronag force-pushed the bytelength-int64 branch from 6506c2a to a608ae1 Compare August 13, 2024 06:14
ronag added a commit to nxtedition/node that referenced this pull request Aug 13, 2024
@ronag ronag force-pushed the bytelength-int64 branch from a608ae1 to 110e88b Compare August 13, 2024 06:17
ronag added a commit to nxtedition/node that referenced this pull request Aug 13, 2024
@ronag ronag force-pushed the bytelength-int64 branch from 110e88b to b97e1f1 Compare August 13, 2024 06:20
ronag added a commit to nxtedition/node that referenced this pull request Aug 13, 2024
@ronag ronag force-pushed the bytelength-int64 branch from b97e1f1 to 0fedfdb Compare August 13, 2024 06:21
ronag added a commit to nxtedition/node that referenced this pull request Aug 13, 2024
@ronag ronag force-pushed the bytelength-int64 branch from 0fedfdb to 1d66f25 Compare August 13, 2024 06:23
ronag added a commit to nxtedition/node that referenced this pull request Aug 13, 2024
@ronag ronag force-pushed the bytelength-int64 branch from 1d66f25 to 6e3ab46 Compare August 13, 2024 06:23
@ronag ronag added the request-ci Add this label to start a Jenkins CI on a PR. label Aug 13, 2024
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Aug 13, 2024
ronag added a commit to nxtedition/node that referenced this pull request Aug 13, 2024
@ronag ronag force-pushed the bytelength-int64 branch from a9325a3 to dc6d0a4 Compare August 13, 2024 13:18
@ronag ronag requested a review from benjamingr August 13, 2024 13:19
@ronag ronag added the request-ci Add this label to start a Jenkins CI on a PR. label Aug 13, 2024
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Aug 13, 2024
Comment on lines +769 to +779
  for (; i + 32 <= length; i += 32) {
    uint64_t v;
    memcpy(&v, input + i, 8);
    answer += pop(v);
    memcpy(&v, input + i + 8, 8);
    answer += pop(v);
    memcpy(&v, input + i + 16, 8);
    answer += pop(v);
    memcpy(&v, input + i + 24, 8);
    answer += pop(v);
  }
BridgeAR (Member) commented Aug 13, 2024

Is this really improving the performance? I am surprised that unrolling the loop is beneficial compared to just the lines right afterwards. I would expect the compiler to optimize things like that.

ronag (Member Author)

I just took this from simdutf. @lemire wdyt?
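To make the loop under review easier to reason about, here is a self-contained sketch of the unrolled form next to a plain 8-bytes-per-iteration form. The definition of `pop` is not shown in the quoted lines, so `CountHighBitBytes` below is an assumption: it counts the bytes of a 64-bit word whose high bit is set. The function names are hypothetical. The sketch only shows that both forms compute the same result; it says nothing about which is faster, which would need a benchmark on the target compiler and CPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed meaning of `pop`: number of bytes in `v` with the high bit set.
inline std::size_t CountHighBitBytes(std::uint64_t v) {
  // Shift each byte's high bit down to bit 0, mask to 0/1 per byte, then
  // multiply so the top byte holds the sum of all eight bytes (at most 8).
  return static_cast<std::size_t>(
      (((v >> 7) & UINT64_C(0x0101010101010101)) *
       UINT64_C(0x0101010101010101)) >> 56);
}

// Unrolled form, mirroring the quoted loop: 32 bytes per iteration.
inline std::size_t Utf8LengthUnrolled(const char* input, std::size_t length) {
  assert(input != nullptr || length == 0);
  std::size_t answer = length;
  std::size_t i = 0;
  std::uint64_t v;
  for (; i + 32 <= length; i += 32) {
    std::memcpy(&v, input + i, 8);
    answer += CountHighBitBytes(v);
    std::memcpy(&v, input + i + 8, 8);
    answer += CountHighBitBytes(v);
    std::memcpy(&v, input + i + 16, 8);
    answer += CountHighBitBytes(v);
    std::memcpy(&v, input + i + 24, 8);
    answer += CountHighBitBytes(v);
  }
  for (; i + 8 <= length; i += 8) {
    std::memcpy(&v, input + i, 8);
    answer += CountHighBitBytes(v);
  }
  for (; i < length; i++) {
    answer += static_cast<std::uint8_t>(input[i]) >> 7;  // byte-wise tail
  }
  return answer;
}

// Plain form: the same computation without the 32-byte unrolled loop.
inline std::size_t Utf8LengthPlain(const char* input, std::size_t length) {
  assert(input != nullptr || length == 0);
  std::size_t answer = length;
  std::size_t i = 0;
  std::uint64_t v;
  for (; i + 8 <= length; i += 8) {
    std::memcpy(&v, input + i, 8);
    answer += CountHighBitBytes(v);
  }
  for (; i < length; i++) {
    answer += static_cast<std::uint8_t>(input[i]) >> 7;
  }
  return answer;
}
```

The `memcpy` calls are the usual way to do an unaligned 8-byte load without undefined behavior; compilers typically turn each into a single load instruction.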

nodejs-github-bot commented Aug 14, 2024

@ronag ronag added the commit-queue Add this label to land a pull request using GitHub Actions. label Aug 14, 2024
@ronag ronag force-pushed the bytelength-int64 branch from dc6d0a4 to e204e9e Compare August 14, 2024 08:37
nodejs-github-bot commented Aug 14, 2024

@ronag ronag requested a review from mcollina August 14, 2024 14:16
mcollina (Member) left a comment

lgtm

@nodejs-github-bot nodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Aug 14, 2024
@nodejs-github-bot nodejs-github-bot merged commit 75d25bc into nodejs:main Aug 14, 2024
49 checks passed
Landed in 75d25bc

RafaelGSS pushed a commit that referenced this pull request Aug 19, 2024
PR-URL: #54345
Reviewed-By: Daniel Lemire <daniel@lemire.me>
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
@RafaelGSS RafaelGSS mentioned this pull request Aug 19, 2024
RafaelGSS pushed a commit that referenced this pull request Aug 21, 2024
PR-URL: #54345
Reviewed-By: Daniel Lemire <daniel@lemire.me>
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
@targos targos added the dont-land-on-v20.x PRs that should not land on the v20.x-staging branch and should not be released in v20.x. label Sep 21, 2024
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. buffer Issues and PRs related to the buffer subsystem. c++ Issues and PRs that require attention from people who are familiar with C++. dont-land-on-v20.x PRs that should not land on the v20.x-staging branch and should not be released in v20.x. needs-ci PRs that need a full CI run.
7 participants