assert: fix loose set and map comparison #22495

Closed
wants to merge 3 commits into from

Conversation

BridgeAR
Member

The fast path did not anticipate different ways to express a loose
equal string value (e.g., 1n == '+0001'). This is now fixed with the
downside that all primitives that could theoretically have equal
entries must go through a full comparison.

Only strings (partially), symbols, undefined and null can be detected
in a fast path as those entries have a strictly limited set of possible
equal entries.
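
As a quick illustration of why a lookup-based fast path cannot cover every case, the sketch below lists several string spellings that all loosely equal the number 1. The variable names are hypothetical and chosen only for this example; it uses plain `==` and does not reproduce the actual assert internals.

```javascript
// Several different strings that all convert to the number 1 under `==`.
// (Illustrative list only; many more spellings exist.)
const spellings = ['1', '+0001', ' 1 ', '1.0', '1e0', '0x1'];

// Every spelling is loosely equal to 1 ...
const allLooselyEqual = spellings.every((s) => 1 == s);

// ... but a Set lookup uses strict (SameValueZero) semantics, so looking up
// the number itself, or any single "canonical" string, misses most of them.
const set = new Set(spellings);
const foundByLookup = set.has(1);

console.log(allLooselyEqual, foundByLookup); // true false
```

Because the set of matching spellings is open-ended, the only reliable way to find a loosely equal partner for such a primitive is to compare it against the entries themselves.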

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

@nodejs-github-bot nodejs-github-bot added the util Issues and PRs related to the built-in util module. label Aug 24, 2018
// type is a string, number, bigint or boolean. The reason is that those values
// can match lots of different string values (e.g., 1n == '+00001').
function findLooseMatchingPrimitives(prim) {
switch (typeof prim) {
Contributor

Are you sure this doesn't cause performance issues still?

Member Author

@BridgeAR BridgeAR Aug 24, 2018

It does, significantly, for loose comparison of any primitive keys other than null, undefined, symbols, and strings that are not loosely equal to any other value.

For strings as primitives that are not loosely equal to numbers:
(A small performance increase)

 assert/deepequal-set.js method='deepEqual_mixed' strict=0 len=500 n=500                           -0.38 %       ±0.93% ±1.24% ±1.62%
 assert/deepequal-set.js method='deepEqual_mixed' strict=1 len=500 n=500                   ***      5.23 %       ±1.89% ±2.53% ±3.30%
 assert/deepequal-set.js method='deepEqual_objectOnly' strict=0 len=500 n=500                      -0.32 %       ±1.25% ±1.67% ±2.19%
 assert/deepequal-set.js method='deepEqual_objectOnly' strict=1 len=500 n=500                       0.89 %       ±2.38% ±3.19% ±4.22%
 assert/deepequal-set.js method='deepEqual_primitiveOnly' strict=0 len=500 n=500                   -0.05 %       ±1.81% ±2.41% ±3.14%
 assert/deepequal-set.js method='deepEqual_primitiveOnly' strict=1 len=500 n=500                   -0.04 %       ±2.02% ±2.70% ±3.53%
 assert/deepequal-set.js method='notDeepEqual_mixed' strict=0 len=500 n=500                ***      2.59 %       ±0.99% ±1.32% ±1.72%
 assert/deepequal-set.js method='notDeepEqual_mixed' strict=1 len=500 n=500                         2.00 %       ±2.38% ±3.17% ±4.12%
 assert/deepequal-set.js method='notDeepEqual_objectOnly' strict=0 len=500 n=500                   -0.25 %       ±0.84% ±1.12% ±1.46%
 assert/deepequal-set.js method='notDeepEqual_objectOnly' strict=1 len=500 n=500                   -0.34 %       ±2.01% ±2.68% ±3.49%
 assert/deepequal-set.js method='notDeepEqual_primitiveOnly' strict=0 len=500 n=500        ***      4.24 %       ±1.89% ±2.52% ±3.28%
 assert/deepequal-set.js method='notDeepEqual_primitiveOnly' strict=1 len=500 n=500                 0.57 %       ±3.65% ±4.87% ±6.37%

For numbers as primitives:
(A significant performance loss for loose not equal checks)

 assert/deepequal-set.js method='deepEqual_mixed' strict=0 len=500 n=500                     *     -4.01 %       ±3.39% ±4.60% ±6.18%
 assert/deepequal-set.js method='deepEqual_mixed' strict=1 len=500 n=500                   ***      4.35 %       ±1.94% ±2.61% ±3.44%
 assert/deepequal-set.js method='deepEqual_primitiveOnly' strict=0 len=500 n=500             *      5.06 %       ±4.38% ±5.93% ±7.91%
 assert/deepequal-set.js method='deepEqual_primitiveOnly' strict=1 len=500 n=500                    0.48 %       ±5.28% ±7.08% ±9.31%
 assert/deepequal-set.js method='notDeepEqual_mixed' strict=0 len=500 n=500                ***    -87.74 %       ±3.30% ±4.49% ±6.05%
 assert/deepequal-set.js method='notDeepEqual_mixed' strict=1 len=500 n=500                        -0.52 %       ±2.56% ±3.44% ±4.55%
 assert/deepequal-set.js method='notDeepEqual_primitiveOnly' strict=0 len=500 n=500        ***    -39.88 %       ±4.70% ±6.31% ±8.34%
 assert/deepequal-set.js method='notDeepEqual_primitiveOnly' strict=1 len=500 n=500                -2.68 %       ±3.50% ±4.72% ±6.28%

Member Author

@BridgeAR BridgeAR Aug 24, 2018

I tried another approach to overcome the downside, but it is simply not possible to be absolutely sure that there is no other loosely equal entry.

Now a primitive that could match something else has to go through all entries at least once. Previously, it stopped as soon as the entry was found to have no corresponding entry.
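
A minimal sketch of that full scan, assuming hypothetical helper names (`isPrimitive`, `hasLooseMatch`) that are not taken from the PR: a negative answer is only known after every entry has been visited.

```javascript
// Hypothetical helper: is `value` a primitive (including null)?
function isPrimitive(value) {
  return value === null ||
    (typeof value !== 'object' && typeof value !== 'function');
}

// Hypothetical helper: does `set` contain a primitive loosely equal to `prim`?
function hasLooseMatch(set, prim) {
  // An identical entry can be found with a cheap lookup.
  if (set.has(prim)) return true;
  // Otherwise any entry might be a different spelling of the same value,
  // so each primitive entry has to be compared with `==`.
  for (const entry of set) {
    if (isPrimitive(entry) && entry == prim) return true;
  }
  // "No match" is only certain once the whole set has been scanned.
  return false;
}

console.log(hasLooseMatch(new Set(['a', '+0001']), 1)); // true
console.log(hasLooseMatch(new Set(['a', 'b']), 1));     // false
```

The second call is the expensive one: a failing comparison cannot exit early, which is consistent with the loss showing up mainly in the loose notDeepEqual benchmarks.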

Contributor

What I was referring to was specifically the use of switch (typeof prim) vs. an if-else ladder. I'm thinking V8 might still not optimize well when typeof is used in this way, because it's being treated as a variable instead of a direct comparison?
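
For readers unfamiliar with the two styles being compared, here is a sketch of the same dispatch written both ways. The function names and the 'fast' / 'slow' labels are hypothetical and simplified; the sketch makes no claim about how V8 optimizes either form.

```javascript
// Variant A: dispatch on the result of `typeof` with a switch statement.
function classifyWithSwitch(value) {
  switch (typeof value) {
    case 'undefined':
    case 'symbol':
      return 'fast';
    case 'string':
      // Strings that do not convert to a number cannot match a number.
      return Number.isNaN(+value) ? 'fast' : 'slow';
    default:
      return 'slow';
  }
}

// Variant B: the same logic as an if-else ladder with direct comparisons.
function classifyWithLadder(value) {
  if (typeof value === 'undefined' || typeof value === 'symbol') {
    return 'fast';
  } else if (typeof value === 'string') {
    return Number.isNaN(+value) ? 'fast' : 'slow';
  } else {
    return 'slow';
  }
}

const samples = [undefined, Symbol('s'), 'abc', '12', 1, true, 1n];
console.log(samples.map(classifyWithSwitch));
```

Both variants return the same label for every sample, so any difference between them is purely a matter of engine optimization.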

Member Author

Seems like there is a tiny difference. I don't think it's significant enough that we should refactor the code. Instead, V8 should just improve it and we'll benefit from it as soon as that lands in Node.

@BridgeAR
Member Author

@nodejs/util PTAL

CI https://ci.nodejs.org/job/node-test-pull-request/16760/

Member

@benjamingr benjamingr left a comment

I totally missed this in the original!

I'm also not a fan of the regression, but given where we started and the fact that this is a bug fix, I recommend we land this ASAP and talk about optimizing later.

@Trott Trott added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Aug 26, 2018
@BridgeAR
Member Author

@benjamingr this code has been in core since 8.x.

After thinking about it again, a fast path is probably still possible by reusing the former approach in a similar way:

If no strictly matching entry is found, check for "typical" loosely equal entries and make sure the candidate is not already in the cache (a primitive can exist only once in a set or as a map key). If one exists, cache it. If none exists, search the whole other set / map for a loosely equal entry. If there is none, fail. Otherwise, add it to the cache and continue.

This approach allows performance closer to the original in all "simple" cases (1 == true) and is worse if all entries are weird string numbers (e.g., '-0000.0' == 0). However, I am not convinced we should add a fast path for loose equality, as it's still fast enough and no one should use it anyway. The implementation would just be more complex.
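
The idea can be sketched roughly as follows. This is not code from the PR: `typicalCandidates`, `findPartner` and `looseSetMatch` are hypothetical names, the sets are assumed to contain only primitives, and the matching is greedy, so unusual inputs can be rejected even though a valid pairing exists.

```javascript
// Hypothetical helper: a few "typical" values that loosely equal `prim`.
function typicalCandidates(prim) {
  const out = [];
  if (typeof prim === 'number') {
    if (!Number.isNaN(prim)) out.push(String(prim)); // NaN equals nothing.
    if (prim === 1) out.push(true);
    if (prim === 0) out.push(false, '');
  } else if (typeof prim === 'boolean') {
    out.push(prim ? 1 : 0, prim ? '1' : '0');
  } else if (typeof prim === 'string' && !Number.isNaN(+prim)) {
    out.push(+prim);
  }
  return out;
}

// Returns { value } for an unused partner of `prim` in `b`, or null.
function findPartner(prim, b, used) {
  // 1. Cheap lookups: the identical value, then the "typical" spellings.
  for (const candidate of [prim, ...typicalCandidates(prim)]) {
    if (b.has(candidate) && !used.has(candidate)) return { value: candidate };
  }
  // 2. Slow path: scan all entries for a "weird" loosely equal value.
  for (const entry of b) {
    if (!used.has(entry) && entry == prim) return { value: entry };
  }
  return null;
}

// `used` is the cache of entries in `b` that already have a partner.
function looseSetMatch(a, b) {
  if (a.size !== b.size) return false;
  const used = new Set();
  for (const prim of a) {
    const partner = findPartner(prim, b, used);
    if (partner === null) return false;
    used.add(partner.value);
  }
  return true;
}

console.log(looseSetMatch(new Set([1, 0]), new Set([true, '-0000.0']))); // true
console.log(looseSetMatch(new Set([1, 2]), new Set(['1', true])));       // false
```

In the first call, 1 is paired with true through a cheap lookup, while 0 only finds '-0000.0' through the full scan, which mirrors the "simple" versus "weird string number" split described above.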

Member

@jdalton jdalton left a comment

Reviewing less on implementation and more on philosophy. I'm 👍 on fixing the bug first and then following up on perf, because getting the wrong result quickly isn't helpful. In the context of comparisons, I can also see this case, loose map/set comparisons, carrying a reasonable expectation of being slower than the stricter forms.

case 'string':
const number = +prim;
if (Number.isNaN(number)) {
return false;
Member

@jdalton jdalton Aug 27, 2018

☝️ might pluck the Number.isNaN reference above.
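
For context on the snippet under review, the following sketch shows why a string that converts to NaN can be handled without a full comparison: it cannot loosely equal any number or boolean. The helper name `mayMatchNumber` is hypothetical and not part of the PR.

```javascript
// Hypothetical helper: could `str` loosely equal some number or boolean?
// `==` converts the string with the same rules as unary plus, and NaN is
// not equal to anything, so a NaN result rules out every such match.
function mayMatchNumber(str) {
  return !Number.isNaN(+str);
}

console.log(mayMatchNumber('abc'));     // false
console.log(mayMatchNumber('-0000.0')); // true
console.log('abc' == 0, 'abc' == true); // false false
```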

Member Author

Comment addressed.

@@ -387,12 +387,10 @@ function findLooseMatchingPrimitives(prim) {
     case 'symbol':
       return false;
     case 'string':
-      const number = +prim;
+      prim = +prim;
Member

Not a huge fan of this change (makes it harder to follow IMO) but still LGTM on the PR.

Member Author

Shall I change it back?

Member

No strong feelings - I'm just not a fan of this sort of assignment since it takes another extra step to follow - but you can absolutely land as is if you want.

BridgeAR added a commit to BridgeAR/node that referenced this pull request Sep 4, 2018
The fast path did not anticipate different ways to express a loose
equal string value (e.g., 1n == '+0001'). This is now fixed with the
downside that all primitives that could theoretically have equal
entries must go through a full comparison.

Only some strings, symbols, undefined, null and NaN can be detected
in a fast path as those entries have a strictly limited set of
possible equal entries.

PR-URL: nodejs#22495
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: John-David Dalton <john.david.dalton@gmail.com>
@BridgeAR
Member Author

BridgeAR commented Sep 4, 2018

Landed in be5e396 🎉

@BridgeAR BridgeAR closed this Sep 4, 2018
targos pushed a commit that referenced this pull request Sep 5, 2018
targos pushed a commit that referenced this pull request Sep 6, 2018
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. util Issues and PRs related to the built-in util module.
6 participants