Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: test: force garbage collection on crypto timing test #8203

Closed
wants to merge 7 commits into from

Conversation

Trott
Copy link
Member

@Trott Trott commented Aug 21, 2016

Checklist
  • make -j4 test (UNIX), or vcbuild test nosign (Windows) passes
  • commit message follows commit guidelines
Affected core subsystem(s)

test crypto

Description of change

Garbage collection interferes with timing statistics gathering so force
it before beginning the data collection.

Work in progress. Do not merge.

Garbage collection interferes with timing statistics gathering so force
it before beginning the data collection.
@Trott Trott added wip Issues and PRs that are still a work in progress. crypto Issues and PRs related to the crypto subsystem. test Issues and PRs related to the tests. labels Aug 21, 2016
@Trott Trott mentioned this pull request Aug 21, 2016
4 tasks
@Trott
Copy link
Member Author

Trott commented Aug 21, 2016

@not-an-aardvark
Copy link
Contributor

This is mostly what I had in mind, but I was thinking GC would take place before each benchmark rather than before each set of 10000 benchmarks. That might yield more consistent results (although it would cause the test to take awhile).

@Trott
Copy link
Member Author

Trott commented Aug 21, 2016

@not-an-aardvark Well, the way I did it certainly didn't fix the flakiness, so let's try it again. I've moved the global.gc() so it will run before every single benchmark....let's try agian...

CI: https://ci.nodejs.org/job/node-test-pull-request/3771/

@Trott
Copy link
Member Author

Trott commented Aug 21, 2016

Oh my, yes, unsurprisingly, this slows the test waaaay down... times out now...

Might have to try to find a happy medium or something....

@not-an-aardvark
Copy link
Contributor

What's the time limit for the tests? We could reduce the number of trials if necessary; I don't think 10000 trials are necessary to get significant results.

@Trott
Copy link
Member Author

Trott commented Aug 21, 2016

@not-an-aardvark I believe it's 60 seconds (longer on Raspberry Pi devices, though).

@not-an-aardvark
Copy link
Contributor

Changing numTrials to 1000 works locally for me and only takes a couple seconds to run, so maybe we could try that.

@Trott
Copy link
Member Author

Trott commented Aug 21, 2016

Yeah, I just did the same experiment. 8 seconds. Let's try it...

CI: https://ci.nodejs.org/job/node-test-pull-request/3772/

@Trott
Copy link
Member Author

Trott commented Aug 21, 2016

Still flaky unfortunately. Two of the Linux hosts (so far) have failed...

@not-an-aardvark
Copy link
Contributor

I might be looking in the wrong place but it seems like the linux-fips and linux-ppc failures are from the previous run where the test timed out. The tests for the current build are still going and haven't failed yet.

@Trott
Copy link
Member Author

Trott commented Aug 21, 2016

Yeah, I don't look at the widget when running a bunch of tests in rapid succession like this. Here are the tests I'm seeing failing:

https://ci.nodejs.org/job/node-test-commit-linux/4741/nodes=centos6-64/
https://ci.nodejs.org/job/node-test-commit-linux/4741/nodes=ubuntu1604_docker_alpine34-64/

Interestingly, though, those are the only two so far. And that's a much better ratio than we've been seeing. So maybe this is getting close...

@Trott
Copy link
Member Author

Trott commented Aug 21, 2016

Windows failures are Jenkins/build-related and unrelated to the test...other failures have come in though...not sure the gc() helps but happy to try variations to try to make sure...

@not-an-aardvark
Copy link
Contributor

Out of interest, what happens if you run the benchmarks in the other order? i.e. swap lines 66 and 68 here

I've noticed that the t-values have been postive numbers for the most part; if the issue is caused by memory/other external features unrelated to the buffers, then the t-value should flip its sign because the "unequal" benchmark will be affected in the same way as the "equal" benchmark currently is.

@Trott
Copy link
Member Author

Trott commented Aug 21, 2016

Swapped the two lines, re-running CI: https://ci.nodejs.org/job/node-test-pull-request/3773/

@not-an-aardvark
Copy link
Contributor

The tests failed again as expected, but this time all the t-values (see here, here, and here) were negative.

That's good news; it means that in the flaky cases, the average time is higher for whatever comparison comes first in the for loop. This would indicate that the issue is with the tests and is not a security issue with the crypto.timingSafeEqual itself.

@not-an-aardvark
Copy link
Contributor

Could we add this right before the return statement in getTValue?

  if (Math.abs((equalMean - unequalMean) / standardErr) > T_THRESHOLD && compareFunc === crypto.timingSafeEqual) {
    console.log({
      equalMean,
      unequalMean,
      equalLen,
      unequalLen,
      equalStd: standardDeviation(equalBenches),
      unequalStd: standardDeviation(unequalBenches),
      combinedStd,
      standardErr,
      t: (equalMean - unequalMean) / standardErr,
      rawEqualBenches: JSON.stringify(rawEqualBenches),
      rawUnequalBenches: JSON.stringify(rawUnequalBenches)
    });
  }

This will print out all the raw benchmarks if the test fails. I'm interested to see whether the benchmarks are consistently higher for one set, or whether there are just a few benchmarks that are affecting the stats by being different from all the others.

(Sorry about having to keep nagging you to run things on the CI server; this is the type of debugging that would be better done locally if I was able to reproduce the flakiness.)

@Trott
Copy link
Member Author

Trott commented Aug 21, 2016

Added the logging, new CI: https://ci.nodejs.org/job/node-test-pull-request/3777/

(Sorry about having to keep nagging you to run things on the CI server; this is the type of debugging that would be better done locally if I was able to reproduce the flakiness.)

Thank you for your patience with this.

@Trott
Copy link
Member Author

Trott commented Aug 21, 2016

Got a few failures already with the loggin. Here's one: https://ci.nodejs.org/job/node-test-commit-linux/4748/nodes=centos6-64/console

@Trott
Copy link
Member Author

Trott commented Aug 21, 2016

Occurs to me that CI results get wiped after a certain period of time, so for posterity:

# { equalMean: 10534.359,
#   unequalMean: 9611.608,
#   equalLen: 1000,
#   unequalLen: 1000,
#   equalStd: 1079.6274920733736,
#   unequalStd: 1194.2702817376994,
#   combinedStd: 1138.392952253853,
#   standardErr: 50.91048052692576,
#   t: 18.124971331040012,
#   rawEqualBenches: '[10850,12524,11090,11015,11092,11107,10980,11052,11167,11142,11168,11055,17747,10948,11225,11090,10970,10965,12068,11193,11080,11028,11155,11004,11077,11019,11032,11025,10991,11363,11097,15266,11090,11082,11159,11358,11008,11004,10937,11049,10938,11101,10856,11049,10956,14015,11087,11045,10959,11007,10942,10981,11054,10990,11152,11081,11039,11045,10994,11043,10908,10993,11222,10957,11594,10807,10410,10248,10319,10319,10380,10242,10404,10263,10296,10273,10249,15566,10501,10304,10283,10256,10266,10349,10264,10320,10223,10292,10221,10251,10381,10239,10502,10269,10247,10294,10333,10278,10293,10344,10351,10334,10244,10238,10293,10342,10269,10282,10315,10242,10311,10264,10237,10328,10251,10321,10326,10213,10633,10757,10342,10404,10269,10639,10675,11590,10298,10256,10420,10375,10347,10241,10373,10218,10276,10224,10244,10612,10552,10279,10314,10358,10250,10381,10334,10479,10232,10299,10288,10271,10390,10472,10299,10187,10915,10315,10411,10296,10329,10257,10323,10438,10257,10411,10328,10231,10232,10270,10261,10267,10337,10254,10266,10279,10281,10303,10263,10244,10343,10332,10237,10319,10182,10208,10230,10355,10278,10331,10298,10293,10218,10302,10289,10238,10294,10283,10365,10257,10231,10387,16847,10275,10276,10240,10282,10258,10420,10297,10335,10358,10313,10282,10263,10315,10337,10285,10309,10247,10308,10262,10288,10296,10290,10252,10282,10238,10308,10261,10248,10247,10277,16978,10251,10354,10218,10241,10278,10238,10245,10222,10249,10204,10218,10361,10259,10324,10283,10227,16680,10386,10493,10247,10321,10301,10276,10337,10326,10399,10324,10238,10367,10396,10318,10310,10249,10301,10240,10326,10284,10228,10341,10270,10351,10347,10313,10340,10319,10292,10225,10313,10285,10299,10254,10300,10215,10289,10343,10341,10257,10288,10216,10294,10283,10322,10322,10330,10236,10309,10288,10273,10271,10188,10263,10237,10312,10312,10250,10360,10276,17029,10317,10276,10310,10356,10242,10369,10322,10338,10253,10254,10337,10299,10308,10251,16819,10303,10290,10372,10262,10323,10294,10323,10221,10311,10340,10223,10260,10269,10308,10232,10203,10338,17036,10251,10380,10205,10292,10307,10203,10272,10230,10286,10263,10280,10212,10317,10252,10262,10254,10322,10228,10357,10228,10233,10252,10226,10325,10270,10235,10348,10385,10427,10278,10344,10250,10263,10222,10278,10284,10298,10269,10288,10273,10302,10331,10267,10321,10335,10295,10369,10248,10326,10230,10266,10215,10298,10295,10213,10289,10274,10330,10358,10292,10318,10294,10290,10247,10277,10281,10373,10294,10347,10237,10265,10535,10282,10199,10196,10278,10272,10222,10246,16974,10307,10285,10245,10295,10280,10305,10391,10182,10358,10260,10326,10313,10322,10227,10333,10259,10272,10225,10264,10319,10290,10947,10348,10397,10272,10212,10308,10323,10285,10355,10253,10211,10317,10345,10419,10270,10283,10713,10627,10344,10242,10346,10236,10288,10283,10666,10368,10323,10280,10343,10305,10272,10321,10275,10280,10535,10350,10266,10249,10217,10253,10363,10333,10277,10299,10323,10257,10165,10323,10280,10330,10519,10437,10282,10292,10277,10254,10365,10224,10245,10324,10225,10236,10295,10246,10312,10354,10309,10298,10242,10404,10245,10275,10324,10287,10326,10335,10323,10431,10242,10398,10230,10232,10230,10248,10291,10262,10223,10280,10275,10203,10290,10209,10297,10309,10299,10240,10279,10192,10303,10304,17120,10195,10222,10299,10341,10327,10281,10252,10369,10438,10288,10288,10285,10316,10282,10300,10294,10280,10290,10269,10285,10278,16668,10263,10256,10203,10365,10358,10265,10229,10284,10240,10257,10340,10268,10783,10292,10393,10356,10274,10307,10384,10274,10382,10239,10343,10362,10225,10334,10285,10298,10302,10558,10291,10345,10200,10265,10266,10210,10262,10282,10293,10286,10220,10304,10202,10224,10317,10302,10238,17083,10265,10263,10221,10322,10258,10190,10262,10298,10404,17007,10371,10255,10290,10312,10393,10311,10160,10270,10291,10214,10285,10231,10326,10308,10360,10217,10310,10322,10274,10326,10275,10250,10322,10228,10276,10344,10379,10231,10265,10201,10265,10267,10339,10331,10224,10229,10329,10408,10238,10269,10363,10207,10476,10230,10282,10306,10289,10231,10167,10197,10247,10237,10284,10241,10283,10324,10291,10322,10278,10328,10275,10271,10286,10342,10279,10278,10281,10246,10305,10303,10306,10258,10238,10274,10317,10488,10288,10306,10234,10325,10329,10365,10262,10344,10200,10233,10258,10315,10274,10208,10269,10231,10205,10386,10319,10317,10258,10234,10331,10269,10295,10261,10259,10273,10336,10271,10201,10252,10290,10253,10237,10241,10247,10307,10305,10260,10232,10302,10273,10269,10244,10228,10293,10271,10222,14369,10243,10290,10370,10262,10313,16880,10267,10305,10343,10303,10264,10289,10261,10319,10554,10860,10332,10346,10320,10222,10280,10308,10263,10246,10395,10310,10294,10206,10238,10266,10825,10559,10432,10247,10337,10253,10269,10336,10313,10373,10312,10291,10343,10360,10276,10268,10299,15625,10281,10298,10268,10271,10325,10332,10451,10410,10220,10273,10255,10272,10336,10271,10302,23711,10329,10254,10259,10249,10602,10517,10554,10668,15503,12160,10593,10462,14857,12110,10676,10307,10660,13842,10576,10408,10441,10501,10484,10630,10788,10555,12400,10406,10525,10208,10305,10294,10320,10337,10287,16999,10344,10338,10369,10332,10283,10317,10261,10400,10318,16635,10323,10276,10297,10291,10281,10200,10265,10296,10283,10279,10259,10343,10329,10230,10295,10241,10331,17113,10216,10302,10283,10276,10304,10311,10317,10317,10275,10305,10252,10235,10268,10238,10309,10358,10244,10346,10630,10428,10283,10325,10324,10395,10296,10269,10318,10199,10304,10179,10286,10304,10297,11257,10321,10269,10293,10228,10328,10520,10415,10283,10203,10399,10223,10279,10274,10261,10278,10323,10297,10370,10234,10293,10201,10280,10273,10268,10308,10293,10257,10435,10354,10271,10450,10267,10234,10281,10344,10242,10262,10252,10341,10259,10374,10286,10214,10329,10336,10263,16971,10267,10307,10221,16755,10241,10201,10210,10303,10256,10176,10327,10274,10270,10284,10347,10178,11984,10309,10262,10254,10219,10349,10259,10311,10389,10308,10278,10295,10306,10214,10320,10265,10251,10337,10280,10319,10290,10364,10256,10521]',
#   rawUnequalBenches: '[36901,9866,10887,10076,10193,10177,10192,10219,10428,10095,10221,10190,10444,10084,10146,10126,10175,10149,11439,10521,10098,10169,10207,10171,10232,10146,10120,16908,10015,10068,10162,10148,10199,10181,10207,10784,10210,10309,10105,10093,10080,10042,10204,10119,10120,10177,10270,10225,10151,10088,10044,10093,10536,10138,10375,10130,10056,10157,10105,10183,10118,10052,10224,10074,10301,10689,10358,9338,9397,9461,9464,9432,9424,9374,9457,9342,9392,9390,9688,9305,9333,9426,9492,9447,9425,9449,9342,9487,9480,9428,9429,9425,9460,9429,9433,9456,9324,9447,9371,9473,9440,9340,9393,9453,9445,9399,9420,9362,9428,9370,9373,9502,9329,9316,9481,9460,9410,9455,9376,9938,9721,9687,9491,9359,9849,9704,9502,9351,9400,9548,9361,9509,9458,9443,9370,9464,9350,9449,10182,9413,9355,9425,9366,9474,9386,9438,9359,9309,9530,9361,9484,9440,9495,9410,9400,9594,9416,9385,9351,9462,9400,9737,16246,9408,9488,9402,9446,9445,9357,9394,9460,9425,9341,9388,9414,9326,9374,9371,9354,9412,9431,9445,9473,9373,9428,9325,9429,9397,9364,9401,9369,9433,9424,9399,9409,9444,9387,9418,9420,9344,9374,9418,9444,9335,9373,9447,9429,9402,9407,9475,9378,9478,9444,9362,9405,9433,9343,16253,9338,9507,9456,9444,9366,9411,9432,9387,9340,9434,9424,9349,9353,9481,9389,9434,9522,9381,9517,9391,9350,9385,9425,9366,9398,9489,9472,9390,9392,9440,9358,9446,9584,9416,9445,9494,9388,9401,9391,9581,15985,9378,9411,9350,9419,9401,9402,9432,9377,9466,9402,9347,9434,9433,9336,9357,9350,9460,9410,9442,9461,9356,9382,9362,9383,9423,9409,9384,9392,9462,9361,9434,9362,9431,9372,9385,15812,9501,9374,9487,9449,9377,9431,9398,9397,9328,9371,9366,9421,9434,9458,9414,9346,9380,9363,9403,9407,9503,9424,9449,9489,9389,9352,9415,9508,9427,9429,9440,9425,9357,9542,9386,9397,9437,9554,9362,9381,9396,9416,9420,9439,9438,9420,9325,9497,9434,9481,9459,9369,9357,9373,9320,9351,9401,9489,9415,9392,9401,9422,9365,9448,9391,9398,9312,9394,9511,9347,9388,9395,9372,9447,9407,9391,9325,9390,9321,9485,9413,9322,9400,9367,9442,9327,9300,9354,9351,9365,9355,9420,9474,9456,9395,9493,9387,9381,9376,9427,9406,9381,9470,9467,9367,9364,9485,9343,9481,9388,9417,9403,15977,9316,9357,9481,9347,9410,9385,9613,9992,9332,9395,9459,9385,9931,9319,9387,9349,9487,9403,9317,9454,9564,9503,9369,9412,9398,9392,9341,9492,9424,9363,9410,9565,9396,9388,9427,9381,9399,9744,9509,9515,9375,9353,9313,9435,9410,16081,9372,9553,9328,9413,9397,9386,9465,10334,9650,9444,9471,9486,9360,9364,9450,9509,9415,9355,9344,9442,9366,9466,9408,9463,10288,9366,9429,9448,9408,9408,9359,9434,9362,9399,9388,9372,9429,9361,9345,9514,9394,9568,9417,9388,9456,9362,9457,9486,9428,9356,9378,9351,9390,9419,9381,16026,9437,15794,9441,9448,9330,9408,9393,9321,9427,9411,9373,9511,9397,9386,9414,9464,9438,9381,9389,9292,9386,9452,9361,9393,9384,9443,9337,9383,9410,9378,9443,9394,9311,9440,9429,9345,9418,9422,9370,9469,9305,9420,9474,9385,9395,9411,9480,9476,9346,9254,9388,9386,9368,9371,9477,9432,9382,9324,9379,9446,9358,9385,9391,9372,9380,9366,9391,9358,9479,9369,9994,9447,9401,9295,9375,9493,9352,9382,9403,9387,9415,9368,9366,9380,9327,9478,9694,9567,9386,9423,9385,9342,9366,9418,9428,9392,10320,9384,9441,9365,9392,9395,9443,9375,9459,9376,9526,9341,9355,9403,9426,9444,9404,9373,9423,9379,9373,9551,9415,9381,9411,9394,9446,9397,9336,9396,9345,9474,9428,9385,9487,9352,9369,9334,9384,9447,9428,9429,9359,9310,9441,9375,9377,9481,9402,9433,9420,16140,9369,9399,9356,9401,9391,9316,9436,9443,9445,9359,9453,9380,9395,9364,9385,9374,9381,9398,9447,9359,9383,9323,9416,9405,9484,9428,9401,9376,9367,9337,9451,9483,9434,9436,9363,9402,9397,9398,9432,9425,9409,9378,9397,9360,9462,9538,9317,9368,9478,9342,9346,9476,9371,9483,9324,9298,9379,9361,9313,9378,9386,9391,9342,9383,9353,9354,9388,9470,9376,9407,9473,9443,9354,9414,9391,9391,15575,9428,9350,9367,9323,9360,9421,9391,9440,9392,9355,9344,9338,9440,9421,9445,9962,9389,9524,9461,9457,9452,9349,9401,9382,9428,9460,9388,9377,9474,9419,9383,9816,9450,9417,9454,9398,9333,9419,9404,9317,9455,9384,9506,9451,9412,9411,9376,10141,9622,9373,9352,9381,9396,9418,9412,9404,9440,9407,9455,9460,9390,9398,9501,9421,9874,9402,9449,9434,9448,9505,9385,9476,9440,9362,9470,9389,9416,9362,9374,9416,9531,9395,9426,16043,9583,9581,9581,9469,10048,9805,9736,12099,9508,9843,9807,9661,12037,9700,9771,9563,9385,9711,13701,11187,10392,9901,10148,10557,9741,9416,9339,9452,9474,9474,9511,9448,9453,9369,9338,9326,9402,9496,9375,9402,9401,9392,9361,9350,9388,9354,9415,9448,9419,9431,9360,9432,9434,9488,9452,9417,9443,9427,9437,9403,9335,9402,9539,9342,9373,9364,9394,9365,9379,9520,9424,9456,9346,9459,9421,9400,9337,9352,15442,9435,9364,9400,9455,9351,9382,9471,9430,9437,9424,9482,9446,9441,9539,9394,9684,9346,9420,9515,9472,9507,9557,9387,9410,9438,9374,9345,9436,9371,9444,9368,9389,9341,9479,9389,9415,9349,9337,9376,9378,9336,9467,9442,9509,9404,9319,9479,9409,9388,9357,9474,9339,9426,9362,9365,9499,9454,9498,9422,9448,9366,9394,9373,9365,9402,9351,9322,9328,9388,9386,9449,9453,9383,9428,9377,9334,9487,9371,9408,9345,9309,9466,9414,9395,16075,9282,9526,9418,9376,9293,9456,9387,9438,9439,9353,9463,9373,9386,9508,9375,9463,9449]' }

@not-an-aardvark
Copy link
Contributor

not-an-aardvark commented Aug 22, 2016

Another possibility: the issue could be because V8 is optimizing the runBenchmark for one set of params, and ends up having to deoptimize it/do something else when the function is passed the other set of params. If this is the issue, it would be solved by having two nearly-identical runBenchmark functions, one for equal buffers and one for unequal buffers.

I made a branch to test this on my fork; could we run the CI on that (or pull it into the test branch on this PR)?

@not-an-aardvark
Copy link
Contributor

Yeah that's a bit worrying.

Since there seems to be a chance that this is a security issue rather than a test issue, should we revert the timingSafeEqual commit for now since it hasn't appeared in a release yet? I imagine it would be better to delay the feature than to risk having a release with an insecure timingSafeEqual function, if it turns out that the function actually is insecure.

@Trott
Copy link
Member Author

Trott commented Aug 22, 2016

Since there seems to be a chance that this is a security issue rather than a test issue, should we revert the timingSafeEqual commit for now since it hasn't appeared in a release yet?

That would be my inclination.

Looping in @nodejs/crypto ...

@Trott
Copy link
Member Author

Trott commented Aug 22, 2016

@jasnell

@jasnell
Copy link
Member

jasnell commented Aug 22, 2016

Ugh. Yeah, I'd agree. That's unfortunate.

jasnell added a commit to jasnell/node that referenced this pull request Aug 22, 2016
This reverts commit 0fc5e0d.

Additional testing indicates that there may still be timing issues
with this implementation. Revert in order to give more time for
testing before this goes out into a release...

Refs: nodejs#8040
Refs: nodejs#8203
@jasnell
Copy link
Member

jasnell commented Aug 22, 2016

See: #8225

@Trott
Copy link
Member Author

Trott commented Aug 22, 2016

Closing as revert is imminent.... :-/

@Trott Trott closed this Aug 22, 2016
@not-an-aardvark
Copy link
Contributor

not-an-aardvark commented Aug 23, 2016

Minor update: I tried running the test on a Ubuntu 16.04 VM (was using OSX before), but was still unable to reproduce the bug locally.

What are the specs of the server that runs the tests? I'm wondering if I would be able to simulate whatever is causing the test to fail, e.g. by restricting memory usage. (I couldn't reproduce the issue with node --max-old-space-size=10, but there might be other things to try.)

@Trott
Copy link
Member Author

Trott commented Aug 23, 2016

@not-an-aardvark Maybe this is useful? Build details for the host (ansible playbook mostly): https://github.com/nodejs/build/tree/master/setup/ubuntu16.04

@Trott
Copy link
Member Author

Trott commented Aug 23, 2016

@not-an-aardvark If I'm reading lshw output correctly, then the host where the test failed here is a 2 CPU (Intel Xeon @ 2.40 GHz) host with 2 Gb of RAM.

jasnell added a commit that referenced this pull request Aug 23, 2016
This reverts commit 0fc5e0d.

Additional testing indicates that there may still be timing issues
with this implementation. Revert in order to give more time for
testing before this goes out into a release...

Refs: #8040
Refs: #8203
PR-URL: #8225
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
@jasnell
Copy link
Member

jasnell commented Aug 23, 2016

Commit is reverted...

@not-an-aardvark
Copy link
Contributor

not-an-aardvark commented Aug 24, 2016

I managed to reproduce the issue on a Linux server that I have access to. I'm still investigating, but I think V8 is doing something very strange here.

When I run this function (almost the same function from the original PR, but with an additional commented line that contains the letter 'A' a bunch of times):

function getTValue(compareFunc) {
  // ... function body etc. etc.
  // see [here](https://github.com/nodejs/node/blob/0fc5e0dcd9c628b62fc9908daf64bc5db8d503c0/test/sequential/test-crypto-timing-safe-equal.js#L48-L84) for the full function body

  // ...

  const standardErr = combinedStd * Math.sqrt(1 / equalLen + 1 / unequalLen);

  //AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

  return (equalMean - unequalMean) / standardErr;
}

...the test fails very reliably. Note that the commented line contains the letter A exactly 100 times, and getTValue.toString().length === 1602.

If I add one more A to the end of the commented line, the test passes reliably.

This seems to be dependent on the function's string length; adding some other extraneous character (e.g. an extra semicolon) to function has the same effect as adding an extra A to the commented line.

My guess would be that V8 is performing some optimization internally depending on the string length of the getTValue function, but to be honest I have basically no clue why adding a commented line is observably affecting this benchmark.

@not-an-aardvark
Copy link
Contributor

I just saw this article about optimizations with function length. It seems like setting the --max-inlined-source-size=0 flag causes the test to pass regardless of the function's length.

@Trott
Copy link
Member Author

Trott commented Aug 24, 2016

My uneducated guess is function inlining is the culprit.

EDIT: Looks like you hit on the same conclusion as I was typing this.

@nodejs/v8

@not-an-aardvark
Copy link
Contributor

I think I understand (extremely vaguely) how inlining works, but I'm still not sure I understand why inlining would cause the two benchmarks in this function to take different amounts of time.

In particular, we should figure out which (if any) of these two statements is accurate:

  1. The inconsistency in the test was caused by function inlining, but crypto.timingSafeEqual is working correctly. It's safe to re-add the feature after fixing the test.
  2. The inconsistency in the test indicated that crypto.timingSafeEqual is sometimes timing-unsafe due to function inlining. We need to fix this to ensure that crypto.timingSafeEqual always works correctly.

@Trott
Copy link
Member Author

Trott commented Aug 24, 2016

@not-an-aardvark Maybe @indutny or @bnoordhuis might have insight (as they're the only people who are on both @nodejs/crypto and @nodejs/v8).

@bnoordhuis
Copy link
Member

The inconsistency in the test was caused by function inlining

That sounds plausible. V8 uses JS function length (including comments) as a cheap signal for its inlining heuristics.

You should be able to confirm that with --trace_opt --trace_deopt --trace_turbo. Drop --trace_deopt if the output gets too noisy.

@not-an-aardvark
Copy link
Contributor

I can confirm that the getTValue function (and a few others) get inlined, but why would this cause the benchmarks to takes different amounts of time? Since both benchmarks run in getTValue and use the same helper functions, it seems like they would be affected equally by the inlining.

@bnoordhuis
Copy link
Member

Inlining is done by the tier 2 compiler (the optimizing compiler), the tier 1 compiler (the baseline compiler) doesn't inline.

The job of the tier 1 compiler is to generate code that collects data for the tier 2 compiler to do its job. The tier 2 compiler doesn't start until the hot part of the program has been executing for a while.

From an outside perspective it makes total run-time somewhat non-deterministic because you can't really know when and where the optimizing compiler is going to kick in.

@not-an-aardvark
Copy link
Contributor

That makes sense, but I'm confused about why we would get benchmarks like the ones that appeared in #8203 (comment). The benchmarks are gathered in alternating order (run one benchmark with equal buffers, then one benchmark with unequal buffers, etc.), so even if inlining happens at some non-deterministic point as the program runs, I don't understand why the values in rawEqualBenches would be uniformly larger than those in rawUnequalBenches.

@not-an-aardvark
Copy link
Contributor

not-an-aardvark commented Aug 25, 2016

For what it's worth, here's the full output of ./node --allow-natives-syntax --trace-inlining test/sequential/test-crypto-timing-safe-equal.js:

Inlining api function 0x37023d5c9861 <JS Function timingSafeEqual (SharedFunctionInfo 0x37023d5c97b9)>
Inlining api function 0x37023d5c9861 <JS Function timingSafeEqual (SharedFunctionInfo 0x37023d5c97b9)>
Inlined strictEqual called from runBenchmark.
Inlining api function 0x1c44efadb0f1 <JS Function hrtime (SharedFunctionInfo 0x1c44efadb049)>
Inlined assertSize called from Buffer.alloc.
Did not inline createUnsafeBuffer called from Buffer.alloc (target not inlineable).
Did not inline fill called from Buffer.alloc (target text too big).
Inlined Buffer.alloc called from getTValue.
Inlined assertSize called from Buffer.alloc.
Did not inline createUnsafeBuffer called from Buffer.alloc (target not inlineable).
Did not inline fill called from Buffer.alloc (target text too big).
Inlined Buffer.alloc called from getTValue.
Inlined assertSize called from Buffer.alloc.
Did not inline createUnsafeBuffer called from Buffer.alloc (target not inlineable).
Did not inline fill called from Buffer.alloc (target text too big).
Inlined Buffer.alloc called from getTValue.
Inlined assertSize called from Buffer.alloc.
Did not inline createUnsafeBuffer called from Buffer.alloc (target not inlineable).
Did not inline fill called from Buffer.alloc (target text too big).
Inlined Buffer.alloc called from getTValue.
Did not inline Array called from getTValue (Constant length outside of valid inlining range.).
Did not inline Array called from getTValue (Constant length outside of valid inlining range.).
Inlining api function 0x37023d5c9861 <JS Function timingSafeEqual (SharedFunctionInfo 0x37023d5c97b9)>
Inlined strictEqual called from runBenchmark.
Inlined runBenchmark called from getTValue.
Inlining api function 0x37023d5c9861 <JS Function timingSafeEqual (SharedFunctionInfo 0x37023d5c97b9)>
Inlined strictEqual called from runBenchmark.
Inlined runBenchmark called from getTValue.
Inlining builtin 0x1c44efaca099 <JS Function sqrt (SharedFunctionInfo 0x1c44efa88b81)>
Inlining builtin 0x1c44efaca1b9 <JS Function pow (SharedFunctionInfo 0x1c44efa5b381)>
Inlining api function 0x3e694f5fc789 <JS Function compare (SharedFunctionInfo 0x3e694f5fc6e1)>
Inlined equals called from unsafeCompare.
Inlined strictEqual called from runBenchmark.
Inlining api function 0x3e694f5fc789 <JS Function compare (SharedFunctionInfo 0x3e694f5fc6e1)>

For some reason, the test seems to pass as a result of including the --trace-inlining flag. (In fact, it even seems to pass when I add an unrelated flag such as --expose-gc, although it fails consistenly when the flag is gone.)

@not-an-aardvark
Copy link
Contributor

Looking at the build history on #8040, it seems like the linux builds only started to fail after %OptimizeFunctionOnNextCall was added. Before that, only various tests on arm were failing.

When we end up getting a consistent build that works on all platforms, we should try to make sure we have a good understanding why these builds failed. I don't know very much about the V8 internals, but it's still unclear to me whether the issue on linux was a test flake or a real security issue that is now masked.

@not-an-aardvark
Copy link
Contributor

Created a new PR at #8304.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
crypto Issues and PRs related to the crypto subsystem. test Issues and PRs related to the tests. wip Issues and PRs that are still a work in progress.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants