Hey,

I ran into a couple of issues with merging spec durations:

- The durations from each machine are written out together with all the other spec durations, including those that haven't changed in this run, even though the plugin only prints the durations for the current run to the console.
- The "average" strategy is not always optimal:
  - If you merge 3 files where only the middle one has a change in a certain spec's duration, the result won't be the average of the "two" distinct durations; it's effectively a division by 4, not by 2. This follows from issue no. 1: because every machine writes out durations for all specs, even specs it didn't run, the merge utility sees the same stale duration for a given spec on every machine that didn't run it.
  - Sometimes (I'd say most times) you don't want an average at all, but simply the most recent duration. With averaging, the stored duration approaches the real runtime of the spec, but that can take a while; taking the most recent one (with a threshold defined) lands roughly at the same value in one single "jump". That lack of "memory" can be a good thing: if, for instance, you added or removed a bunch of tests in a spec file, the runtime changes significantly, and with averaging it would take a bunch of updates until the stored duration gets reasonably close again.
I'll elaborate on each:
1 - All spec durations are written out
The plugin prints to the console only the durations for the specs that ran on the current machine, while the timings file it writes afterwards contains durations for all specs, even specs E-G that haven't run on machine no. 2. This isn't a problem by itself, but it causes an issue when merging the timings from different machines using the average strategy.
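As an illustration (all durations here are invented, and I'm assuming the per-machine files use the same `{ "durations": [{ "spec", "duration" }] }` shape as `timings.json`), the file written by machine no. 2 would still contain entries for specs E-G, carrying their previously known durations:

```json
{
  "durations": [
    { "spec": "A.spec.ts", "duration": 21042 },
    { "spec": "B.spec.ts", "duration": 33589 },
    { "spec": "C.spec.ts", "duration": 40123 },
    { "spec": "D.spec.ts", "duration": 15877 },
    { "spec": "E.spec.ts", "duration": 18250 },
    { "spec": "F.spec.ts", "duration": 27391 },
    { "spec": "G.spec.ts", "duration": 12703 }
  ]
}
```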
2 - Average strategy is not always optimal
2.1 - "Average" is not exactly the average
Simplifying the above example, let's assume we're running our tests on 3 machines, so we end up with 3 written cypress-timings-machine-*.json files.
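Here's a reconstruction of their contents that is consistent with the calculations below (the A.spec.ts values are made up; B.spec.ts changed only on the second machine and C.spec.ts only on the third):

cypress-timings-machine-0.json:

```json
{
  "durations": [
    { "spec": "A.spec.ts", "duration": 21042 },
    { "spec": "B.spec.ts", "duration": 33589 },
    { "spec": "C.spec.ts", "duration": 40123 }
  ]
}
```

cypress-timings-machine-1.json:

```json
{
  "durations": [
    { "spec": "A.spec.ts", "duration": 22376 },
    { "spec": "B.spec.ts", "duration": 35123 },
    { "spec": "C.spec.ts", "duration": 40123 }
  ]
}
```

cypress-timings-machine-2.json:

```json
{
  "durations": [
    { "spec": "A.spec.ts", "duration": 21042 },
    { "spec": "B.spec.ts", "duration": 33589 },
    { "spec": "C.spec.ts", "duration": 38835 }
  ]
}
```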
If we look at B.spec.ts and merge its durations using the current average code (cypress-split/src/timings.js, lines 102 to 121 in 7a2cf41), we first do (33589 + 35123) / 2, which gives 34356, and then merge in the last file: (34356 + 33589) / 2 = 33972.5, while the real average of the two distinct durations is 34356. The stale 33589 ends up with a weight of 3/4 and the new 35123 with only 1/4, which is the "division by 4" mentioned above. It may not look like much, but for specs that take 2+ minutes this can mean a few seconds of difference, and that can tip the scale on how the tests are distributed across the machines. Not by a lot, but still.
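This isn't the actual timings.js code, just a minimal sketch of that sequential pairwise merge, enough to reproduce the numbers:

```js
// Sequential pairwise averaging: each file's duration is folded into
// the running result, halving the weight of everything seen so far.
function mergeAverage(durationsMs) {
  return durationsMs.reduce((merged, next) => (merged + next) / 2)
}

// B.spec.ts: the changed duration (35123) sits in the middle file,
// so the stale 33589 is counted twice and gets a 3/4 weight.
console.log(mergeAverage([33589, 35123, 33589])) // 33972.5, not 34356

// C.spec.ts: the change sits in the last file (per the reconstruction above),
// so the result happens to equal the true average of the two distinct values.
console.log(mergeAverage([40123, 40123, 38835])) // 39479 === (40123 + 38835) / 2
```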
If we calculate the same for C.spec.ts, we'll see that it actually is the average of 40123 and 38835, because there the changed duration sits in the last file, so the first merge averages two identical values. A.spec.ts, however, would behave like B rather than C. It's somewhat inconsistent: the result depends on which machine's file happens to carry the change.
2.2 - Taking the most recent duration of a spec
Again, with the above example: if we instead just take the duration of the most recent run, we avoid the averaging issue altogether. One could also argue that the most recent duration is closer to the "real" runtime of the spec than the average over all runs.
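A sketch of what this could look like (hypothetical, not an existing cypress-split option; the threshold keeps tiny run-to-run fluctuations from rewriting the timings file):

```js
// "Most recent" merge with a threshold: keep the stored duration unless
// the newest run differs from it by more than the threshold.
function mergeMostRecent(storedMs, newestMs, thresholdMs = 1000) {
  return Math.abs(newestMs - storedMs) > thresholdMs ? newestMs : storedMs
}

// B.spec.ts jumps to the new runtime in a single update instead of converging slowly:
console.log(mergeMostRecent(33589, 35123)) // 35123
// A small fluctuation is ignored:
console.log(mergeMostRecent(33589, 33750)) // 33589
```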