Decide whether wasm binsize benchmark should include data #4067

sffc · 2023-09-21T21:16:39Z

The binsize benchmark shows a regression in the datetime code size between these two dates. It was offline during that time, so it's not entirely clear what the culprit commits were. I imagine that a lot of this is due to calendrical calculations, but we should know for sure so that we can better understand and plan out mitigations.

First step would be to bisect the time between these commits (cfe1f71 and ddb1a7b) and find the commits that most contributed to the spike.

Also, these tests use TypedZonedDateTimeFormatter::<Gregorian>, so in theory they shouldn't be impacted by the calendarical calculation code.

The text was updated successfully, but these errors were encountered:

sffc · 2023-10-05T17:35:50Z

Note: the regressions are in work_log+opt.wasm and tui+opt.wasm

sffc · 2023-10-05T17:37:50Z

First step is to bisect.

Once bisect is finished, we should add this back to the triage list.

sffc · 2023-10-10T01:05:19Z

Command line: rm -r wasmpkg; cargo make wasm-opt; ll wasmpkg/wasm-opt/work_log+opt.wasm; ll wasmpkg/wasm-opt/tui+opt.wasm

Some commits failed to build. Some commits around 2023-02 failed with:

error: failed to run custom build command for `sys-info v0.9.1`

Caused by:
  process didn't exit successfully: `/.../icu4x/target/release-opt-size/build/sys-info-62b5cf4053fe346f/build-script-build` (exit status: 101)
  --- stderr
  thread 'main' panicked at 'unsupported system: unknown', /.../.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/sys-info-0.9.1/build.rs:38:14

Other commits from 2023-04 to 2023-08, starting with ca28250 and fixed in c8c9126, failed with:

   Compiling criterion v0.4.0
error: Rayon cannot be used when targeting wasi32. Try disabling default features.
  --> /.../.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/criterion-0.4.0/src/lib.rs:31:1
   |
31 | compile_error!("Rayon cannot be used when targeting wasi32. Try disabling default features.");
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Bisection results from the commits that built out of the box in reverse chronological order:

3810109

-rw-r--r-- 1 sffc primarygroup 378730 Oct  9 17:24 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 3183827 Oct  9 17:24 wasmpkg/wasm-opt/tui+opt.wasm

13b6b4e

-rw-r--r-- 1 sffc primarygroup 358905 Oct  9 17:40 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 3151534 Oct  9 17:40 wasmpkg/wasm-opt/tui+opt.wasm

ea1ba9f

-rw-r--r-- 1 sffc primarygroup 356957 Oct  9 17:57 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 3154623 Oct  9 17:57 wasmpkg/wasm-opt/tui+opt.wasm

c8c9126

-rw-r--r-- 1 sffc primarygroup 366215 Oct  9 18:03 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 3165598 Oct  9 18:03 wasmpkg/wasm-opt/tui+opt.wasm

(long period of silence due to bad lockfile)

eb79e5f

-rw-r--r-- 1 sffc primarygroup 119364 Oct  9 17:53 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 513517 Oct  9 17:53 wasmpkg/wasm-opt/tui+opt.wasm

f6bfe8b

-rw-r--r-- 1 sffc primarygroup 119418 Oct  9 17:47 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 520134 Oct  9 17:47 wasmpkg/wasm-opt/tui+opt.wasm

cfe1f71

-rw-r--r-- 1 sffc primarygroup 118988 Oct  9 17:28 wasmpkg/wasm-opt/work_log+opt.wasm
-rw-r--r-- 1 sffc primarygroup 519593 Oct  9 17:28 wasmpkg/wasm-opt/tui+opt.wasm

So the regression was during the summer, sometime between 2023-04 and 2023-08.

sffc · 2023-10-10T01:23:00Z

I will try using a proxy metric, the linux example bin size with release-opt-size:

cargo +nightly-2022-12-26 build --profile release-opt-size --examples --workspace --features serde --exclude icu_datagen --exclude icu_capi; ll target/release-opt-size/examples/work_log; ll target/release-opt-size/examples/tui

c8c9126

-rwxr-xr-x 2 sffc primarygroup 3521488 Oct  9 18:07 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 6749624 Oct  9 18:07 target/release-opt-size/examples/tui*

4ed5dbd

-rwxr-xr-x 2 sffc primarygroup 3529376 Oct  9 18:13 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 6754240 Oct  9 18:13 target/release-opt-size/examples/tui*

(in this period, there is an error while building examples; do not know how that could have landed, perhaps a feature resolution issue)

7f10b15

-rwxr-xr-x 2 sffc primarygroup 1960928 Oct  9 18:20 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2409400 Oct  9 18:20 target/release-opt-size/examples/tui*

004581f

-rwxr-xr-x 2 sffc primarygroup 1949864 Oct  9 18:20 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2396512 Oct  9 18:20 target/release-opt-size/examples/tui*

09ad8b7

-rwxr-xr-x 2 sffc primarygroup 1945776 Oct  9 18:18 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2388376 Oct  9 18:18 target/release-opt-size/examples/tui*

244a499

-rwxr-xr-x 2 sffc primarygroup 1945776 Oct  9 18:14 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2388376 Oct  9 18:14 target/release-opt-size/examples/tui*

d8cc2f3

-rwxr-xr-x 2 sffc primarygroup 1957376 Oct  9 18:12 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2387232 Oct  9 18:12 target/release-opt-size/examples/tui*

eb79e5f

-rwxr-xr-x 2 sffc primarygroup 1963344 Oct  9 18:10 target/release-opt-size/examples/work_log*
-rwxr-xr-x 2 sffc primarygroup 2398008 Oct  9 18:10 target/release-opt-size/examples/tui*

sffc · 2023-10-10T01:31:59Z

I did some manual work to get the intermediate broken commits to build. Doing so, I found that for work_log, there are two commits that together account for almost the entire regression:

7e2733d Baked data for icu_datetime (Baked data for icu_datetime #3591)
7b8ad79 Add recommended locale set and expand regions in datagen (Add recommended locale set options and expand regions in datagen #3586)

In other words, the examples now use compiled data, whereas before they used buffer provider data from a buffer that was not included as part of the measurement. Mystery solved.

Now, the follow-up question is, do we want to continue measuring the wasm binsize with data included, or do we want to measure it without data? If we measure it with data, we're measuring something that "actually works", whereas if we measure it without data, we can detect regressions in code size at a much more granular level.

robertbastian · 2023-10-10T13:17:12Z

Ah of course.

If we call these code size benchmarks we should use empty data. We could have a data size benchmark v2 that is the difference between all data and no data.

robertbastian · 2023-10-10T13:40:01Z

Actually for singleton keys there's no such thing as "empty data", and removing all data will also remove fallback logic (and data). I think including just und could be a good middle ground.

robertbastian · 2023-10-10T14:14:40Z

Without the region-select option it's tricky to include exactly one data struct.

sffc · 2023-10-19T17:47:46Z

@sffc to investigate and implement fix if necessary.

sffc added good first issue Good for newcomers C-test-infra Component: Integration test infrastructure S-small Size: One afternoon (small bug fix or enhancement) labels Sep 21, 2023

sffc mentioned this issue Sep 21, 2023

Release ICU4X 1.3 #4065

Open

18 tasks

sffc assigned sffc, Manishearth and robertbastian Oct 5, 2023

sffc modified the milestones: 1.x Priority ⟨P2⟩, 1.4 Blocking ⟨P1⟩ Oct 5, 2023

sffc unassigned Manishearth and robertbastian Oct 10, 2023

sffc removed their assignment Oct 10, 2023

sffc added discuss Discuss at a future ICU4X-SC meeting and removed good first issue Good for newcomers labels Oct 10, 2023

sffc changed the title ~~Investigate spike in datetime binsize between 2022-12-29 and 2023-08-21~~ Decide whether wasm binsize should include data Oct 10, 2023

sffc changed the title ~~Decide whether wasm binsize should include data~~ Decide whether wasm binsize benchmark should include data Oct 10, 2023

sffc removed this from the 1.4 Blocking ⟨P1⟩ milestone Oct 10, 2023

sffc self-assigned this Oct 19, 2023

sffc added this to the 1.4 Blocking ⟨P1⟩ milestone Oct 19, 2023

sffc removed the discuss Discuss at a future ICU4X-SC meeting label Oct 19, 2023

sffc modified the milestones: 1.4 Blocking ⟨P1⟩, 1.5 Blocking ⟨P1⟩ Nov 14, 2023

sffc modified the milestones: 1.5 Blocking ⟨P1⟩, Backlog ⟨P4⟩ Feb 29, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Decide whether wasm binsize benchmark should include data #4067

Decide whether wasm binsize benchmark should include data #4067

sffc commented Sep 21, 2023

sffc commented Oct 5, 2023

sffc commented Oct 5, 2023

sffc commented Oct 10, 2023

sffc commented Oct 10, 2023

sffc commented Oct 10, 2023

robertbastian commented Oct 10, 2023

robertbastian commented Oct 10, 2023 •

edited

Loading

robertbastian commented Oct 10, 2023

sffc commented Oct 19, 2023

Decide whether wasm binsize benchmark should include data #4067

Decide whether wasm binsize benchmark should include data #4067

Comments

sffc commented Sep 21, 2023

sffc commented Oct 5, 2023

sffc commented Oct 5, 2023

sffc commented Oct 10, 2023

sffc commented Oct 10, 2023

sffc commented Oct 10, 2023

robertbastian commented Oct 10, 2023

robertbastian commented Oct 10, 2023 • edited Loading

robertbastian commented Oct 10, 2023

sffc commented Oct 19, 2023

robertbastian commented Oct 10, 2023 •

edited

Loading