Decide whether wasm binsize benchmark should include data #4067
Comments
Note: the regressions are in
The first step is to bisect. Once the bisect is finished, we should add this back to the triage list.
Command line: Some commits failed to build. Some commits around 2023-02 failed with:
Other commits from 2023-04 to 2023-08, starting with ca28250 and fixed in c8c9126, failed with:
Bisection results from the commits that built out of the box in reverse chronological order:
(long period of silence due to bad lockfile)
So the regression landed during the summer, sometime between 2023-04 and 2023-08.
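For illustration, here is a minimal sketch of how a bisection like the one above can narrow the range even when some commits fail to build. It is not the script used in this thread: the function name `bisect_with_skips`, the `measure` callback, and the size threshold are all hypothetical. A commit that fails to build is represented as `None`, and the search probes the nearest buildable commit instead, similar in spirit to `git bisect skip`.

```rust
/// Narrows a regression range over commits indexed oldest (0) to newest.
///
/// `good` is the index of a commit known to be at or below `threshold` bytes,
/// `bad` the index of a commit known to be above it. `measure(i)` returns
/// `Some(size_in_bytes)` if commit `i` builds, or `None` if the build fails.
///
/// Returns the tightest `(last_good, first_bad)` pair that could be confirmed.
/// If the two indices are not adjacent, every commit between them failed to build.
///
/// Note: this sketch may call `measure` more than once for the same commit;
/// a real script would probably cache the results.
pub fn bisect_with_skips<F>(
    mut good: usize,
    mut bad: usize,
    threshold: u64,
    mut measure: F,
) -> (usize, usize)
where
    F: FnMut(usize) -> Option<u64>,
{
    assert!(good < bad, "the good commit must be older than the bad commit");
    while bad - good > 1 {
        let mid = good + (bad - good) / 2;
        // Try the commits strictly between `good` and `bad`, nearest to the midpoint first.
        let mut candidates: Vec<usize> = (good + 1..bad).collect();
        candidates.sort_by_key(|&i| i.abs_diff(mid));
        let probe = candidates
            .into_iter()
            .find_map(|i| measure(i).map(|size| (i, size)));
        match probe {
            Some((i, size)) if size > threshold => bad = i,
            Some((i, _)) => good = i,
            // Nothing in between builds, so the range cannot be narrowed further.
            None => break,
        }
    }
    (good, bad)
}
```

When the returned indices are not adjacent, the commits in between have to be fixed up by hand before the culprit can be pinned down.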
I will try using a proxy metric, the linux example bin size with release-opt-size:
(In this period, there is an error while building the examples. I do not know how that could have landed; perhaps it is a feature resolution issue.)
I did some manual work to get the intermediate broken commits to build. Doing so, I found that for
In other words, the examples now use compiled data, whereas before they used buffer provider data from a buffer that was not included as part of the measurement. Mystery solved.

Now, the follow-up question is, do we want to continue measuring the wasm binsize with data included, or do we want to measure it without data? If we measure it with data, we're measuring something that "actually works", whereas if we measure it without data, we can detect regressions in code size at a much more granular level.
Ah, of course. If we call these code size benchmarks, we should use empty data. We could have a data size benchmark v2 that is the difference between all data and no data.
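As a rough sketch of that split, the two metrics could be derived from a pair of builds of the same benchmark binary. The `BinSizes` type and its field names are hypothetical and not part of the existing benchmark tooling. The "without data" build is assumed to contain as little data as the build allows.

```rust
/// Sizes, in bytes, of two builds of the same benchmark binary.
/// (Hypothetical type for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BinSizes {
    /// Binary built with all compiled data included.
    pub with_data: u64,
    /// Binary built with empty (or as little as possible) data.
    pub without_data: u64,
}

impl BinSizes {
    /// Code size metric: the size of the build that carries no data.
    pub fn code_size(&self) -> u64 {
        self.without_data
    }

    /// Data size metric: the difference between the two builds.
    /// Saturates at zero instead of underflowing if the inputs are swapped.
    pub fn data_size(&self) -> u64 {
        self.with_data.saturating_sub(self.without_data)
    }
}
```

Tracking both numbers separately would make a code size regression visible even when the data size stays flat, and vice versa.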
Actually, for singleton keys there's no such thing as "empty data", and removing all data will also remove fallback logic (and data). I think including just
Without the region-select option it's tricky to include exactly one data struct.
@sffc to investigate and implement fix if necessary.
The binsize benchmark shows a regression in the datetime code size between these two dates. It was offline during that time, so it's not entirely clear what the culprit commits were. I imagine that a lot of this is due to calendrical calculations, but we should know for sure so that we can better understand and plan out mitigations.
First step would be to bisect the time between these commits (cfe1f71 and ddb1a7b) and find the commits that most contributed to the spike.
Also, these tests use TypedZonedDateTimeFormatter::<Gregorian>, so in theory they shouldn't be impacted by the calendrical calculation code.
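To make "the commits that most contributed to the spike" concrete, one option is to rank the size change between each pair of consecutively measured commits. The sketch below assumes a list of `(commit label, size in bytes)` samples ordered oldest to newest. The function name `rank_size_increases` is hypothetical and the labels are placeholders, not real measurements.

```rust
/// Given `(commit_label, size_in_bytes)` samples ordered oldest to newest,
/// returns `(from, to, delta)` for every consecutive pair, sorted so that the
/// largest size increase comes first. Negative deltas are size reductions.
pub fn rank_size_increases<'a>(samples: &[(&'a str, u64)]) -> Vec<(&'a str, &'a str, i64)> {
    let mut deltas: Vec<_> = samples
        .windows(2)
        .map(|pair| {
            let (from, before) = pair[0];
            let (to, after) = pair[1];
            (from, to, after as i64 - before as i64)
        })
        .collect();
    // Sort descending by delta so the biggest contributors are listed first.
    deltas.sort_by(|a, b| b.2.cmp(&a.2));
    deltas
}
```

If some commits could not be measured, the `from` and `to` labels of a large delta bracket a range rather than a single commit, and that range would need further bisection.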