Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sort duplicate modules by duplicate size #3929

Closed
steverep opened this issue Oct 17, 2023 · 19 comments
Closed

Sort duplicate modules by duplicate size #3929

steverep opened this issue Oct 17, 2023 · 19 comments
Labels
feature New feature or request

Comments

@steverep
Copy link

It would be nice to be able to sort the list of duplicate modules by:

  1. Number of duplicates (i.e. number of chunks that contain it)
  2. Duplicates size (duplicateSize = moduleSize*(nChunks - 1))

The second is a way to see the modules doing the most damage to bundle size via duplicates.

@vio
Copy link
Member

vio commented Oct 18, 2023

@steverep Thank you for the feedback! We were considering showing the number of chunks, the data is already there. Having the duplicate size is even better, wondering how it is going to look with the module paths being in general long.

@vio vio added the feature New feature or request label Oct 18, 2023
@steverep
Copy link
Author

wondering how it is going to look with the module paths being in general long.

You mean the screen real estate is an issue? For the tables comparing two jobs, I think just having a picker to choose the metric to compare would suffice. Right now it defaults to the individual module size, but let me choose to switch to number of containing chunks or duplicated size (total bundle size contribution).

BTW, it looks like the size reported for modules does not reflect tree shaking. Is that correct?

@vio
Copy link
Member

vio commented Oct 25, 2023

You mean the screen real estate is an issue?

Yes, the module paths can be pretty long.

For the tables comparing two jobs, I think just having a picker to choose the metric to compare would suffice. Right now it defaults to the individual module size, but let me choose to switch to number of containing chunks or duplicated size (total bundle size contribution).

👍

BTW, it looks like the size reported for modules does not reflect tree shaking. Is that correct?

Yes, webpack reports the size of the module before any production optimization (minification, tree shaking).

@steverep
Copy link
Author

Yes, webpack reports the size of the module before any production optimization (minification, tree shaking).

That's unfortunate. 🤷🏻‍♂️

@vio
Copy link
Member

vio commented Oct 30, 2023

for reference, we are considering to pick up the production-correlated size using the results from webpack-bundle-analyzer or by using another method ;)

@danielbeardsley
Copy link
Contributor

When it comes to duplicates bundle-stats isn't helpful because it doesn't seem to report much for changes in duplication.

Here's an example report where there were significant changes in duplication (apparently causing an additional 40KB of js), but we don't get to see which modules are now duplicated that weren't before:

image

@danielbeardsley
Copy link
Contributor

I've opened a pull that helps with this: #4069

@vio
Copy link
Member

vio commented Dec 4, 2023

@danielbeardsley thank you for the feedback!

I spent some time last week trying to show more data on the same view. Started with the chunk count since that seemed to be easier, the result did not look good, but I ended up with a couple of conclusions:

  • showing one more column makes the view too complex, no point in trying to cram in the total size (value, absolute delta, relative delta)
  • @steverep recommendation of a selector for the metric is probably the most convenient UX
  • the total size (module instances x module size) is more useful than the module size, we should consider making it the default in the next major version

I've opened a pull that helps with this: #4069

Thank you, I really appreciate it! Sorry for the slow response, I've been offline the last few days.

I think introducing a metric selector will allow us to structure the view in logical groups, reuse the filters and table, and show other metrics like chunk count.

shapes at 23-12-04 23 56 50

This method is slightly different than the one you implemented in the PR, but since it involves manipulating the same data and this is something you already thought about it, are you interested in changing your PR to:

  1. add the metrics selector
  2. update module rows based on the selector state
  3. update table columns based on the selector state
  4. add the new selector to the URL
  5. add the total module size to the side panel

Happy to help if you have any questions or need assistance!

@danielbeardsley
Copy link
Contributor

are you interested in changing your PR to: add the metrics selector

We can, but I feel like the table is pretty centered around bytes and byte totals, so it may be somewhat odd to represent a non-byte quantity. I may try it.

But I agree that "the total size (module instances x module size) is more useful than the module size" because I think that's the value that ends up being totalled for the overall Bundle Size stat.

Also, I don't see much use in comparing quantity of duplicates in the table because we already have a way to filter to just those that are duplicated and with this feature we can now see the resultant total size.

@steverep
Copy link
Author

Also, I don't see much use in comparing quantity of duplicates in the table because we already have a way to filter to just those that are duplicated and with this feature we can now see the resultant total size.

I certainly agree that byte size is the more useful measure, but I disagree that looking at quantity of duplicates has no use. In our bundle I've found modules repeated 30-50 times. Sometimes they are very small, but that still clearly points to optimizations that can be made. Sorting by number of duplicates should also be an option.

@vio
Copy link
Member

vio commented Dec 27, 2023

We can, but I feel like the table is pretty centered around bytes and byte totals, so it may be somewhat odd to represent a non-byte quantity. I may try it.

@danielbeardsley yes, the MetricsTable relies on the value field atm, but i think we can extend it. In the last week, I refactored the BundleModules component to make it easier to extend and I'm investigating possible ways to compare and show other data. I will post the findings here ;)

@steverep
Copy link
Author

Here's a PR result that I think is a good large case study for this:

https://github.com/home-assistant/frontend/pull/16506/checks?check_run_id=19990121982

Basically, all this PR does is adjust Babel settings for reduced transpilation and polyfills, so the vast majority of modules see a decrease in size. However, the bundle size went up by 2.6%, seemingly because of increased duplication. RelativeCI is not much help in narrowing down there the increases are coming from. The goal should be to make that clear with little effort.

@danielbeardsley
Copy link
Contributor

I rebased the changes on master (which had converted a few files to typescript).

I tried going in the direction of a "Metric Selector" and opted not to at this point; mainly because creating a Dropdown that was single-select (instead of multi-select like the filters) was looking like a lot of work. However, I did change the implementation to do the mutation in a selector; it's not a perfect fit, but it felt better than doing it ad-hoc. Adding other selector / mutation options (select duplicate count) is pretty easy once there's a UI to represent the choice.

I hope we can merge and then iterate.

@vio
Copy link
Member

vio commented Jan 11, 2024

@danielbeardsley thank you, good idea to use the selector!

I took some time to extract the button component, and I prepared a demo of a possible selector for module metric type:
https://github.com/relative-ci/bundle-stats/pull/4154/files#diff-9a8f41b56956d654ff83cdcb2ef7f5c134eda18995e8a1075f73558911b8c72bR183


sorry for having to do a rebase, but wanted to add TS to make it easier to update

@danielbeardsley
Copy link
Contributor

I took some time to extract the button component

I've incorporated your change into pull #4069

@vio
Copy link
Member

vio commented Feb 20, 2024

@danielbeardsley changes were merged and released as 4.12.0-beta.0.

@vio
Copy link
Member

vio commented Feb 29, 2024

just released 4.12.0 and includes @danielbeardsley changes plus an update to save the selected value into the URL like the other filters.

thank you for the PR @danielbeardsley!

Will close this issue as resolved, but feel free to open another one if you have any feedback. For example, i think it will be useful to explicitly show both values in the side panel

@vio vio closed this as completed Feb 29, 2024
@steverep
Copy link
Author

steverep commented Mar 2, 2024

@vio I just had a chance to play with the changes. Yes, I think it's almost there but a few necessary tweaks stand out for me:

  • Per the OP, the most useful way to sort results would be by duplicated size, i.e. moduleSize*(nChunks - 1), as opposed to total module size, i.e. moduleSize*nChunks. In other words, show the potential decrease. A 20 KiB module appearing 20 times has a potential savings of 380 KiB, whereas a 200 KiB module with 2 copies can only save 200 KiB. It's a small difference in most cases but matters for larger modules.
  • Also per the OP, still need to see number of duplicates somewhere. The only way to do this currently is to click on the module, and then manually count the chunks, which is obviously tedious. As you pointed out, that details view is a perfect place to just show everything: individual size, duplicated size, total size, and number of copies.
  • Clicking the links for duplicated code or modules should have the duplicated size selected by default. It's confusing otherwise.

I opened a new issue for the 3rd bullet, but the other two would just be a repeat of the OP here so I'd suggest reopening instead.

@vio
Copy link
Member

vio commented Mar 4, 2024

@steverep thank you for the feedback, i think they are all very useful additions. I created new tickets for 1-2 and added them along with #4296 to the todo list ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants