Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[$500] ExpensiMark: large bundle size #31183

Closed
tomekzaw opened this issue Nov 10, 2023 · 46 comments
Closed

[$500] ExpensiMark: large bundle size #31183

tomekzaw opened this issue Nov 10, 2023 · 46 comments
Assignees
Labels
Engineering External Added to denote the issue can be worked on by a contributor NewFeature Something to build that is a new item. Reviewing Has a PR in review Weekly KSv2

Comments

@tomekzaw
Copy link
Contributor

tomekzaw commented Nov 10, 2023

Problem
A simple script that uses ExpensiMark library to parse Markdown into HTML has a bundle size of 336 kB (or 184 kB minified) which is quite big.

Details
The file is large because it contains mappings of all HTML entities (from _.escape) and countries, states, currencies etc. from CONST.jsx.

Solution
Investigate if it's possible to minimize the bundle size.

Steps to reproduce

index.js

import ExpensiMark from "expensify-common/lib/ExpensiMark";

const input = "Hello, *world*!";
const parser = new ExpensiMark();
const output = parser.replace(input);

console.log(JSON.stringify(input));
console.log(JSON.stringify(output));
esbuild index.js --bundle --minify --outfile=out.js

out.js: https://pastebin.swmansion.com/?9798f33833cc33b6#DRgiwxs8ujteamHaGDioSzZVZ83nuWUim3QabJ5PUYed

Issue OwnerCurrent Issue Owner: @Skalakid
Upwork Automation - Do Not Edit
  • Upwork Job URL: https://www.upwork.com/jobs/~0155a63860a646579c
  • Upwork Job ID: 1740457868647858176
  • Last Price Increase: 2024-01-04
  • Automatic offers:
    • fedirjh | Reviewer | 28090032
    • aswin-s | Contributor | 28099676
@roryabraham roryabraham self-assigned this Nov 10, 2023
@roryabraham roryabraham added Monthly KSv2 Engineering NewFeature Something to build that is a new item. labels Nov 10, 2023
Copy link

melvin-bot bot commented Nov 10, 2023

@melvin-bot melvin-bot bot added Weekly KSv2 and removed Monthly KSv2 labels Nov 10, 2023
@roryabraham
Copy link
Contributor

Co-assigning myself to supervise, but this will be worked on by Michał Skałka from SWM, who will be invoicing us separately

@roryabraham
Copy link
Contributor

Maybe the first and easiest steps here are to:

  1. Port Expensimark to typescript and stop using underscore – that would likely help reduce the file size
  2. Split up the big CONST export and separately export/import only the pieces that we need in Expensimark

@Skalakid
Copy link
Contributor

Hi I’m Michał from Software Mansion, an expert agency, and I’d like to work on this issue

@roryabraham
Copy link
Contributor

No update from me, but @Skalakid keep me posted if you're starting to work on this

@melvin-bot melvin-bot bot removed the Overdue label Nov 21, 2023
@melvin-bot melvin-bot bot added the Overdue label Nov 29, 2023
@roryabraham
Copy link
Contributor

not urgent, going to make @Skalakid the issue owner here

@melvin-bot melvin-bot bot removed the Overdue label Nov 29, 2023
@melvin-bot melvin-bot bot removed the Weekly KSv2 label Dec 25, 2023
Copy link

melvin-bot bot commented Dec 25, 2023

This issue has not been updated in over 15 days. @Skalakid, @roryabraham eroding to Monthly issue.

P.S. Is everyone reading this sure this is really a near-term priority? Be brave: if you disagree, go ahead and close it out. If someone disagrees, they'll reopen it, and if they don't: one less thing to do!

@melvin-bot melvin-bot bot added the Monthly KSv2 label Dec 25, 2023
@roryabraham
Copy link
Contributor

Moving this back to weekly. We are hoping to launch this feature in January, so we should start making moves here too. @Skalakid is going to be focused on inline video, so we should find someone else to tag in on this.

@roryabraham roryabraham added Weekly KSv2 and removed Monthly KSv2 labels Dec 27, 2023
@tomekzaw
Copy link
Contributor Author

tomekzaw commented Dec 28, 2023

Would be great to have this for V1 but currently JS bundle of Live Markdown parser is 188 kB so it's not a blocker I guess.

@roryabraham
Copy link
Contributor

I'm going to make this external and accept proposals for open-source contributors to work on this.

@roryabraham roryabraham added the External Added to denote the issue can be worked on by a contributor label Dec 28, 2023
@kidroca
Copy link
Contributor

kidroca commented Jan 4, 2024

I agree with the insights shared thus far. Here are my thoughts on the key issues:

The Predominant Impact of html-entities (Approximately 50%)

Given that ExpensiMark extensively utilizes the html-entities library through Str for encoding and decoding, the room for optimization might be limited. We're already on the latest version of html-entities. The alternatives, such as replacing or redeveloping it, pose significant risks and complexities, which may not justify the potential gains in bundle size reduction.

Potential Extraction from CONST (Roughly 7%)

Extracting elements out of CONST seems relatively straightforward. However, it's worth noting that the impact here is minimal – we're looking at a size reduction of about 5-7%. While it's a step in the right direction, it won't be a game-changer for the overall bundle size.

Addressing underscore Usage (Approximately 12%)

The suggestion to eliminate underscore usage from the file is viable, considering its limited use. However, an alternative approach could be more beneficial:

If the projects importing expensify-common use the same or a compatible version of underscore, there would be no additional burden on the bundle size. Rather than removing underscore, ensuring that dependent projects are aligned with the latest version could be a more effective strategy. This approach maintains functionality while optimizing for size.

Minification Considerations

Currently, the expensify-common project is distributed as pure source code, not as a pre-bundled library. This approach delegates the responsibilities of bundling, tree shaking, and minification to the consumer project. A potential area for exploration and improvement could be to update expensify-common to be distributed as a bundled library. This change could lead to more uniform consumption and potentially more efficient optimization processes across different projects.

@roryabraham
Copy link
Contributor

@kidroca I like the idea to just align the underscore version used in expensify-common and E/App. That seems like a very easy way to get a 12% bundle size reduction without much effort.

@melvin-bot melvin-bot bot removed the Help Wanted Apply this label when an issue is open to proposals by contributors label Jan 9, 2024
Copy link

melvin-bot bot commented Jan 9, 2024

📣 @fedirjh 🎉 An offer has been automatically sent to your Upwork account for the Reviewer role 🎉 Thanks for contributing to the Expensify app!

Offer link
Upwork job

@kidroca
Copy link
Contributor

kidroca commented Jan 9, 2024

@kidroca I like the idea to just align the underscore version used in expensify-common and E/App. That seems like a very easy way to get a 12% bundle size reduction without much effort.

It appears that E/App is already aligned with expensify-common in terms of the underscore version used:

  • E/App employs ^1.13.1, which, due to the caret (^) symbol, automatically resolves to the latest patch version within the minor version 1.13. As of now, this is 1.13.6.
    (You can see which version of underscore a project is using by running npm why underscore)
  • expensify-common is explicitly using version 1.13.6.

Given this setup, for broader compatibility with other projects, we could consider moving underscore to peerDependencies in expensify-common. This move would allow us to declare a wider range of acceptable versions, effectively bridging any version gaps among different projects.

Alternatively, if placing underscore under peerDependencies poses challenges, we could simply adjust the version specification in expensify-common to use the caret (^) prefix. This change would ensure that any patch within the current minor version (1.13.x) is compatible, thus maintaining flexibility for patch updates without necessitating major version changes. Such a strategy would align with semver principles and could facilitate easier maintenance and compatibility across projects.

@aswin-s
Copy link
Contributor

aswin-s commented Jan 9, 2024

@roryabraham Also not sure why we overlooked the removal of String.prototype.replaceAll. That amounts to 30KB (16% savings). Seems like that's more relevant than underscore since it is being shared with other packages.

https://bundlephobia.com/package/string.prototype.replaceall@1.0.8

@kidroca
Copy link
Contributor

kidroca commented Jan 9, 2024

I agree with @aswin-s's suggestion regarding the removal of the String.prototype.replaceAll polyfill. This change is straightforward and indeed offers a significant reduction in the bundle size

Removing the polyfill impacts the backward compatibility and potential technical debt for legacy systems. It might be beneficial to conduct a quick survey or analysis to determine how many such legacy projects are dependent on expensify-common and their current Node.js versions.

I am generally in favor of removing the polyfill, as it primarily affects projects running on very old versions of Node.js (older than 15), which are likely to be a minority if any

@roryabraham
Copy link
Contributor

@aswin-s sorry about that, I do agree with removing the replaceAll polyfill.

@aswin-s I'll co-assign this issue to you to open a PR to remove that polyfill, thanks.

Copy link

melvin-bot bot commented Jan 14, 2024

📣 @aswin-s 🎉 An offer has been automatically sent to your Upwork account for the Contributor role 🎉 Thanks for contributing to the Expensify app!

Offer link
Upwork job
Please accept the offer and leave a comment on the Github issue letting us know when we can expect a PR to be ready for review 🧑‍💻
Keep in mind: Code of Conduct | Contributing 📖

@aswin-s
Copy link
Contributor

aswin-s commented Jan 19, 2024

@roryabraham Not sure if this is the right place to ask, but since it also impacts the bundle size of ExpensiMark, I'll ask anyway.

We have a really large regex to match valid TLDs. Although it doesn’t significantly impact browser or desktop environments, it takes a considerable amount of time to execute on mobile devices. On Android devices, the TLD matching, which is part of markdown parsing, takes around 3 seconds to execute, thereby blocking the UI. The execution time increases exponentially with size of the matching text. I was wondering if there is a specific reason for having such stringent TLD matching, or if it's just been retained for historical reasons. Simplifying this to a shorter regex pattern that checks the validity of the TLD pattern, rather than checking explicit domains, would reduce the bundle size by around 10KB and significantly increase regex execution speed. I would be interested in hearing your thoughts on this.

Came across this while I was investigating performance issues in parsing markdown content in this issue.

@roryabraham
Copy link
Contributor

roryabraham commented Jan 19, 2024

@abekkala pulling you back in to help manage payments here. @kidroca invoices us separately but I think we may need to create and upwork job for @aswin-s

Edit: already done: #31183 (comment)

@aswin-s
Copy link
Contributor

aswin-s commented Jan 24, 2024

PR was merged on Jan 15th. This one is ready for payment.

@abekkala
Copy link
Contributor

@roryabraham I want to make sure I'm looking at the Correct PR.
The one above has not merged to production yet, correct?
The payment countdown starts after it's merged to production and there are no regressions.

@aswin-s The PR above Expensify/expensify-common#632 is not showing merged to prod yet.

once that happens the payment title for this GH will auto-update with the payment date

@aswin-s
Copy link
Contributor

aswin-s commented Jan 26, 2024

@abekkala PR is merged to master in expensify-common repository. Melvin seems to be tracking PRs from Expensify/App repository only.

@roryabraham Could you please confirm?

@roryabraham
Copy link
Contributor

Ah, sorry for the confusion here. @aswin-s can you please open an E/App PR to upgrade the expensify-common version we're using (otherwise the PR won't take effect). Then once that hit production we can issue payment.

@aswin-s aswin-s mentioned this issue Jan 28, 2024
50 tasks
@melvin-bot melvin-bot bot added Reviewing Has a PR in review Weekly KSv2 and removed Weekly KSv2 labels Jan 28, 2024
@fedirjh
Copy link
Contributor

fedirjh commented Jan 30, 2024

This PR got superseded by #34801. Expensify-commons has got upgraded to a more recent commit than the one mentioned in this PR.

It seems that expensify-common was upgraded in #34801.

@aswin-s
Copy link
Contributor

aswin-s commented Feb 7, 2024

Ah, sorry for the confusion here. @aswin-s can you please open an E/App PR to upgrade the expensify-common version we're using (otherwise the PR won't take effect). Then once that hit production we can issue payment.

@abekkala @roryabraham Above PR has already hit PROD last week.

@roryabraham
Copy link
Contributor

@abekkala can we please issue payment of $500 to @aswin-s for Expensify/expensify-common#632 ?

@abekkala
Copy link
Contributor

abekkala commented Feb 8, 2024

@roryabraham yes, I can send an upwork offer to @aswin-s
Was there someone who did the PR review (that needs payment as well)? or was that done internally?

@roryabraham
Copy link
Contributor

PRs for this issue were reviewed internally

@abekkala
Copy link
Contributor

@aswin-s payment sent and contract ended - thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Engineering External Added to denote the issue can be worked on by a contributor NewFeature Something to build that is a new item. Reviewing Has a PR in review Weekly KSv2
Projects
Status: Done
Development

No branches or pull requests

8 participants