"modular go fast" - Reducing task times in modular projects - Umbrella Issue ☂️ #62

threepointone · 2020-08-31T19:01:49Z

This is an umbrella task for all things related to reducing task times in modular projects.

As modular projects grow (as they should), because we do centralised tasks for build/start/test/etc, we will hit bottlenecks in being able to develop and deploy quickly. While this shouldn't affect daily development per se, it'll start affecting productivity as a whole. Some examples:

a one-line change in one package could take a very long time to get to production because it triggers a whole test/build cycle in CI (or whatever build infrastructure you're using).
a simple bugfix will have a very long turnaround time to get to production.
local development will run only some tests, but in CI it might take a very long time to run all tests, meaning a developer will have to wait for a long time to verify whether their code has broken anything.
modular start might take a long time to warm up, which isn't nice.
and so on.

(note: We should make a comprehensive list of pain points; the solutions won't be super general, so we should make sure we've looked at every possible pain point.)

(note: this issue is not about runtime performance of react applications, though we should probably make an umbrella task for that too)

Possible solutions and strategies:

caching third party dependencies, so it doesn't have to be pulled down on every build (existing work: https://circleci.com/docs/2.0/caching/, https://github.com/actions/cache)
webpack 5's module federation for builds https://webpack.js.org/concepts/module-federation/ We can split the build into pieces (and stitch them up manually, if at all) with this feature; it'll be key that we do this automatically, and without exposing the internals to consumers, or else it'll be hell to unwind later. Things to consider: deduplicating dependencies, verifying module graphs, etc.
webpack 5's persistent caching https://github.com/webpack/changelog-v5/blob/master/guides/persistent-caching.md Again, it'll be key to do this without exposing any internals to the user.
configure jest to only run affected tests on builds (https://jestjs.io/docs/en/cli#--changedsince, https://jestjs.io/docs/en/cli#--findrelatedtests-spaceseparatedlistofsourcefiles) A thing to note here is that we should be able to generate complete coverage reports even if we only run a few of the tests during a build. (Another precedent, in Java land: https://github.com/jpmorganchase/sandboni-core)
Use feature flags for rapid release cycles: we should be able to build and ship features rapidly, and be able to turn them on/off as we desire. (https://martinfowler.com/articles/feature-toggles.html, and my own writeup https://gist.github.com/threepointone/2c2fae0622681284410ec9edcc6acf9e)
lint/prettier only on changed files (copy react's yarn linc, basically): In the modular repo, we've set up linting only for changed files when we do commits, but not as a standalone command. We can copy react's linc command. We should also ship the commit behaviour and commands in generated repositories.
Incremental builds: This is fairly nascent in the production world for the javascript ecosystem, usually because solutions are tightly bound to serving infrastructure and so on. 'Big' companies like fb/google have in-house solutions for the same. There's an opportunity here to start work on a project that's designed for things like this from the start. (I hear whispers of parcel 2 also working on similar goals)
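Several of the items above (affected-only tests, lint on changed files, incremental builds) rest on the same question: given a change, which packages are affected? A minimal sketch of one way to answer it, assuming we already have a map of each workspace package to its direct workspace dependencies. The package names and the `affectedPackages` helper are hypothetical and not part of modular.

```javascript
// Hypothetical workspace graph: package name -> workspace packages it depends on directly.
const dependencies = {
  'core-utils': [],
  'ui-kit': ['core-utils'],
  'view-a': ['ui-kit'],
  'view-b': ['core-utils'],
  'view-c': [],
};

// Returns the changed packages plus every package that transitively depends on them.
function affectedPackages(graph, changed) {
  // Invert the graph: package name -> packages that depend on it directly.
  const dependents = new Map();
  for (const [name, deps] of Object.entries(graph)) {
    if (!dependents.has(name)) dependents.set(name, []);
    for (const dep of deps) {
      if (!dependents.has(dep)) dependents.set(dep, []);
      dependents.get(dep).push(name);
    }
  }

  // Breadth-first walk from the changed packages through their dependents.
  const affected = new Set();
  const queue = [...changed];
  while (queue.length > 0) {
    const name = queue.shift();
    if (affected.has(name)) continue;
    affected.add(name);
    queue.push(...(dependents.get(name) || []));
  }
  return [...affected].sort();
}

console.log(affectedPackages(dependencies, ['ui-kit']));
// [ 'ui-kit', 'view-a' ]
console.log(affectedPackages(dependencies, ['core-utils']));
// [ 'core-utils', 'ui-kit', 'view-a', 'view-b' ]
```

Note that a change to a widely shared package still marks almost everything as affected, so this narrows the common case but does not remove the need for a workable full run.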

Please feel free to add more to this list in the replies, and/or give feedback. I'll keep updating this list based on that. If you'd like to start work on any of these, please file a separate issue and link it back here.


sebinsua · 2020-09-15T22:05:19Z

Build Performance

I'm going to try my best to explain 'the problem' and possible approaches in this comment. Apologies if I'm re-explaining something that has already been said -- there's a lot of interconnection in my thoughts and I need to get the whole of it out in the open for criticism.

The problem

The basic problem is that while small projects often have respectable webpack build times, larger projects frequently end up with build times that are unbearably bad on CI and sometimes affect local DX.

Part of the issue is that, within the industry we work in (finance), we're often creating applications that are larger than those found in other industries. Firstly, these are web apps and not web sites, and even though the visible UI is quite often small, the business logic and functionality is quite often complicated due to regulations or user needs (each application could be 6+ months of a 5-person team's work). Secondly, and more importantly, we're often asked to build applications that aggregate other applications together into something a bit like an operating system, tiling window manager or dashboard. Given these two points, and the possibility that 500+ applications might end up being accessed from the same interface, it's very easy to end up with webpack builds that are either impossible or which massively slow down a team's ability to get work done in a timely manner.

This makes us heavy consumers of CI/compute infrastructure which sometimes can't be procured when our demands are too high. What tends to end up happening is that teams learn to develop and deploy their applications completely independently so they can have their own pipelines with their own metrics, and ensure that their build times don't get in the way of them doing work. As discussed previously, this causes horrible difficulty integrating and testing, significant wheel reinvention, bloated bundle sizes, and eventual terminal lock-in to specific versions of libraries (that could have bugs, vulnerabilities or performance deficiencies compared to what is state-of-the-art).

Another way that we are different is that we tend to not use traditional routers. Elsewhere in the industry it is common for developers to use libraries like React Router to code-split an application at the URL boundary, but here applications tend to mainly exist against one URL and be flexibly assembled from separate views into a layout.

The source code that describes this might look like:

import React, { lazy } from 'react';
import { Layout } from 'layout';

const viewsMap = {
  '@modular-app/view-0001': lazy(() => import('@modular-app/view-0001')),
  '@modular-app/view-0002': lazy(() => import('@modular-app/view-0002')),
  '@modular-app/view-0003': lazy(() => import('@modular-app/view-0003')),
  '@modular-app/view-0004': lazy(() => import('@modular-app/view-0004')),
  '@modular-app/view-0005': lazy(() => import('@modular-app/view-0005')),
  // ... many more views ... 
  //
  // If each application had a 2-minute long build time, by 
  // combining every application into one the build time
  // will increase to the point that either webpack fails
  // to build due to an OOM crash, or it takes so long
  // that engineers aren’t able to get their PRs merged
  // in a timely manner. 
  // 
  // ... many more views ...
  '@modular-app/view-0997': lazy(() => import('@modular-app/view-0997')),
  '@modular-app/view-0998': lazy(() => import('@modular-app/view-0998')),
  '@modular-app/view-0999': lazy(() => import('@modular-app/view-0999')),
  '@modular-app/view-1000': lazy(() => import('@modular-app/view-1000')),
}

export default function App({ visibleViews = {} }) {
  return (
    <Layout>
    {Object.entries(visibleViews).map(
      ([viewName, props]) => { 
        const View = viewsMap[viewName];
        return <View key={viewName} {...props} />;
      }
    )}
    </Layout>
  );
}

Effectively, the problem with bundling the source code above, is that while code-splitting allows you to only pay the runtime cost of the views you load, you still have to pay the build-time cost of all of the views pointed to by the page. This eventually stops being viable.

The impact of not solving the problem

Given the scale that we operate at and the type of applications we tend to create, Modular will not work for us in an acceptable way unless we solve this problem. It is a critical issue that must be resolved.

The risk isn’t just that builds would take too long. The risk is that we would cause out-of-memory (OOM) crashes. This was confirmed in another repository testing super-large webpack builds.

If all applications within our part of the company are built within a single repository, and we ignore this problem until we get OOMs, it would be catastrophic. It would block all PR merges and deployments for all UI Software Engineers.

High-level goals and non-functional requirements (NFRs)

We should be wary of the possibility of abrupt failures that are unrecoverable and unexpected. For example, if we depend on incremental compilation and there is a change to a core library the cache could get busted for every single application. In case this happens, there must be a way of doing a full rebuild without OOM crashes and in a reasonable amount of time.
If it turns out that we’ve made a bad architectural decision, we want others without a tonne of context to be able to untangle and back out of these decisions relatively easily. Therefore, we should avoid making users dependent on implementation details and unnecessarily bespoke tooling. Additionally, there are a number of modern tools that are significantly faster and that we might be interested in using; however, they are less widely used and could be more difficult to contribute to.
We should find a way of horizontally scaling this even if we think that in the short-term vertically scaling by increasing the spec of commodity hardware or using faster tools might suffice.

The approaches

We’ve looked into a number of different techniques and tools, including webpack 5’s persistent caching, Vite, Snowpack, smart ESM CDNs (discussed when we were considering ESM vs Module Federation), esbuild, Rush.js (and the interesting supporting libraries @rushstack/package-deps-hash and backfill), Nx, etc.

Build source code faster

esbuild is roughly 100x faster than other bundlers; however, that is presumably only the case if we were to use it for bundling, transpilation and minification. If we are still going to use webpack as our bundler, and Babel as the transpiler, then we will not get as significant a speed-up from esbuild.

That said, it would be beneficial to use it for minification and perhaps transpilation. A consideration if using it for transpilation is that it is written in Go and is the work of a single developer. It might be harder for engineers to contribute to, and presumably there are fewer people checking the implementation of transpilation rules at PR.

Do less work

Ideally, we’d not need to transpile or bundle at all — browsers would support modern syntax, and corporate networks wouldn’t cause us any trouble.

This is some of the promise of tools like Snowpack, which avoid transpilation and bundling, and ship ES Modules. It’s a good idea but in practice we don’t think the ecosystem is quite ready, and some of the solutions demand a more modern environment than we often find ourselves working with (for example, I’ve heard of HTTP/2+ being disabled on some corporate networks).

Incremental builds

NOTE: In order to have incremental builds, caching infrastructure and tooling will be mandatory.

Notably, Jenkins doesn’t come with caching tooling out of the box; however, for at least the last 3 years it has been table stakes in the most popular CI systems used in open source (CircleCI, GitHub Actions, Travis CI, Azure Pipelines, GitLab, etc). That’s true at least for caching node_modules/; for build outputs a team must presumably still configure a remote cache for storage.

The latest beta of webpack 5 has support for persistent caching, which would improve the speed of builds in the majority of cases by re-using previous work. I believe that this would need to be coupled to logic from Rushstack, Nx or Backfill to create a hash of what we use to build an application and to associate the build cache/outputs to this on CI.
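To make the "hash of what we use to build an application" idea concrete, here is a minimal sketch using only Node's built-in crypto module. It is not how @rushstack/package-deps-hash or backfill work internally; the `hashBuildInputs` and `cacheKeyFor` helpers, the input map and the key format are all assumptions made for illustration.

```javascript
const crypto = require('crypto');

// `inputs` is a hypothetical map of file path -> file contents covering everything
// that can influence the build output (sources, lockfile, build configuration).
function hashBuildInputs(inputs) {
  const hash = crypto.createHash('sha256');
  // Sort the paths so the result does not depend on object key order.
  for (const path of Object.keys(inputs).sort()) {
    // The NUL separators stop neighbouring values from running together.
    hash.update(path);
    hash.update('\0');
    hash.update(inputs[path]);
    hash.update('\0');
  }
  return hash.digest('hex');
}

// Hypothetical key format under which CI could store and look up build outputs.
function cacheKeyFor(appName, inputs) {
  return `build-${appName}-${hashBuildInputs(inputs)}`;
}

const key = cacheKeyFor('view-0001', {
  'package.json': '{"name":"@modular-app/view-0001"}',
  'src/index.js': 'export default function View() { return null; }',
  'yarn.lock': '# lockfile contents',
});
console.log(key);
```

If the key already exists in the remote cache, CI can restore the stored outputs instead of rebuilding; any change to an input produces a new key and therefore a fresh build.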

Unfortunately, CRA doesn’t yet support webpack 5 (although I did start some work towards this upgrade back in March). Since Next.js currently supports webpack 5, if required we should have the necessary context to finish any upgrade to CRA.

The big issue with depending on incremental builds for speed is that if the build gets too large and then we inadvertently bust the cache with a change to a core library, we could end up in a situation where there needs to be a full re-build but we can’t do this because it takes too long or in the worst case scenario crashes with OOM errors.

Build applications lazily

So far each option has presumed that you would build an application upfront during your CI process, but what if you could build application source code lazily at the point that it’s required?

Since in many cases ‘builds’ are never deployed it makes sense that we wouldn’t want to pay the cost of bundling JavaScript and assets until they are needed.

This is what smart CDNs like skypack.dev (previously cdn.pika.dev) do. To a certain extent, it’s also similar to how Next.js makes rendering lazy, although as mentioned below the earlier code example, we pay the build time cost of all lazy imports on a page even if they are never rendered.

This approach was also brought up by @threepointone on Twitter here.

Potentially, if we were to do this, we could look into rewriting the onDemandEntryHandler that is used by Next.js for on demand building of assets in development mode (and HMR).

There are probably cons to this approach that I haven’t considered. My immediate concerns are that (1) if there would have been transpilation/bundler errors during CI, we’ve pushed these to the runtime, and (2) if we don’t ‘warm’ these lazily built imports by building them at startup, we could end up with the runtime of the application being stalled by these requests as they are built in the background.
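A rough sketch of the lazy-build-with-warming idea, to show where the two concerns above would surface. This is not Next.js's onDemandEntryHandler; `createLazyBuilder` and the `build` callback are hypothetical stand-ins for whatever would actually bundle a single view.

```javascript
// `build` is a hypothetical function that bundles one view and returns
// (or resolves with) its output. Each view is built at most once per success.
function createLazyBuilder(build) {
  const builds = new Map(); // view name -> Promise of the build output

  function get(name) {
    if (!builds.has(name)) {
      // The executor runs synchronously, so the build starts immediately and
      // concurrent requests for the same view share one build.
      const promise = new Promise((resolve) => resolve(build(name)));
      // Concern (1): build errors now surface at request time. Forget failed
      // builds so that a later request can retry.
      promise.catch(() => builds.delete(name));
      builds.set(name, promise);
    }
    return builds.get(name);
  }

  // Concern (2): pre-build the views we expect to need so the first real
  // request does not stall while a bundle is produced.
  function warm(names) {
    return Promise.all(names.map((name) => get(name)));
  }

  return { get, warm };
}

let buildCount = 0;
const builder = createLazyBuilder(async (name) => {
  buildCount += 1;
  return `/* bundle for ${name} */`;
});

const first = builder.get('@modular-app/view-0001');
const second = builder.get('@modular-app/view-0001');
console.log(first === second, buildCount); // true 1
```

Calling `builder.warm([...])` at server startup with the most commonly used views would trade a slower boot for fewer stalled requests.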

Separate applications into multiple builds and join them together using module federation

You can read about module federation here. It’s a feature that allows you to split a build up into multiple build outputs which are then stitched into a single application at runtime. The main drawbacks are that (1) the integration is complicated enough that we could accidentally create lock-in if we’re not careful, (2) forgetting to share modules which contain singletons will break your application at runtime, and (3) because webpack is only considering parts of the module graph at a time and you must explicitly opt-in to sharing modules, vendoring and chunking are less effective and the bundle size will not be optimal.
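As an illustration of what generating this configuration automatically might look like, the sketch below derives option objects that could be passed to webpack 5's ModuleFederationPlugin (available as `require('webpack').container.ModuleFederationPlugin`). The `name`, `filename`, `exposes`, `remotes` and `shared` options are the plugin's own; the helper functions, the container naming scheme, the exposed `./View` entry, the `./src/index` path and the base URL are assumptions made for the example.

```javascript
// Container names become global variables at runtime, so turn a package name
// such as '@modular-app/view-0001' into an identifier-safe string.
function containerName(packageName) {
  return packageName.replace(/[^a-zA-Z0-9_]/g, '_').replace(/^_+/, '');
}

// Drawback (2) above: modules that hold singletons must be shared explicitly.
const shared = {
  react: { singleton: true },
  'react-dom': { singleton: true },
};

// Options for the build of a single view (a "remote").
function remoteOptions(packageName) {
  return {
    name: containerName(packageName),
    filename: 'remoteEntry.js',
    exposes: { './View': './src/index' }, // hypothetical entry point
    shared,
  };
}

// Options for the application that stitches the views together (the "host").
function hostOptions(packageNames, baseUrl) {
  const remotes = {};
  for (const packageName of packageNames) {
    const name = containerName(packageName);
    // Format: '<container name>@<URL of that container's remote entry file>'
    remotes[name] = `${name}@${baseUrl}/${name}/remoteEntry.js`;
  }
  return { name: 'host', remotes, shared };
}

const views = ['@modular-app/view-0001', '@modular-app/view-0002'];
console.log(remoteOptions(views[0]));
console.log(hostOptions(views, 'https://cdn.example.com'));
```

With options like these the host would load a view through `import('modular_app_view_0001/View')`. Generating the options from the package list keeps the wiring out of user code, which is the lock-in concern raised in drawback (1).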

Closing Remarks

Combining multiple approaches together might be the best approach.

During a discussion with @NMinhNguyen he mentioned that we could use a pattern like import(/* moduleFederation: true */ 'view') to opt-in to building an application independently from the main application without needing special webpack configuration. Could we achieve this without needing to use comments that meaningfully change runtime behaviour? Could we do this automatically without needing to specify this? Could a server then lazily build each of these imports at import time, similarly to how Next.js builds a page URL on demand when in development mode, or do we need to find a way to parallelise on CI? Would this be too slow for a server request unless each import / view was pre-warmed? Could individual module federated views share the same cache or would they each have their own?

I’m unsure about the right solution and it could be very different from what I am envisaging but I do have a few opinions:

We should probably use webpack as a base and not move to ESM-first tooling yet. (It’s the conservative option and allows us to benefit from the mature ecosystem; it might be the wrong choice if Rollup had a dev server (see nollup's reason for existing), or if Snowpack/Vite were getting more buy-in from other companies.)
We should make sure that we have caching tooling/infrastructure that we can persist build outputs and caches into.
Module federation could help us avoid a situation in which the webpack module graph gets so large that it causes an OOM crash.
We could look into swapping Terser and Babel out for esbuild since this would allow us to scale for longer without implementing everything else.

sebinsua · 2020-10-01T13:05:29Z

^ This is a bit of a long comment so I'm going to attempt to prioritise and split off the ideas into separate GitHub issues.

With regards to 'Build Performance' I would personally prioritise the work as follows:

Incremental builds
Build source code faster
Separate applications into multiple builds and join them together using module federation
Build applications lazily
Do less work

(We might not need to do all of these things immediately of course.)

threepointone added the discussion (Discussion topic) and enhancement (New feature or request) labels Sep 6, 2020

threepointone changed the title ~~[stub] incremental builds / persistent caching / "modular go fast"~~ "modular go fast" Sep 10, 2020

elischutze changed the title ~~"modular go fast"~~ "modular go fast" - Reducing task times in modular projects - Umbrella Issue ☂️ Sep 10, 2020

sebinsua mentioned this issue Oct 1, 2020

Self-hosting the project / dog-fooding #120

Closed

sebinsua mentioned this issue Oct 6, 2020

Build & Test Caching / Incremental Builds / "Modular Cloud" (Remote Computation Cache) #121

Closed

LukeSheard closed this as completed Jul 21, 2021
