This repository has been archived by the owner on Dec 19, 2020. It is now read-only.

Explore consolidation of projects to reduce overhead #88

Closed
wincent opened this issue Sep 18, 2019 · 9 comments

@wincent
Contributor

wincent commented Sep 18, 2019

This is just a place to reflect some discussion that we had about how to reduce the overhead of developing in, testing, and releasing our open source projects.

We notice, for example, two costs:

  1. Because projects are distributed across repositories, it is easy for dependencies to drift out of sync. When we eventually integrate projects in liferay-portal (or sooner, in a project that will eventually go to liferay-portal), we find duplication in our lockfiles. This swells our dependency footprint and can be quite hard to remedy because the dependencies are not all declared in one place.
  2. Releasing (and testing) projects with long dependency chains can be a pain (for example, releasing liferay-js-toolkit which contains a lot of packages, and then having to update liferay-npm-scripts etc). We have a number of these dependency chains in various places.
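
As a rough illustration of the first cost, the kind of duplication that shows up in a lockfile can be spotted by grouping resolved entries by package name. This is only a minimal sketch: the `LockEntry` shape, the `findDuplicates` helper, and the sample entries are all hypothetical and are not taken from any real lockfile or tool.

```typescript
// Hypothetical, simplified view of one resolved entry in a lockfile.
interface LockEntry {
  packageName: string;
  version: string;
}

// Group entries by package name and return only the packages that
// resolve to more than one distinct version.
function findDuplicates(entries: LockEntry[]): Record<string, string[]> {
  const versionsByName: Record<string, string[]> = {};
  for (const entry of entries) {
    const versions = versionsByName[entry.packageName] || [];
    if (versions.indexOf(entry.version) === -1) {
      versions.push(entry.version);
    }
    versionsByName[entry.packageName] = versions;
  }

  const result: Record<string, string[]> = {};
  for (const packageName of Object.keys(versionsByName)) {
    const versions = versionsByName[packageName];
    if (versions && versions.length > 1) {
      result[packageName] = versions.slice().sort();
    }
  }
  return result;
}

// Made-up sample data: two repos pinned different versions of "example-lib".
const sampleEntries: LockEntry[] = [
  { packageName: "example-lib", version: "1.2.0" },
  { packageName: "example-lib", version: "1.0.0" },
  { packageName: "other-lib", version: "3.1.4" },
];

const duplicates = findDuplicates(sampleEntries);
console.log(duplicates);
```

When every dependency is declared in one place, a check like this has a single lockfile to inspect instead of one per repository.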

The observation I'd like to make here is that:

pain = number of separate repositories × number of NPM packages

(Take that with a grain of salt: you could read it more loosely as "pain is proportional to")

In other words, we can reduce pain by reducing either of those factors; that is, by:

  • Consolidating repos (insofar as is practical) into a larger monorepo; or:
  • Avoiding externalizing dependencies as independent NPM packages when they could in fact be internal "dependencies" (that is, modules/files) within another project.

There are obvious costs to this (it is not free) — and limits too (there are some things that must always be independently consumable).

Example drawbacks:

  • As we merge repos together our issue tracker(s) may become noisier and harder to find things in.
  • "Branding" for projects may be diluted if they get housed alongside other projects (the value of this branding is a whole other topic that we could discuss).
  • Customers may have trouble knowing where to look and where to report problems (although having fewer issue trackers may actually help there).

Anyway, just wanted to get the discussion rolling. Some projects that we should start looking at are probably liferay-npm-tools, liferay-js-toolkit, and liferay-js-themes-toolkit, for starters. Not looking at Clay or AlloyEditor (etc) yet, although in theory, in the long term almost anything is possible.

@wincent wincent added the rfc label Sep 18, 2019
@wincent
Contributor Author

wincent commented Sep 18, 2019

Other topics along these lines of development efficiency may include the role of Lerna in some of our repos (whether it provides much value in light of Yarn workspaces; in the projects where we've removed it we are quite happy), and whether we should be versioning things en bloc or independently.

@wincent
Contributor Author

wincent commented Sep 17, 2020

When I originally wrote this I was mostly thinking about the dependency of packages in the liferay-npm-tools repo on packages in the liferay-js-toolkit, and the associated proposal was to merge those two repos into one.

Since then, things have become more entangled, as I was reminded here, when I had to troubleshoot duplication in the lockfile caused by:

  • liferay-npm-tools depending on stuff from liferay-js-toolkit
  • liferay-npm-tools depending on stuff from liferay-js-themes-toolkit
  • liferay-js-themes-toolkit depending on stuff from liferay-js-toolkit

So, in this case I was able to fix things by blowing away the lockfile and regenerating, but it's worth noting now that we've exposed ourselves to potentially higher costs when releasing in the future:

  • Previously, to integrate toolkit changes in DXP, we had to coordinate the release of the toolkit packages (30 packages), followed by releasing two packages from liferay-npm-tools (the preset + the scripts).
  • Now, in a worst case scenario, an update might require us to make releases of the toolkit, the themes toolkit, and then the scripts.

So my initial idea was a modest proposal to merge the "tools" and "toolkit" repos, but I'm starting to think that maybe we should be more aggressive about this and just create one big monorepo for all of our frontend projects. Basically all of the same arguments apply, but just more "big-ly": both the costs and the benefits would be larger (although to keep things in perspective, such a "monorepo" would be a tiny % of a project the size of DXP itself, or the monorepos famously run by big FAANG-style companies and their ilk).

Costs:

  • This is a big move, which would have to be done incrementally and would take a long time to complete.
  • Load on the issue tracker and pull requests queue would increase, requiring us to level-up our labeling, searching, and organizational practices.
  • "Branding" for individual projects would be deemphasized in favor of more of an overarching "Liferay Frontend" brand.
  • Branching model may be challenging due to impedance mismatch between projects that currently maintain multiple long-lived release branches (eg. alloyeditor v2/v3, toolkit v2/v3, themes toolkit v8/v9/v10) vs projects which just have one active branch (eg. tools, eslint-config-liferay etc). A project like Clay would officially fall into the first category (v2/v3) but in practice only v3 sees active development.

Benefits:

  • No more dependency update hell caused by cross-repo coupling.
  • Ability to standardize on linting, testing, building (etc) tooling across all projects.
  • Ability to standardize practices around security audits, releases and other processes.

@bryceosterhaus
Member

I don't interact with these projects often enough to have a big opinion on this. From a high-level view though, it seems worth it to at least attempt to consolidate our tooling like this. The branching problem gives me the most fear and hesitation, but I think it's at least worth a shot. I think this approach also helps visibility on our team as well, gives a more central place to view everything going on for portal infra rather than many interwoven npm dependencies that can be hard to track down across multiple repos.

@wincent
Contributor Author

wincent commented Sep 18, 2020

> The branching problem gives me the most fear and hesitation, but I think it's at least worth a shot.

Yep. And I have only some rough ideas for dealing with it at this point. 😬

Licensing is another factor that I forgot to mention: on the benefits side, we'd have a single place to audit our legal stuff; on the costs side, we'd have to think the details through carefully on how to merge projects because we currently have a mix (MIT, GPL, BSD mostly) — relicensing is an option but I'd prefer to defer that because it would likely be very process-heavy, so it would mostly be about preserving the individual licensing set-ups in the relevant subdirectories.

> I think this approach also helps visibility on our team as well, gives a more central place to view everything going on for portal infra rather than many interwoven npm dependencies that can be hard to track down across multiple repos.

This reminds me of something else I forgot to mention: we actually have a history of "stealth" packages that suffer from one or more of the following problems:

  • Not hosted under github.com/liferay organization; some are hosted on personal developer accounts, which makes them hard to discover.
  • Not correctly configured with the right ownership; this means that if somebody is unavailable or leaves the company (or worse), we may lose the ability to publish the package at all.
  • Some packages have been published at times without proper Git tags, or worse still, from a local developer machine without the corresponding artifacts ever being pushed to a public Git server.
  • Lacking correct metadata in npm registry that would enable us to track down sources in the event that the location is non-obvious (ie. due to one of the reasons above).

All of those problems would obviously be significantly improved if we had a monorepo.
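
One of those checks, that a package's metadata points back to a repository under the github.com/liferay organization, is easy to automate. A minimal sketch, assuming the standard `repository` field of `package.json`; the `hasLiferayRepository` helper and the sample manifests (including the `stealth-package` name and the personal-account URL) are hypothetical.

```typescript
// Subset of package.json relevant here; "repository" may be a plain
// string or an object with a "url" property.
interface PackageManifest {
  name: string;
  repository?: string | { type?: string; url: string };
}

// Returns true when the manifest's repository URL points at the
// github.com/liferay organization. Shorthand forms (for example
// "github:owner/repo") are deliberately not handled in this sketch.
function hasLiferayRepository(manifest: PackageManifest): boolean {
  const repository = manifest.repository;
  if (repository === undefined) {
    return false;
  }
  const repositoryUrl =
    typeof repository === "string" ? repository : repository.url;
  return repositoryUrl.indexOf("github.com/liferay/") !== -1;
}

// Hypothetical manifests covering the cases described above.
const manifests: PackageManifest[] = [
  {
    name: "well-behaved-package",
    repository: {
      type: "git",
      url: "https://github.com/liferay/liferay-frontend-projects.git",
    },
  },
  {
    name: "stealth-package",
    repository: "https://github.com/some-developer/stealth-package.git",
  },
  { name: "no-metadata-package" },
];

const flagged = manifests
  .filter((manifest) => !hasLiferayRepository(manifest))
  .map((manifest) => manifest.name);

console.log(flagged.join(", "));
// stealth-package, no-metadata-package
```

In a monorepo such a check could run once over every package directory, rather than relying on each repository to remember it.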

@izaera
Member

izaera commented Sep 18, 2020

I don't know what to say... Maybe: try it and see what happens.

The only bell that rings, in my case, is that, if we try this, we should do it after internal tools are all made publicly usable. I'll explain: the JS Toolkit, for example, has code to build customer projects (outside of portal's source tree) as well as internal ones (those in portal's source tree); however, npm tools only work for portal's source tree. Thus, the target user is different and the projects behave differently when it comes to issue handling and releases.

So, it's not only what happens because of having the code in one single source tree (which I think wouldn't be a problem), but also that we will have:

  • All the issues in one single place
  • All the releases in one single place
  • One wiki
  • ...

This can be good, but can be a nightmare too (for us and/or the customers). Imagine looking for a release in the releases tab. Or trying to classify the issue mails in our inboxes...

What I'm trying to say is -probably- that it's not only us, but also the rest of the world. How this is going to affect them...

Maybe we can try to do it in phases 🤔 For example: begin by merging the Themes, JS and npm tools (all SDK projects) into one. Then maybe merge some libs like Clay and senna (for instance) in another repo. See what happens, and then continue merging or stop at one point where we consider granularity is enough.

@jbalsas

jbalsas commented Sep 18, 2020

> So my initial idea was a modest proposal to merge the "tools" and "toolkit" repos, but I'm starting to think that maybe we should be more aggressive about this and just create one big monorepo for all of our frontend projects.

[image: thering]

One big monorepo to rule them all.

I would leave library-type projects on their own, such as Clay and soon-to-be-archived-AlloyEditor. We could debate others like liferay-frontend-ckeditor, but I think that one could probably have even more reasons to stay on its own.

I would suggest we start moving tooling projects into the liferay-npm-tools repo. We could try to find a better name for it like liferay-js-tools or liferay-frontend-tools...

We definitely need a branching strategy before starting, though, or we could find ourselves in big trouble some time from now if we need to patch "old" versions of the packages. I'd recommend we only consolidate current branches and keep the old repos around as they are for those older support branches to reduce the overhead and the need to update unnecessary code.

@wincent
Contributor Author

wincent commented Sep 18, 2020

[image: isildur]

I have some notes I made in a separate doc so as not to spam the thread too much. I'll polish them up into an actual detailed plan and share.

@wincent
Contributor Author

wincent commented Sep 21, 2020

Here's the draft of "the plan"... not super detailed as yet because some of the things are probably best figured out along the way, but I will try to carve out some time to do some experiments and make some aspects of it more concrete.

@wincent
Contributor Author

wincent commented Sep 24, 2020

I'm closing this. I'm part-way through the migration now (see https://github.com/liferay/liferay-frontend-projects), so if we want to talk more about "Exploring consolidation of projects to reduce overhead", we can do it in an issue over there... 😀

@wincent wincent closed this as completed Sep 24, 2020