How to accelerate modernization of the ecosystem #513

Closed
bajtos opened this issue Feb 21, 2022 · 7 comments
Labels
package-maintenance-agenda Agenda items for package-maintenance team

Comments

@bajtos

bajtos commented Feb 21, 2022

The Node.js ecosystem has a major problem: many of the most popular npm packages don't have adequate maintenance activity. They are not keeping up with the new Node.js features.

Besides the common OSS sustainability issues, I see a strong connection with the culture of Node.js core in 2012-2014. Around 2013/2014, the Node.js project was in a terrible state. The number of users was exploding while the number of active maintainers was declining. Node.js was running in many production envs, yet its version was still pre-1.0 (0.10, to be specific).

The ecosystem was mirroring the situation in Node.js core. Many of the most popular packages were maintained by less than a handful of developers. They were quickly getting to the point where they could not sustain support load from the fast-growing number of users.

In 2014, the io.js fork happened. It was a big earthquake and took a lot of effort from many people and companies to clean up. Eventually, the new Node.js project emerged with more open contribution & governance policies.

Unfortunately, there was no such event in the npm ecosystem.

Many packages that became popular in the Node.js 0.8-0.10 days are still stuck in the mindset of the Node.js project of that era. Too many users, too few active maintainers, no hope for improvement.

This poses many risks and issues:

  • If a security vulnerability is reported in a popular dependency, how quickly will a patched version be released?
  • Many packages are migrating to ES Module style only, dropping support for CommonJS consumers. Packages written in CommonJS are going to stop receiving updates (including security fixes) from dependencies that moved to ESM.
  • Node.js core features like AsyncLocalStorage and diagnostics_channel require a bit of help from modules to work correctly. E.g. packages implementing thread pools (DB clients) and custom queues (legacy Promise implementations) need to restore async context.
  • Many of the popular packages have very sub-optimal performance (e.g. Winston, Bunyan, but also Express). Modern alternatives (e.g. Pino) have much less overhead.
  • etc.
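
To make the AsyncLocalStorage point concrete, here is a minimal sketch of a user-land queue that loses async context and of one way to restore it. `CallbackQueue`, `observeStore` and the `preserveContext` option are hypothetical names made up for this illustration; `AsyncLocalStorage` and `AsyncResource.bind()` are the Node.js core APIs. Real packages (DB clients, Promise libraries) will need to adapt this to their own scheduling code.

```javascript
const { AsyncLocalStorage, AsyncResource } = require('node:async_hooks');

const requestContext = new AsyncLocalStorage();

// A hypothetical user-land queue: callbacks are stored now and invoked later
// from whatever context happens to call drain().
class CallbackQueue {
  constructor({ preserveContext }) {
    this.preserveContext = preserveContext;
    this.callbacks = [];
  }

  push(callback) {
    // AsyncResource.bind() captures the async context active at enqueue time
    // and restores it when the callback eventually runs.
    this.callbacks.push(
      this.preserveContext ? AsyncResource.bind(callback) : callback
    );
  }

  drain() {
    for (const callback of this.callbacks.splice(0)) callback();
  }
}

function observeStore(queue) {
  let seen;
  requestContext.run({ requestId: 'req-1' }, () => {
    queue.push(() => {
      seen = requestContext.getStore();
    });
  });
  queue.drain(); // runs outside of requestContext.run()
  return seen;
}

const withoutBind = observeStore(new CallbackQueue({ preserveContext: false }));
const withBind = observeStore(new CallbackQueue({ preserveContext: true }));
console.log(withoutBind); // undefined: the context was lost
console.log(withBind); // { requestId: 'req-1' }
```

The only change a queue needs is the `AsyncResource.bind()` call at enqueue time, which is the "bit of help from modules" mentioned above.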

So, what can Node.js users do about it? Vote with their feet and choose dependencies backed by an active & diverse community. This applies not only to modules listed in project dependencies. It's even more important to show modern tools in the documentation and code examples.

Unfortunately, this modernization is progressing slowly. There is a kind of vicious cycle: a lot of existing content shows legacy packages, so people assume that's what everybody uses. As a result, they show the same legacy packages when they write new content (blog posts, user manuals). Here is a semi-recent example: https://medium.com/airtable-eng/investigating-node-js-performance-event-loop-and-network-i-o-part-2-e9d1a8d4da8a

I'd like to open a discussion on how the Node.js project can help to accelerate modernization efforts.

Let's keep the discussion constructive and positive.

  • We should acknowledge the role of projects like Express, Bluebird and Winston in the history of the Node.js ecosystem. Node.js would not have reached its current popularity without them.
  • Please stay respectful to the maintainers of these projects. Working in a popular open-source project is extremely demanding, and I am grateful for all the effort they have been putting into their projects.
  • Let's focus on where we want to get (and why), not on what we want to leave behind.

References

https://twitter.com/bajtos/status/1493193807621468165
(The Twitter thread that started this whole idea.)

https://twitter.com/matteocollina/status/1483905860476801024

most people still code Node like it is 2013.

https://twitter.com/bengl/status/1484609453685104643

Back in continuation-local-storage days, maintainers were not always keen to fix CLS support because CLS was not a part of Node.js core. Now that AsyncResource is official, I think it's reasonable to expect all modules to support it.

Maybe, but also most of the modules where this really matters are abandonware, but still in heavy use in the wild.

Data points

| package | weekly downloads, Jan 2022 | weekly downloads, Feb 2022 | notes |
|---|---|---|---|
| bluebird | 23.8M | 25.6M | No release since 2019 |
| request | 20.6M | 21.6M | Officially deprecated since 2019 |
| express | 22.1M | 24.7M | Last semver-minor version in 2019 |
| winston | 9.6M | 10.6M | |

@bajtos
Author

bajtos commented Feb 21, 2022

A few additional thoughts.

I think the problem is primarily organizational, not technical. There are already good modern alternatives to old packages: Fastify and Nest.js for Express, Pino for Winston & Bunyan, native Promises and async functions for Bluebird and q. We are missing education on why these alternatives should be preferred, the risks of not migrating, and detailed migration guides (preferably with some automation).

I see a few areas/directions to explore:

User-land Promise implementations

Are there any valid use cases where packages like Bluebird and q would be a better choice than native Promises?

If there are none, then can we work with their maintainers to officially deprecate these packages? We could convince them this is the best action they can take for their users, and help them document the reasons for deprecation and the recommended migration paths. Maybe we could also help write code-mods to automatically migrate the most popular usage patterns?

We can also identify the most popular packages using user-land Promises and contribute pull requests to move to native promises. IMO, writing such pull requests could be a great opportunity for people who would like to get involved with Node.js & OSS, but don't know where to start.
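
As a rough idea of what such a migration pull request might look like, here is a sketch that maps a few common Bluebird helpers to Node.js core equivalents. `readSetting` and `loadSettings` are hypothetical stand-ins for legacy code; the Bluebird calls are shown only in comments. This covers a handful of patterns, not the full Bluebird API, and it assumes Node.js 15 or newer for `timers/promises`.

```javascript
const { promisify } = require('node:util');
const { setTimeout: delay } = require('node:timers/promises');

// A hypothetical callback-style API standing in for legacy code.
function readSetting(name, callback) {
  process.nextTick(() => callback(null, `${name}=on`));
}

// Bluebird: Promise.promisify(readSetting)
const readSettingAsync = promisify(readSetting);

async function loadSettings(names) {
  // Bluebird: Promise.delay(10)
  await delay(10);
  // Bluebird: Promise.map(names, (name) => readSettingAsync(name))
  // Note: Promise.all() has no equivalent of Bluebird's `concurrency` option.
  return Promise.all(names.map((name) => readSettingAsync(name)));
}

loadSettings(['cache', 'gzip']).then((settings) => {
  console.log(settings); // [ 'cache=on', 'gzip=on' ]
});
```

Patterns that rely on Bluebird-only behavior (cancellation, the `concurrency` option) would still need case-by-case handling, which is where a written migration guide would help.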

Loggers & frameworks

TBH, I am not sure how to approach this area.

On one hand, we don't want the Node.js project to categorize packages as good and bad. On the other hand, I think the current status quo means that in 10 years' time (see next-10 initiative), developers will still be building Express applications supporting HTTP/1.1 only.

A lot of Express's popularity in terms of download numbers comes from highly visible projects like webpack-dev-server, apollo-server and gatsby using Express as the underlying HTTP framework. Would it be sensible to help them migrate to something else?

When it comes to dev servers specifically, is it actually necessary for them to use a framework? Maybe Node.js tooling for building HTTP servers is too low level? If we added a few missing building blocks like routing, then maybe more tools could build directly on top of Node.js core?

This could help with code examples in docs too.

Take the blog post Investigating Node.js Performance as an example. The code is essentially this:

async function requestHandler({ requestIndex, req, res }) {
  const serializedBigObject = JSON.stringify(bigObject);

  res.on("finish", () => {
    // report duration    
  });

  res.send(serializedBigObject);
}

app.get("/", async (req, res) => {
  const requestIndex = ++requestCount;
  requestHandler({ requestIndex, req, res });
});

IIUC, the snippet needs two features: routing and a helper for sending JSON responses.
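
For a sense of scale, here is a minimal sketch of those two building blocks written directly on top of `node:http`. `sendJson` and `createRouter` are hypothetical helpers, not existing Node.js APIs, and `bigObject` is a placeholder for the blog post's payload. The router only does exact-match lookups; path parameters, middleware and error handling are deliberately left out.

```javascript
const http = require('node:http');

// Hypothetical helper: serialize a value and send it as a JSON response.
function sendJson(res, statusCode, value) {
  const body = JSON.stringify(value);
  res.writeHead(statusCode, {
    'Content-Type': 'application/json; charset=utf-8',
    'Content-Length': Buffer.byteLength(body),
  });
  res.end(body);
}

// Hypothetical router: exact-match lookup keyed by "METHOD path".
function createRouter(routes) {
  return function handle(req, res) {
    const { pathname } = new URL(req.url, 'http://localhost');
    const handler = routes[`${req.method} ${pathname}`];
    if (!handler) return sendJson(res, 404, { error: 'Not Found' });
    return handler(req, res);
  };
}

const bigObject = { items: [1, 2, 3] }; // placeholder for the post's payload
let requestCount = 0;

const server = http.createServer(
  createRouter({
    'GET /': (req, res) => {
      requestCount += 1;
      res.on('finish', () => {
        // report duration here
      });
      sendJson(res, 200, bigObject);
    },
  })
);

// To try it out: server.listen(3000), then request http://localhost:3000/
```

Whether helpers of this size belong in core or in a tiny dependency is exactly the open question; the sketch only shows that the gap for simple examples is small.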

Now I know the Node.js philosophy is (or used to be) to keep the core minimal, so a proposal to add more features is likely to be controversial. On the other hand, other platforms like Go offer built-in HTTP server routing that's good enough for simpler use cases. One advantage of a larger stdlib is a reduced surface for supply-chain attacks. There is also prior art in small helpers like events.once, where one can argue they are not necessary in the Node.js core either, and yet they were added.

@BethGriggs
Member

I think this group will have a lot to discuss on this topic - thank you for starting the discussion.

My initial thoughts were mostly in the context of Express:

We are missing education on why these alternatives should be preferred

In the past, I've heard from a number of development teams who even when presented with the concerns and limitations of Express, are very happy using it and would still opt to build new projects with it over another framework. As long as it continues to get timely security fixes, there seems to be an 'if it ain't broke, don't fix it' mindset. While it's not the conclusion many of us would come to when presented with the concerns and limitations, I think we should be mindful that it's a completely reasonable conclusion for their organisation/usage/situation. For this reason, I do question how strong our guidance can/should be.

One aspect that I am also conscious of is that there are ecosystem maintainers that could potentially be disadvantaged by any actions we take (even as far as a financial/livelihood impact if they're relying on a form of sponsorship).

Let's focus on where we want to get (and why), not on what we want to leave behind.

I think that's a great way to drive the initial discussion, as it should help us determine what the scope of this team's involvement should be in this problem space. I'll add to the agenda for the next meeting, but hoping we'll get some async discussion before then.

@BethGriggs BethGriggs added the package-maintenance-agenda Agenda items for package-maintenance team label Feb 21, 2022
@bajtos
Author

bajtos commented Feb 21, 2022

request (HTTP clients)

I find it particularly worrisome that request has ~20M weekly downloads, despite the fact that it was deprecated two years ago and every npm install prints deprecation notices.

When we (the LoopBack team) were evaluating request alternatives back in 2019/2020, there was no clear winner. At that time, both got and axios were converting low-level errors from Node.js to FetchError in a way that discarded important low-level details. Axios had a brief period with no active maintainers back in 2019, and it still hasn't reached v1.0 today.

I can imagine this situation makes it difficult for package maintainers to upgrade their HTTP client from request to a supported one.

I am happy to see fetch coming to Node.js core, I think it could become the default choice for users moving away from request. However, we are far from the point where native fetch would be usable.

  1. ATM, it's an experimental feature behind a CLI flag.
  2. Well-behaved packages support all major LTS versions of Node, which means they won't be able to use native fetch until Node.js 18 becomes the oldest supported version (May 2024).
  3. The current fetch implementation based on undici does not support HTTP/2. This could be a blocker.
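
Until native fetch is available in every supported Node.js version, a package could feature-detect it and fall back to the core HTTP client. The following is a minimal sketch under that assumption. `getText` and `getTextWithCoreHttp` are hypothetical names; the fallback handles neither redirects nor timeouts, so it is not a drop-in replacement for request.

```javascript
const http = require('node:http');
const https = require('node:https');

// Fallback built on node:http / node:https for runtimes without global fetch.
function getTextWithCoreHttp(url) {
  const client = new URL(url).protocol === 'https:' ? https : http;
  return new Promise((resolve, reject) => {
    client
      .get(url, (res) => {
        if (res.statusCode < 200 || res.statusCode >= 300) {
          res.resume(); // discard the body so the socket is released
          reject(new Error(`Unexpected status ${res.statusCode}`));
          return;
        }
        res.setEncoding('utf8');
        let body = '';
        res.on('data', (chunk) => {
          body += chunk;
        });
        res.on('end', () => resolve(body));
      })
      .on('error', reject);
  });
}

// Prefer the global fetch when the runtime provides it.
async function getText(url) {
  if (typeof globalThis.fetch !== 'function') {
    return getTextWithCoreHttp(url);
  }
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Unexpected status ${res.status}`);
  return res.text();
}
```

Maintaining two code paths like this has a real cost, which is part of why the oldest supported LTS version matters so much for adoption.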

On the other hand, I think the situation with HTTP clients is easier to navigate since request has been officially deprecated by its maintainers, so this initiative should do them no harm.

@ljharb
Member

ljharb commented Feb 21, 2022

I think the term “modernization” is a bit problematic; for example that term is often thrown around to try to imply CJS is somehow less modern or more legacy than ESM (it is not).

as for web server frameworks, simply using newer language features isn’t necessarily an improvement - for example, things that used sync generators for async request results, or things that use Promises for something that can fulfill more than once, etc.
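
A small sketch of the last point, assuming a source that produces more than one value: a Promise only ever reports the first one, while an EventEmitter delivers all of them. The variable names are made up for this illustration.

```javascript
const { EventEmitter } = require('node:events');

// A Promise settles once; later resolve() calls are silently ignored.
const firstOnly = new Promise((resolve) => {
  resolve('first');
  resolve('second'); // no effect
});

// An EventEmitter can deliver any number of values to its listeners.
const emitter = new EventEmitter();
const received = [];
emitter.on('value', (value) => received.push(value));
emitter.emit('value', 'first');
emitter.emit('value', 'second');

firstOnly.then((value) => {
  console.log(value); // first
  console.log(received); // [ 'first', 'second' ]
});
```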

@thescientist13
Contributor

Yeah, I think less of a focus on technology / API choice and more of a set of hints / signals to look for when evaluating a project seems a reasonable way to approach this from this group's perspective.

So less about using http vs fetch but more about evaluating a package from the perspective of

  • is deprecated
  • uses non registry URLs in the dependency graph
  • etc

As already called out, "modern" can be subjective, but risk can be more objective, and it is something package managers could help with (deprecations, non-registry tarballs, etc.). A group can then flesh out how to weigh the more subjective measures:

  • release / repo activity
  • number of dependencies
  • etc
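
To illustrate how signals like these could be collected mechanically, here is a sketch that inspects an object shaped like npm registry metadata (`dist-tags`, `versions`, `time`). `collectRiskSignals` is a hypothetical function, the sample data is hand-written rather than taken from a real package, and the regular expression only catches the most obvious non-registry specifiers (for example, it misses the `user/repo` GitHub shorthand).

```javascript
// Dependency specifiers that bypass the registry (git, tarball URL, local path).
const NON_REGISTRY_SPEC = /^(git\+|git:|github:|https?:|file:)/;

function collectRiskSignals(packument, now = new Date()) {
  const latestVersion = packument['dist-tags'].latest;
  const latest = packument.versions[latestVersion];
  const dependencies = latest.dependencies || {};
  const lastPublished = new Date(packument.time[latestVersion]);
  const msPerDay = 24 * 60 * 60 * 1000;

  return {
    deprecated: Boolean(latest.deprecated),
    nonRegistryDependencies: Object.keys(dependencies).filter((name) =>
      NON_REGISTRY_SPEC.test(dependencies[name])
    ),
    dependencyCount: Object.keys(dependencies).length,
    daysSinceLastRelease: Math.floor((now - lastPublished) / msPerDay),
  };
}

// Hand-written sample shaped like npm registry metadata; not real data.
const sample = {
  'dist-tags': { latest: '2.0.0' },
  time: { '2.0.0': '2019-01-01T00:00:00.000Z' },
  versions: {
    '2.0.0': {
      deprecated: 'no longer maintained',
      dependencies: {
        'left-helper': '^1.0.0',
        'forked-lib': 'git+https://example.com/forked-lib.git',
      },
    },
  },
};

console.log(collectRiskSignals(sample, new Date('2022-01-01T00:00:00.000Z')));
```

The function only reports raw signals; how to weigh them (for instance, what counts as "too long" since the last release) is the subjective part a group would still need to agree on.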

@thescientist13
Contributor

thescientist13 commented Jun 21, 2022

From our meeting today, @dominykas highlighted that we have a pretty solid and thorough deprecation guidelines doc in this repo, and in reviewing it I was happy to see it captured a lot of good information and recommendations around how users of packages can self-audit a package / repository and pick up on any indicators / signals that may stand out from a risk assessment standpoint. 🎣

@bajtos
Would you (or anyone) be open to reviewing the doc and seeing if there are some improvements that can be generalized from your experiences evaluating projects in OSS so that others may learn from them as well? I think a PR into that doc that doesn't necessarily need to call out any one project in particular but still provides suggestions and indicators that we could all review, would be really useful.

Then together we could all make sure to socialize that doc more and ensure we have a well informed user base of NPM! 🤝

@mhdawson
Member

We discussed this in the package maintenance meeting today. It is an important conversation, but we agreed it is probably best handled through updates to the deprecation guidelines and/or discussion of more specific topics in separate issues.


5 participants