deps: swap minimatch with micromatch #53547

Closed
danielbayley wants to merge 16 commits

Conversation

danielbayley
Contributor

@danielbayley danielbayley commented Jun 22, 2024

As mentioned in #51912 (comment) by @jonschlinkert:

I see that this is actually depending on minimatch, I would have preferred to see picomatch or micromatch given that minimatch is subject to catastrophic backtracking, and both picomatch and micromatch are more performant, support more patterns, and are better tested. Picomatch also has a tokenizer that can be used. Is there a reason minimatch is being used or can we do a PR to swap in micromatch?

Here are the benchmarks.

This PR aims to implement what should be a drop-in replacement of minimatch with micromatch, for reasons stated above.

This might be a notable-change?

@nodejs-github-bot
Collaborator

Review requested:

  • @nodejs/actions
  • @nodejs/security-wg
  • @nodejs/tsc

@nodejs-github-bot nodejs-github-bot added dependencies Pull requests that update a dependency file. meta Issues and PRs related to the general management of the project. needs-ci PRs that need a full CI run. labels Jun 22, 2024
@danielbayley danielbayley marked this pull request as draft June 22, 2024 14:23
@danielbayley
Contributor Author

danielbayley commented Jun 22, 2024

I’m not entirely certain why I am getting TypeError: Missing internal module 'internal/deps/./lib/picomatch', or where exactly the /./ section of the path is being introduced?

The tests pass if I do node --test test/parallel/test-fs-glob.mjs, but not ./node --test test/parallel/test-fs-glob.mjs or tools/test.py test/parallel/test-fs-glob.mjs

@RedYetiDev
Member

RedYetiDev commented Jun 22, 2024

You always need to use full paths, starting from internal/.... Relative paths don't work internally.

(That being said I haven't checked the source of the error, but rather the message itself, so it's possible a different issue is causing the error)

@RedYetiDev RedYetiDev added meta Issues and PRs related to the general management of the project. and removed meta Issues and PRs related to the general management of the project. labels Jun 22, 2024
@benjamingr
Member

benjamingr commented Jun 22, 2024

First of all thanks for your contribution.

cc @MoLow

A few issues:

  • First of all, we mostly don't add (JavaScript) deps without a blessing from their maintainers as it changes the project dynamic. So I'd like @jonschlinkert 's blessing before this is explored (like we got from minimatch's author before adding it as a dep)
  • Second, are there any caveats or stuff that stops working when we change the libraries? Were you able to run the tests and make sure they pass?

(As a side note https://github.com/jonschlinkert/maintainers-guide-to-staying-positive @jonschlinkert this is neat)

@benjamingr
Member

The tests pass if I do node --test test/parallel/test-fs-glob.mjs, but not ./node --test test/parallel/test-fs-glob.mjs…

That means the tests broke :] node is your globally installed Node.js, ./node is the one you built.

You always need to use full paths, starting from internal/.... Relative paths don't work internally.

Note that internally, Node.js does not use the same loader for its own code. JavaScript in core gets bundled into the Node executable itself - so not all CommonJS loader options are supported (check the code under lib/ for examples)
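
To confirm which binary actually ran a test, a tiny script like the one below can help. This is only a sketch: the file name `which-node.js` is hypothetical, and it relies on nothing beyond the standard `process` global.

```javascript
'use strict';

// Absolute path of the executable running this script, for example the
// globally installed Node.js versus a ./node built from the source tree.
console.log('binary :', process.execPath);

// Version strings are another quick way to tell two builds apart.
console.log('version:', process.version);
console.log('v8     :', process.versions.v8);
```

Running it once as `node which-node.js` and once as `./node which-node.js` should print two different paths if the global install and the local build differ.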

@danielbayley
Contributor Author

  • First of all, we mostly don't add (JavaScript) deps without a blessing from their maintainers as it changes the project dynamic. So I'd like @jonschlinkert 's blessing before this is explored (like we got from minimatch's author before adding it as a dep)

@benjamingr Sure, he literally requested it though 😆

@benjamingr
Member

Ah lol, I didn't realize he left the original comment :D

@danielbayley
Contributor Author

danielbayley commented Jun 22, 2024

You always need to use full paths, starting from internal/.... Relative paths don't work internally.

Note that internally, Node.js does not use the same loader for its own code. JavaScript in core gets bundled into the Node executable itself - so not all CommonJS loader options are supported (check the code under lib/ for examples)

Ok, so am I right in thinking we are talking about the relative paths within deps/micromatch/* now, and that’s why esbuild --bundle --platform=node is necessary in the update-micromatch.sh script?

# update-micromatch.sh

"$NODE" "$NPM" pkg set scripts.node-build="esbuild ./index.js --bundle --platform=node --outfile=index.js --allow-overwrite"

@MoLow
Member

MoLow commented Jun 22, 2024

if the motivation for this change is mainly performance can you please add a benchmark so we can compare the numbers?

@RedYetiDev
Member

RedYetiDev commented Jun 22, 2024

if the motivation for this change is mainly performance can you please add a benchmark so we can compare the numbers?

FWIW According to https://github.com/micromatch/picomatch#benchmarks:

# .makeRe star (*)
  picomatch x 4,449,159 ops/sec ±0.24% (97 runs sampled)
  minimatch x 632,772 ops/sec ±0.14% (98 runs sampled)

# .makeRe star; dot=true (*)
  picomatch x 3,500,079 ops/sec ±0.26% (99 runs sampled)
  minimatch x 564,916 ops/sec ±0.23% (96 runs sampled)

# .makeRe globstar (**)
  picomatch x 3,261,000 ops/sec ±0.27% (98 runs sampled)
  minimatch x 1,664,766 ops/sec ±0.20% (100 runs sampled)

# .makeRe globstars (**/**/**)
  picomatch x 3,284,469 ops/sec ±0.18% (97 runs sampled)
  minimatch x 1,435,880 ops/sec ±0.34% (95 runs sampled)

# .makeRe with leading star (*.txt)
  picomatch x 3,100,197 ops/sec ±0.35% (99 runs sampled)
  minimatch x 428,347 ops/sec ±0.42% (94 runs sampled)

# .makeRe - basic braces ({a,b,c}*.txt)
  picomatch x 443,578 ops/sec ±1.33% (89 runs sampled)
  minimatch x 107,143 ops/sec ±0.35% (94 runs sampled)

# .makeRe - short ranges ({a..z}*.txt)
  picomatch x 415,484 ops/sec ±0.76% (96 runs sampled)
  minimatch x 14,299 ops/sec ±0.26% (96 runs sampled)

# .makeRe - medium ranges ({1..100000}*.txt)
  picomatch x 395,020 ops/sec ±0.87% (89 runs sampled)
  minimatch x 2 ops/sec ±4.59% (10 runs sampled)

# .makeRe - long ranges ({1..10000000}*.txt)
  picomatch x 400,036 ops/sec ±0.83% (90 runs sampled)
  minimatch (FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory)

@MoLow
Member

MoLow commented Jun 23, 2024

Yes, those benchmarks exist, but I expect this to come first with benchmark files inside this project so we can use the same methodology and infrastructure we use for other benchmarks in the project

@danielbayley danielbayley changed the title deps: swap minimatch with picomatch deps: swap minimatch with micromatch Jun 23, 2024
@danielbayley
Contributor Author

if the motivation for this change is mainly performance

Not only performance, but also other reasons, best outlined here: Why use micromatch? It’s basically a faster, more reliable superset of minimatch. My personal motivation is that picomatch (and micromatch if/when {brace,expansion} is needed) is my go-to glob library, so having it in core is ideal.

I realised we should use micromatch here over pico, to maintain brace expansion… See also: library-comparisons.


add a benchmark so we can compare the numbers?

FWIW According to https://github.com/micromatch/picomatch#benchmarks

@RedYetiDev See also micromatch benchmarks:

# .makeRe star
  micromatch x 2,232,802 ops/sec ±2.34% (89 runs sampled)
  minimatch x 781,018 ops/sec ±6.74% (92 runs sampled)

# .makeRe star; dot=true
  micromatch x 1,863,453 ops/sec ±0.74% (93 runs sampled)
  minimatch x 723,105 ops/sec ±0.75% (93 runs sampled)

# .makeRe globstar
  micromatch x 1,624,179 ops/sec ±2.22% (91 runs sampled)
  minimatch x 1,117,230 ops/sec ±2.78% (86 runs sampled)

# .makeRe globstars
  micromatch x 1,658,642 ops/sec ±0.86% (92 runs sampled)
  minimatch x 741,224 ops/sec ±1.24% (89 runs sampled)

# .makeRe with leading star
  micromatch x 1,525,014 ops/sec ±1.63% (90 runs sampled)
  minimatch x 561,074 ops/sec ±3.07% (89 runs sampled)

# .makeRe - braces
  micromatch x 172,478 ops/sec ±2.37% (78 runs sampled)
  minimatch x 96,087 ops/sec ±2.34% (88 runs sampled)

# .makeRe braces - range (expanded)
  micromatch x 26,973 ops/sec ±0.84% (89 runs sampled)
  minimatch x 3,023 ops/sec ±0.99% (90 runs sampled)

# .makeRe braces - range (compiled)
  micromatch x 152,892 ops/sec ±1.67% (83 runs sampled)
  minimatch x 992 ops/sec ±3.50% (89 runs sampled)

# .makeRe braces - nested ranges (expanded)
  micromatch x 15,816 ops/sec ±13.05% (80 runs sampled)
  minimatch x 2,953 ops/sec ±1.64% (91 runs sampled)

# .makeRe braces - nested ranges (compiled)
  micromatch x 110,881 ops/sec ±1.85% (82 runs sampled)
  minimatch x 1,008 ops/sec ±1.51% (91 runs sampled)

# .makeRe braces - set (compiled)
  micromatch x 134,930 ops/sec ±3.54% (63 runs sampled)
  minimatch x 43,242 ops/sec ±0.60% (93 runs sampled)

# .makeRe braces - nested sets (compiled)
  micromatch x 94,455 ops/sec ±1.74% (69 runs sampled)
  minimatch x 27,720 ops/sec ±1.84% (93 runs sampled)

Yes, those benchmarks exist, but I expect this to come first with benchmark files inside this project so we can use the same methodology and infrastructure we use for other benchmarks in the project

@MoLow I’ll look into adding these as the last step of this PR… Presumably following writing-and-running-benchmarks.

Also, looks like this will have to be rebased, and take into account #52881, which just landed…

cc: @jonschlinkert

@RedYetiDev RedYetiDev added path Issues and PRs related to the path subsystem. tools Issues and PRs related to the tools directory. labels Jun 23, 2024
@@ -133,7 +132,7 @@ class Pattern {
   isLast(isDirectory) {
     return this.indexes.has(this.last) ||
       (this.at(-1) === '' && isDirectory &&
-        this.indexes.has(this.last - 1) && this.at(-2) === lazyMinimatch().GLOBSTAR);
+        this.indexes.has(this.last - 1) && this.at(-2) === lazyMicromatch().GLOBSTAR);
Member


I don't think micromatch has a .GLOBSTAR export.

Contributor Author

@danielbayley danielbayley Jun 23, 2024


I don't think micromatch has a .GLOBSTAR export.

Yeah, do you think I should add a definition somewhere to fs constants, or in an upstream PR to micromatch? Something like…

const GLOBSTAR = Symbol('globstar **');

What do you think? @RedYetiDev @jonschlinkert

Member


I'm not sure, I'm not a core collaborator, but https://github.com/nodejs/node/blob/main/doc/contributing/using-symbols.md may help.


Yeah, do you think I should add a definition somewhere to fs constants, or in an upstream PR to micromatch? Something like…

Let me know if you want me to help with anything

Member

@RedYetiDev RedYetiDev left a comment


I'm not a core collaborator, and these are my suggestions, but they aren't blocking.

(From here I'll leave the reviewing to the collaborators, I just wanted to get involved because this involved a change I recently landed)

lib/internal/fs/glob.js (outdated)
let micromatch;
function lazyMicromatch() {
  micromatch ??= require('internal/deps/micromatch/index');
  micromatch.GLOBSTAR = Symbol('globstar **');
Member


Wouldn't this be the redundant creation of a Symbol?
You are redefining it every time we call lazyMicromatch().

A better approach would be:

if (micromatch == null) {
  micromatch = require('internal/deps/micromatch/index');
  micromatch.GLOBSTAR = Symbol('globstar **');
}

Or maybe, define GLOBSTAR somewhere else for use here?
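
The difference between the two variants is observable, not just stylistic: a Symbol saved from an earlier call no longer compares equal to the current one. The sketch below shows this with a hypothetical `loadDep()` standing in for `require('internal/deps/micromatch/index')`, which only resolves inside Node.js core. It also assumes the lazy loader returns the module object, as the `lazyMicromatch().GLOBSTAR` call site suggests.

```javascript
'use strict';

// Hypothetical stand-in for require('internal/deps/micromatch/index').
function loadDep() {
  return { name: 'stand-in matcher' };
}

// Variant from the diff: the module is cached by ??=, but GLOBSTAR is
// reassigned to a brand-new Symbol on every call.
let unguarded;
function lazyUnguarded() {
  unguarded ??= loadDep();
  unguarded.GLOBSTAR = Symbol('globstar **');
  return unguarded;
}

// Guarded variant: loading and Symbol creation both happen only once.
let guarded;
function lazyGuarded() {
  if (guarded == null) {
    guarded = loadDep();
    guarded.GLOBSTAR = Symbol('globstar **');
  }
  return guarded;
}

const a = lazyUnguarded().GLOBSTAR;
const b = lazyUnguarded().GLOBSTAR;
console.log(a === b); // false: the second call replaced the Symbol

const c = lazyGuarded().GLOBSTAR;
const d = lazyGuarded().GLOBSTAR;
console.log(c === d); // true: the same Symbol is reused
```

Any pattern segment stored as `GLOBSTAR` before a later call would silently stop matching in the unguarded variant, so defining the Symbol once (in the guard, or as a module-level constant) is the safer choice.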

Contributor Author


Wouldn't this be the redundant creation of a Symbol? You are redefining it every time we call lazyMicromatch().

A better approach would be:

if (micromatch == null) {
  micromatch = require('internal/deps/micromatch/index');
  micromatch.GLOBSTAR = Symbol('globstar **');
}

Maybe, or just ??=.

Or maybe, define GLOBSTAR somewhere else for use here?

I thought about that, as a constant elsewhere. But not yet sure if it should instead be a PR to micro/picomatch, or even if it’s strictly necessary to expose as a Symbol at all…

So the above implementation is most likely temporary, anyway.

lib/os.js (outdated, resolved)
@jonschlinkert

(As a side note https://github.com/jonschlinkert/maintainers-guide-to-staying-positive @jonschlinkert this is neat)

Thank you!

  • So I'd like @jonschlinkert 's blessing before this is explored (like we got from minimatch's author before adding it as a dep)

This definitely has my blessing. I'm willing to put in extra time or add collaborators to make sure it's maintained.

If there are specific things you want my help answering/resolving let me know.

@RedYetiDev
Member

RedYetiDev commented Jun 24, 2024

If there are specific things you want my help answering/resolving let me know.

I have a few questions of my own, if you don't mind:

  1. There are cases, like https://github.com/nodejs/node/pull/53547/files#diff-926520e174d4e59a716c7367049fed207249e3732631177be409b34634a896afL205-L213 and https://github.com/nodejs/node/pull/53547/files#diff-092f2be30e312ebf7517f3d5f08b4c8d38a4bea3449ab15eb87f408a8f301dffL168-L176, which change the arguments (because micromatch and minimatch have different options), what options would make these feature-equivalent, or close enough?
  2. Currently, @danielbayley replaced just the loading of minimatch with micromatch, but I assume that the library has other features that might require additional changes (such as GLOBSTAR), could you review the code and look for cases where additional changes might be needed/better?

@isaacs
Contributor

isaacs commented Jun 24, 2024

There's some context that seems to be missing in this conversation.

Performance

It doesn't matter. Really, in this case, it does not rise to the level that anyone should care about the performance one way or another. Making something a million or even a billion times faster won't make a difference if that thing is less than 0.01% of the overall time spent. (And picomatch is not 1M times faster than minimatch, anyway.)
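
The arithmetic behind that claim is Amdahl's law. A minimal sketch, using the illustrative figures from the paragraph above (0.01% of total time, made a million times faster) rather than any measured values:

```javascript
'use strict';

// Amdahl's law: overall speedup when a fraction `p` of total run time
// is made `s` times faster and the remainder is left unchanged.
function overallSpeedup(p, s) {
  return 1 / ((1 - p) + p / s);
}

// Illustrative figures only: 0.01% of the time, a million times faster.
const speedup = overallSpeedup(0.0001, 1e6);
console.log(speedup.toFixed(6));

// Upper bound as `s` grows without limit: 1 / (1 - p).
const bound = 1 / (1 - 0.0001);
console.log(bound.toFixed(6));
```

Both values come out at roughly 1.0001, so the whole operation gets about 0.01% faster no matter how large the per-call improvement is.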

As far as I know, glob pattern parsing is not directly exposed. It's only exposed in parsing the arguments to the test runner, and fs.glob.

The benchmarks referenced here are for the makeRe function, which is known to be slow, fundamentally cannot be correct, and thus is not used in actual matching. No one is calling Minimatch.makeRe a million times a second. Seriously. Think about this for one second. Then realize you could've spent that second unnecessarily turning a million globs into regular expressions using Minimatch, or 2 million using micromatch, and if you care about that, you'd be the only person on earth who does. (In fact, Minimatch doesn't even create a regular expression at all in many cases.) Also, unless they've been updated recently, these benchmarks are not actually testing the minimatch that was pulled into node core, but rather a release 2 semver majors prior.

The fs.glob implementation leaves a ton of performance on the table, as I mentioned in comments on the PR that added it. But even still, it parses the pattern exactly one time, then spends 99% of its overall execution time in fs system calls. It's not an exaggeration to say that you could make the match parser 10x slower and never notice the difference.

If the goal is to make fs.glob faster, then the algorithm needs to be changed to be more like that of glob or fast-glob.

If there's no benchmark showing a performance improvement for node (ie, not just a microbenchmark of a single arbitrary function pulled out of the API) then there is no performance improvement, and nothing to discuss.
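
As a rough illustration of what an end-to-end measurement could look like, here is a sketch that times whole `fs.globSync` calls over a throwaway directory tree. It does not use Node's own benchmark harness under benchmark/, and the tree size, pattern, and helper names (`makeTree`, `timeGlob`) are arbitrary assumptions. `fs.globSync` is only present in recent Node.js releases, so the script checks for it first.

```javascript
'use strict';
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { performance } = require('node:perf_hooks');

// Build a throwaway tree: `dirs` directories with `filesPerDir` files each.
function makeTree(dirs, filesPerDir) {
  const root = fs.mkdtempSync(path.join(os.tmpdir(), 'glob-bench-'));
  for (let d = 0; d < dirs; d++) {
    const dir = path.join(root, `dir${d}`);
    fs.mkdirSync(dir);
    for (let f = 0; f < filesPerDir; f++) {
      fs.writeFileSync(path.join(dir, `file${f}.txt`), '');
    }
  }
  return root;
}

// Time `runs` full traversals (pattern parsing plus file system walk).
// Returns the mean milliseconds per run and the match count of the last run.
function timeGlob(root, pattern, runs) {
  let count = 0;
  const start = performance.now();
  for (let i = 0; i < runs; i++) {
    count = fs.globSync(pattern, { cwd: root }).length;
  }
  return { meanMs: (performance.now() - start) / runs, count };
}

const root = makeTree(20, 50);
let result = null;
try {
  if (typeof fs.globSync === 'function') {
    result = timeGlob(root, '**/*.txt', 10);
    console.log(`${result.count} matches, ${result.meanMs.toFixed(2)} ms per run`);
  } else {
    console.log('fs.globSync is not available in this Node.js version');
  }
} finally {
  // Always clean up the temporary tree.
  fs.rmSync(root, { recursive: true, force: true });
}
```

Running the same script against two builds (one per matcher library) would compare what users actually experience, which is the kind of number the paragraph above asks for.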

Correctness/Completeness

There does not exist a 100% fully complete and correct "glob" implementation in JavaScript, for the simple reason that there is no 100% agreed-upon spec or definition of what exactly that means.

Thus, it is important (as I argued in the fs.glob PR) for any given platform to precisely identify how its globs work, rather than leaving it up to users' intuitions.

Minimatch (and really, glob) defines Bash 5.latest as the targeted reference implementation. There are a small number of cases where it cannot match Bash's behavior exactly, either due to performance considerations or just fundamental limitations of JavaScript regular expressions. Those cases are explicitly defined and tested for.

I considered using picomatch or micromatch in the glob 7 rewrite. I abandoned the idea when it became clear at the time that there was no way to correctly support every combination of braces, extglobs, and globstar patterns, with identical semantics to bash, without rewriting it entirely anyway. So instead, I rewrote minimatch to support these things properly and expose the interfaces glob needed.

If the goal of the project is to allow / characters in extglobs, to not support .. in patterns the same way bash does, or that !(9).txt should not match 999.txt, then that's a choice that can certainly be made and probably be justified. But it should be documented as a choice in the documentation, with the rationale for how and why node's fs.glob differs from the globs that you'd find on a command line, in a gitignore file, etc.

Do not assume that "the tests pass" means all edge cases are fully covered. The tests that node-glob and minimatch use to ensure coherence with Bash semantics are much more extensive than the glob tests in bash itself, for example.

See also:

Picomatch also has a tokenizer that can be used.

So does Minimatch.

minimatch is subject to catastrophic backtracking,
support more patterns, and are better tested.

Citation needed.

I don't care about OSS rivalry; too many people use my code as it is. If micromatch or picomatch are a better fit for node's goals, then they can use those with my blessing, of course. Please present information fairly and with full context so that the project maintainers can make rational decisions.

@jonschlinkert

jonschlinkert commented Jun 25, 2024

I'm sorry to hear that you feel that way. My goal is to focus on the technical aspects, not to debate.

Instead, I'll just focus on the point I made. Minimatch is inefficient, starting with the smallest brace pattern, and it gets exponentially more inefficient with each character added to the string.

Here is a simple "benchmark" that people can try for themselves with the latest minimatch, v9.0.4, and the current micromatch.

const minimatch = require('minimatch');
const micromatch = require('..');

let start = Date.now();
console.log(micromatch.makeRe('foo/{1..1000000}/bar')); //=> 35 byte regex -> /^(?:foo[\\/][1-1000000][\\/]bar)$/
console.log(`micromatch.makeRe: ${Date.now() - start}`);
// micromatch.makeRe: 5ms

start = Date.now();
console.log(minimatch.makeRe('foo/{1..1000000}/bar')); //=> 16.9 MB regex
console.log(`minimatch.makeRe: ${Date.now() - start}`);
// minimatch.makeRe: 5099ms (~85s)

The output from minimatch was too large to paste into the issue.

If you add one more zero, minimatch freezes:

const minimatch = require('minimatch');
const micromatch = require('..');

let start = Date.now();
console.log(micromatch.makeRe('foo/{1..10000000}/bar')); //=> 35 byte regex -> /^(?:foo[\\/][1-10000000][\\/]bar)$/
console.log(`micromatch.makeRe: ${Date.now() - start}`);
// micromatch.makeRe: 8ms

start = Date.now();
console.log(minimatch.makeRe('foo/{1..10000000}/bar'));
console.log(`minimatch.makeRe: ${Date.now() - start}`);
// FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory

It doesn't matter. Really, in this case, it does not rise to the level that anyone should care about the performance one way or another.

I agree, at least partially, with your perspective on this in a general sense. I've been frustrated with vulnerability-hunters like Snyk because they create contrived and very unlikely scenarios to demonstrate what they claim to be a vulnerability. Honestly, I had a vulnerability report from them on Enquirer, a prompting library for the terminal. What exactly is the attack vector? And why would anyone loop over a prompt 10,000 times?

I think those cases are silly. But this one isn't, because I'm demonstrating that when minimatch receives valid input without any loops, trickery, or external contrived scenarios, it causes a JavaScript heap out of memory error due to ineffective mark-compacts near the heap limit resulting in allocation failures. Anyone who actually sees the generated regular expressions firsthand will understand immediately what the problem is.

Sure, we could lop some zeros off of the pattern, but that misses the point: Minimatch and micromatch take completely different approaches to generating patterns, and this is just one convenient example to demonstrate it.
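
To make the "different approaches" point concrete, here is a sketch contrasting two ways of handling a numeric brace range. Neither function is taken from minimatch or micromatch; `expandRange` and `matchRange` are hypothetical names, and the second one checks the range numerically instead of building a compact regex, purely to keep the example short.

```javascript
'use strict';

// Naive expansion: one regex alternative per integer in [lo, hi].
// The source string grows with the size of the range.
function expandRange(lo, hi) {
  const alternatives = [];
  for (let n = lo; n <= hi; n++) alternatives.push(String(n));
  return `^foo/(?:${alternatives.join('|')})/bar$`;
}

// Compact check: capture the digits once and compare them numerically.
// The cost is independent of the size of the range.
function matchRange(lo, hi) {
  const shape = /^foo\/(0|[1-9]\d*)\/bar$/;
  return (input) => {
    const m = shape.exec(input);
    if (m === null) return false;
    const n = Number(m[1]);
    return n >= lo && n <= hi;
  };
}

const small = expandRange(1, 1000);
const large = expandRange(1, 100000);
console.log(small.length, large.length);

// The expanded source is still a valid pattern, just a very long one.
console.log(new RegExp(small).test('foo/500/bar'));

const inRange = matchRange(1, 100000);
console.log(inRange('foo/99999/bar'), inRange('foo/100001/bar'));
```

Each extra zero on the upper bound multiplies the expanded string by a bit more than ten, which is the growth pattern that eventually exhausts the heap, while the compact check stays constant.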

And picomatch is not 1M times faster than minimatch, anyway.

No, it's not that much faster. There might even be some benchmarks where minimatch is faster if you want to do a PR to add them.

Since you mentioned something about being outdated, here is minimatch@9.0.4 (latest AFAIK) against the current picomatch@4.0.2:

# .makeRe star (*)
  picomatch x 4,873,850 ops/sec ±0.85% (95 runs sampled)
  minimatch x 647,557 ops/sec ±0.72% (97 runs sampled)

# .makeRe star; dot=true (*)
  picomatch x 3,875,239 ops/sec ±0.68% (96 runs sampled)
  minimatch x 584,040 ops/sec ±0.63% (97 runs sampled)

# .makeRe globstar (**)
  picomatch x 3,625,167 ops/sec ±0.97% (94 runs sampled)
  minimatch x 1,876,719 ops/sec ±0.65% (97 runs sampled)

# .makeRe globstars (**/**/**)
  picomatch x 3,508,686 ops/sec ±0.71% (94 runs sampled)
  minimatch x 1,599,165 ops/sec ±0.64% (96 runs sampled)

# .makeRe with leading star (*.txt)
  picomatch x 3,368,935 ops/sec ±0.65% (97 runs sampled)
  minimatch x 445,354 ops/sec ±0.65% (95 runs sampled)

# .makeRe - basic braces ({a,b,c}*.txt)
  picomatch x 654,228 ops/sec ±1.05% (91 runs sampled)
  minimatch x 112,639 ops/sec ±0.63% (97 runs sampled)

# .makeRe - short ranges ({a..z}*.txt)
  picomatch x 564,278 ops/sec ±1.13% (91 runs sampled)
  minimatch x 15,214 ops/sec ±0.78% (95 runs sampled)

# .makeRe - medium ranges ({1..100000}*.txt)
  picomatch x 537,687 ops/sec ±1.05% (87 runs sampled)
  minimatch x 2.66 ops/sec ±7.63% (11 runs sampled)

# .makeRe - long ranges ({1..10000000}*.txt)
  picomatch x 549,302 ops/sec ±2.12% (94 runs sampled)

<--- Last few GCs --->

[9721:0x140008000]   128790 ms: Mark-Compact 4049.8 (4130.3) -> 4034.6 (4130.1) MB, 1025.75 / 0.00 ms  (average mu = 0.127, current mu = 0.015) allocation failure; scavenge might not succeed
[9721:0x140008000]   129880 ms: Mark-Compact 4050.5 (4130.3) -> 4036.6 (4132.3) MB, 1075.12 / 0.00 ms  (average mu = 0.072, current mu = 0.014) allocation failure; scavenge might not succeed

Again, minimatch freezes on the range expansion. Yes, that might represent a small number of cases, or be extremely unlikely to happen, but the benchmark isn't about the pattern. The point is to demonstrate how the underlying code was designed.


I think we should continue exploring this. It seems there is no good reason not to, even if there are "minor" differences in performance. Small optimizations compound geometrically, especially in large dependency trees.

@RedYetiDev
Member

RedYetiDev commented Jun 25, 2024

In my opinion, this should first be opened as an issue, where the potential change can be discussed.

@benjamingr @MoLow WDYT?


Additionally, in my opinion, this is more complicated than a "drop in replacement", and I think that, if this is to be implemented, @jonschlinkert should be the one to do it, as he knows the most about micromatch.

(I don't mean this in any way against @danielbayley, your work has been excellent, and truly appreciated, I just think that a maintainer's touch is helpful)

@jonschlinkert

Additionally, in my opinion, this is more complicated than a "drop in replacement", and I think that, if this is to be implemented, @jonschlinkert should be the one to do it, as he knows the most about micromatch.

I appreciate your perspective @RedYetiDev, I'm here, and offering my help so far as it's needed. I'm willing to do patches, or a major bump with whatever changes are necessary. But I probably wouldn't repeat the hard work that's already been done on this PR. Changes can be made, and commits can be squashed, and I'm happy to help, so long as the core team is open to exploring this.

@isaacs
Contributor

isaacs commented Jun 25, 2024

@jonschlinkert but this benchmark is still testing a known-limited convenience method that node doesn't use or expose. makeRe is irrelevant.

Show a meaningful difference in a benchmark of what node is actually doing, in the way it is doing it. Until and unless that is done, "performance" should not be a part of this conversation.

Optimizing a function that accounts for virtually none of the overhead, is just polishing doorknobs. Optimizing a function it doesn't even call is even sillier.

@jonschlinkert

jonschlinkert commented Jun 25, 2024

Show a meaningful difference in a benchmark of what node is actually doing, in the way it is doing it. Until and unless that is done, "performance" should not be a part of this conversation.

Honestly this just sounds like an emotionally charged conversation, and I don't think it's going to benefit anyone for me to get into a battle with you over this. The community deserves better than this attitude.

If someone wants to talk about how we can make node more performant, I'm interested in that conversation.

@MoLow
Member

MoLow commented Jun 25, 2024

as I said, the actual benchmarks we should look at are benchmarks for fs.glob and its variants. the internals are not as important, and I totally agree with @isaacs sentiment on that

@Trott
Member

Trott commented Jun 25, 2024

Hi, everyone. I don't know the history here, but I'm really uncomfortable with all the talk about "lies" and so on and would really appreciate if we could focus on the content here.

Getting back to the content: I basically agree that if there's no relevant benchmark for Node.js showing a performance boost for the way we use minimatch, no existing ReDoS or other vulnerability, and no bug that this change fixes, then this seems like a very large (and therefore dangerous) change for no clear benefit. It does seem very unlikely that this will affect Node.js performance in any measurable way. Microbenchmarks of a single API don't seem relevant. I won't oppose something like this if there is consensus among collaborators, but I'm not seeing that either.

I appreciate the effort extended to make this happen, and I hope it's not too discouraging, but I don't expect this to land.

The big thing that would change my mind would be a benchmark for fs.glob added to benchmark/fs that showed this to be a big improvement. I don't expect that to be the result of such a benchmark though, since the amount of time spent doing I/O and other operations will overwhelm the time spent doing matches, or at least that is certainly what I would expect.

Member

@mcollina mcollina left a comment


Making my -1 explicit.

Contributor

@bnb bnb left a comment


-1. I'd be down if this had relevant benchmarks of Node.js with the two modules, showing an improvement of even >=10%. So far I've not seen that, happy to reconsider if those do show up. Otherwise, I'm wary of introducing dependency changes that our end-users won't obviously benefit from.

@Trott
Member

Trott commented Jun 26, 2024

This has two hard blocks (mcollina, bnb) and at least two soft blocks (MoLow, me) with no collaborators expressing enthusiasm. Given that, I'm going to close it. However, if I'm just being hasty and you're putting together benchmarks right now and you feel like a persuasive case can be made, feel free to re-open (if the GitHub interface lets you) or leave a comment requesting that it be re-opened and I or some other collaborator or triager will re-open it. Thanks again for the PR. I know it's frustrating when you do a lot of work and a bunch of people who didn't help much are all "Nah, sorry." But in this case, I'm not sure what to do about that.

@Trott Trott closed this Jun 26, 2024
@RedYetiDev
Member

RedYetiDev commented Jun 26, 2024

Given the close, I've removed some of the additional labels associated with the file changes.

This is so finding this issue may be simpler.

(Keeping fs because it's directly affected)

Feel free to re-add labels

@RedYetiDev RedYetiDev removed path Issues and PRs related to the path subsystem. tools Issues and PRs related to the tools directory. labels Jun 26, 2024
Labels
dependencies Pull requests that update a dependency file. fs Issues and PRs related to the fs subsystem / file system. meta Issues and PRs related to the general management of the project. needs-ci PRs that need a full CI run.