Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for negated ignore patterns #32

Open
webpro opened this issue Sep 9, 2024 · 16 comments · May be fixed by #70
Open

Support for negated ignore patterns #32

webpro opened this issue Sep 9, 2024 · 16 comments · May be fixed by #70
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@webpro
Copy link

webpro commented Sep 9, 2024

Would be great to support negated patterns in the ignore option.

Use case: respect files like .gitignore (or others like .prettierignore etc) when globbing.

E.g. globby has the gitignore option which does the .gitignore reading, parsing and evaluation.

I wouldn't expect tinyglobby to include this feature (to read and parse such ignore files), but would be great to support negated ignore patterns so tinyglobby could be extended to support the use case.

The same has been requested in both node-glob and fast-glob, but neither of them currently supports it. Refs:

It would make migration more interesting for globby (with gitignore: true) users. Also I understand tinyglobby wants to stay tiny so feel free to reject!

Here's an example simplified use case to clarify things a bit. Let's say we want to glob all **/index.js files in a workspace that has this .gitignore:

.yarn/*
!.yarn/patches
...

And files:

.
├── index.js
└── .yarn
    ├── patches
    │   └── index.js
    └── index.js

After reading the .gitignore file(s) and converting the patterns the options look like this:

glob(["**/*.js"], {
  dot: true,
  ignore: ["**/.yarn/*", "!**/.yarn/patches/**"],
});

Expected result:

["index.js", ".yarn/patches/index.js"]

Just saying, for users to stuff it all into patterns argument directly is likely not a good idea. I think this doesn't and shouldn't work:

glob(["**/*.{js,ts}", "!.yarn", ".yarn/patches/**/*.{js,ts}"], { dot: true })
@SuperchupuDev SuperchupuDev added the enhancement New feature or request label Sep 9, 2024
@SuperchupuDev
Copy link
Owner

interesting, does this affect any existing package that wants to migrate? (so i know if its a priority to add or not). while it would be nice to add, i'm not sure if it would break existing compatibility with fast-glob/globby due to it technically being a tinyglobby exclusive feature (despite the fact fast-glob wants to add it at some point)

@webpro
Copy link
Author

webpro commented Sep 9, 2024

interesting, does this affect any existing package that wants to migrate? (so i know if its a priority to add or not).

No, I'm not aware of existing packages that want to migrate and need this.

while it would be nice to add, i'm not sure if it would break existing compatibility with fast-glob/globby due to it technically being a tinyglobby exclusive feature (despite the fact fast-glob wants to add it at some point)

Globby has support if gitignore: true is set, and then it's potentially a breaking change.

Alternatively, migrators/I could simply resort to globby's strategy: filter out glob results after the fact.

And just to explain my case: I'm using fast-glob in knip and I want to properly support .gitignore files. Knip used globby before, but things got unbearably slow in some large (mono)repos and I decided to accept a PR to replace globby, knowing that I was also prioritizing speed over the feature we're discussing (and rarely seems to cause issues). Yet, now that performance is bearable, and now that I've just added support for respecting .gitignore files in up-the-tree directories, I was thinking... wouldn't it be nice to have it all 😇 Technically I could argue I'm still migrating 😃

@etnoy
Copy link

etnoy commented Oct 7, 2024

I'm also curious about it. Over at Immich (https://github.com/immich-app/immich) I'm looking to switch to this library. Currently we use fast-glob and need to filter by file extensions when crawling photo libraries. The only way we can do this now is to add a similar brace expansion hack like @webpro mentions. With an "include" parameter, we could have a way to crawl multiple directories while also wanting to return a specific pattern (supported extensions) from those directories.

@webpro
Copy link
Author

webpro commented Oct 8, 2024

Not sure how that's related to ignore patterns? You mention "include" parameter, but I think what's being discussed here is negated patterns within the (existing) ignore: [] option. The (curly) braces I see aren't a hack, they're a common pattern to filter file extensions, which seems to be what you're after (and is supported by any glob lib).

@benmccann
Copy link
Contributor

Looks like cspell had to rollback their switch to tinyglobby due to this not being present: streetsidesoftware/cspell#6167 (comment)

@SuperchupuDev
Copy link
Owner

SuperchupuDev commented Oct 16, 2024

are you sure it was due to this not being present? because i'm guessing whatever issue was reported in cspell worked in fast-glob before they switched

@benmccann
Copy link
Contributor

I'm just going off the comment there:

The issues is with the ignore option. If it contains a glob starting with ! no results are returned.

Perhaps I misunderstood this issue or that comment or the comment was mistaken as to the cause of the issue

@SuperchupuDev
Copy link
Owner

SuperchupuDev commented Oct 16, 2024

i'm not sure until someone can provide a minimal repro of the exact pattern that fails

@webpro
Copy link
Author

webpro commented Oct 17, 2024

My $0.02: ime fast-glob doesn't support negated ignore patterns either, but maybe the issue is that with fast-glob !negated ignore patterns yied the same results (as if such patterns are removed, this might give users the illusion "it works" if such patterns don't actually have matches). Using tinyglobby with the same gives different results? Did not actually verify, I don't see tests in tinyglobby's repo to verify.

@SuperchupuDev
Copy link
Owner

okay i see the issue cspell had: tinyglobby can return no results if a negative ignore pattern is used, while fg just ignores the negative pattern

@SuperchupuDev
Copy link
Owner

question:

considering these two already act in the same way:

glob(['*/**', '!src/**']);

glob(['*/**'], {
  ignore: ['src/**']
});

would you expect these two usages to be equivalent to each other? (assuming the issue was fixed)

glob(['*/**', '!src/**', '!!src/README.md']);

glob(['*/**'], {
  ignore: ['src/**', '!src/README.md']
});

cc @webpro

@SuperchupuDev
Copy link
Owner

ah, i see why fast-glob hasn't implemented it yet, it's not anywhere as straightforward as i thought it'd be. not sure if i can solve it. i can make it consistent with fast-glob though

@SuperchupuDev SuperchupuDev added the help wanted Extra attention is needed label Oct 17, 2024
@benmccann
Copy link
Contributor

Making it consistent with fast-glob would really help with migrating users over from that library, so sounds like a good idea to me.

@webpro
Copy link
Author

webpro commented Oct 25, 2024

ah, i see why fast-glob hasn't implemented it yet, it's not anywhere as straightforward as i thought it'd be. not sure if i can solve it.

Any pointers as to why this isn't straightforward?

So naively I'd say we could do the whole walk thing and then filter out matching negated ignore patterns after the fact. This wouldn't affect performance if not using such patterns. Would love to take a stab at this actually, but haven't looked into it yet.

i can make it consistent with fast-glob though

Looks like this is done, right? Great move!

@benmccann
Copy link
Contributor

What does a negated ignore pattern even mean? Ignore everything except it? But in that case, why don't you just pass it as the main pattern without the negation?

Oh, yeah, it looks like this is done! b43ffdf

@webpro
Copy link
Author

webpro commented Oct 25, 2024

What does a negated ignore pattern even mean? Ignore everything except it? But in that case, why don't you just pass it as the main pattern without the negation?

See the example in the OP. Your suggestion wouldn't work, as the regular ignore pattern would filter it out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants