More robust fuzzy finding #2290

Closed
lukepighetti opened this issue Apr 26, 2022 · 15 comments · Fixed by #3969
Labels
A-helix-term Area: Helix term improvements C-enhancement Category: Improvements

Comments

@lukepighetti

Given a piece of text

packages/event_service/pubspec.yaml

You cannot match it with the query event service pubspec

It would be nice if fuzzy finding was robust enough to handle this type of use case

@the-mikedavis the-mikedavis added C-enhancement Category: Improvements A-helix-term Area: Helix term improvements labels Apr 26, 2022
@the-mikedavis
Member

Currently the ordering of the tokens matters for the fuzzy find. A way around this with the current matching behavior is to use Ctrl+Space, which filters the current selection: you retain your current set of matches and the filter resets, so anything you type after Ctrl+Space applies to that filtered subset.

But a more flexible fuzzy matching could be nice if it's still efficient to compute.
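
To illustrate the staged filtering described above, here is a minimal sketch. It assumes a plain in-order character (subsequence) match with no scoring, which is a simplification and not Helix's actual matcher; `is_subsequence`, `narrow`, and the sample paths other than the one from the issue are hypothetical.

```python
def is_subsequence(query: str, text: str) -> bool:
    """True if the characters of query appear in text in order (case-insensitive)."""
    chars = iter(text.lower())
    # `ch in chars` consumes the iterator, so later characters must come later in text.
    return all(ch in chars for ch in query.lower())


def narrow(candidates: list[str], stages: list[str]) -> list[str]:
    """Apply each stage's query only to the survivors of the previous stage."""
    for query in stages:
        candidates = [c for c in candidates if is_subsequence(query, c)]
    return candidates


paths = [
    "packages/event_service/pubspec.yaml",
    "packages/event_service/lib/event.dart",   # hypothetical neighbor
    "packages/auth_service/pubspec.yaml",      # hypothetical neighbor
]

# Each list element stands for what is typed between Ctrl+Space presses.
narrowed = narrow(paths, ["event", "service", "pubspec"])
```

Because every stage restarts the match from the beginning of each path, the order of the stages does not matter in this sketch.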

@lukepighetti
Author

OK, event Ctrl+Space service Ctrl+Space pubspec does match it, but I lose the context and I don't see a way to start over or reverse these actions. I could start over by closing and reopening the fuzzy find window.

@BadBastion
Contributor

BadBastion commented Apr 27, 2022

I personally love fuzzy matching, but am not a fan of some of the subtleties.

Currently, I am working on a branch where / strictly matches via regex like normal file navigation. <space> can be used to break up parts of the fuzzy query. \ can be used to escape chars.

For example:

  • ./packages/event_service/pubspec.yaml matches exactly that file
  • /event_service .yaml matches all yaml files in a folder called event_service
  • event service pubspec fuzzy matches the way you expected
  • pack/ even/ pubspec will match ./pack/even/pubspec.yaml but not pack\ even\ pubspec.yaml
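
As a rough sketch of the space and backslash part of the syntax above (this is not the code from the branch, and it leaves out the strict / matching entirely), a query could be split on unescaped spaces like this. `split_query` is a hypothetical helper name.

```python
def split_query(query: str) -> list[str]:
    """Split on unescaped spaces; a backslash makes the next character literal."""
    tokens: list[str] = []
    current: list[str] = []
    chars = iter(query)
    for ch in chars:
        if ch == "\\":
            # Keep the escaped character literally; a trailing backslash stays as-is.
            current.append(next(chars, "\\"))
        elif ch == " ":
            # An unescaped space ends the current token (empty tokens are dropped).
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


plain = split_query("event service pubspec")
escaped = split_query("pack\\ even\\ pubspec")
```

Here `plain` holds three separate tokens, while `escaped` holds a single token containing literal spaces.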

My hope is this syntax not only helps deal with files, but also doubles as a more effective symbol picker and grep!

Anyway, that is my sales pitch. Thank you for attending my TED talk. Looking forward to some feedback 😄

@lukepighetti
Author

lukepighetti commented Apr 27, 2022

Just speaking generally, I think VSCode Command+P file search is a good comparison here. My goal with fuzzy finding is to find a file whose exact name and location is unknown to me. This working well in a CLI editor is of utmost importance because file tree contextualization and exploring tools are typically not as robust as in GUI IDEs.

Let's say I am assigned a ticket for a part of the codebase I've never touched before. I know the feature has to do with the event service dependencies. event service pubspec is the query, and the result should take me to the pubspec.yaml file associated with the event service, if I'm lucky.

[Screenshot: VSCode]

I think there's benefit to having this Just Work™ in helix

@BadBastion
Contributor

@lukepighetti Precisely, this happens to be exactly how VSCode fuzzy search works.
The only exception is that VSCode only recognizes ~/ for root folders. Intuitively, I think ./ should also be recognized.

@valpackett
Contributor

fzy's matching and scoring algorithm is excellent. Looks like someone already did RIIR :) (crates)

@lukepighetti
Author

lukepighetti commented Apr 28, 2022

fzy's matching and scoring algorithm is excellent. Looks like someone already did RIIR :) (crates)

Interestingly enough, fzy fails the event service pubspec test

[Video: trim.mov]

@lukepighetti
Author

lukepighetti commented Apr 28, 2022

If I can be totally honest with you all, I have always had terrible luck with fuzzy finders when it comes to usability in frontend search-as-you-type experiences. From a product sense, in most front-end applications, fuzzy finding always seems to lose out to an algorithm along the lines of:

  • lowercase needle and haystack lines
  • split needle and haystack lines into fragments on everything not alphanumeric
  • match a haystack line if any fragment contains any needle fragment
  • score based on number_words_matched_from_start * 2 + number_words_matched_in_middle

And you can adjust that score on relevance criteria like number_times_accessed_past_month, or distance_to_current_file_via_import_graph
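
One possible reading of the steps above, as a sketch: a needle fragment counts as "matched from start" when some haystack fragment begins with it, and as "matched in middle" when it only appears inside one. That interpretation, and the names `fragments`, `score`, and `search`, are assumptions; the relevance adjustments are left out.

```python
import re


def fragments(text: str) -> list[str]:
    """Lowercase, then split on every run of non-alphanumeric characters."""
    return [f for f in re.split(r"[\W_]+", text.lower()) if f]


def score(needle: str, haystack: str) -> int:
    """number_words_matched_from_start * 2 + number_words_matched_in_middle."""
    hay = fragments(haystack)
    from_start = in_middle = 0
    for frag in fragments(needle):
        if any(h.startswith(frag) for h in hay):
            from_start += 1
        elif any(frag in h for h in hay):
            in_middle += 1
    return from_start * 2 + in_middle


def search(needle: str, haystacks: list[str]) -> list[str]:
    """Keep lines where at least one fragment matched, best score first."""
    scored = [(score(needle, h), h) for h in haystacks]
    return [h for s, h in sorted(scored, key=lambda pair: -pair[0]) if s > 0]


paths = [
    "README.md",                               # hypothetical
    "packages/auth_service/pubspec.yaml",      # hypothetical
    "packages/event_service/pubspec.yaml",
]
results = search("event service pubspec", paths)
```

With this reading, the query from the issue ranks packages/event_service/pubspec.yaml first because all three needle fragments match from the start of a path fragment.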

@valpackett
Copy link
Contributor

@lukepighetti hm actually - why are you typing spaces? I think something like "evservpubspec" might work better with most matchers?

@lukepighetti
Author

That's not intuitive to me.

@lukepighetti
Author

lukepighetti commented Apr 28, 2022

I tried eventservicepubspec in helix's fuzzy file finder, and it works.

[Two screenshots]

What is the effect of spaces on these fuzzy finding algorithms? Is there any reason why we shouldn't strip whitespace from the input if it matches better?

@the-mikedavis
Member

Oh whoops I misread the initial issue - I thought it was about the ordering of the tokens rather than whitespace. Whitespace is currently considered to be part of the match, so if you have a file name with spaces in it the whitespace will match towards those.

Depending on the type of project you're working on, whitespace may be very uncommon. Maybe it would make sense to strip the whitespace before fuzzy matching? On the other hand, it's pretty straightforward to just not type the whitespace.
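
A small sketch of the trade-off, assuming a bare character-level subsequence match in which a typed space must match a literal space (Helix's real matcher also scores results, which is omitted here). `fuzzy_match` and `strip_whitespace` are hypothetical names, and the `docs/...` paths are made up for illustration.

```python
def fuzzy_match(query: str, text: str) -> bool:
    """Every query character, spaces included, must appear in text in order."""
    chars = iter(text.lower())
    return all(ch in chars for ch in query.lower())


def strip_whitespace(query: str) -> str:
    """Drop all whitespace from the query before matching."""
    return "".join(query.split())


path = "packages/event_service/pubspec.yaml"

literal = fuzzy_match("event service pubspec", path)                     # spaces matched literally
stripped = fuzzy_match(strip_whitespace("event service pubspec"), path)  # spaces removed first

# The cost of stripping: a space in the query can no longer tell these apart.
with_space = fuzzy_match(strip_whitespace("my notes"), "docs/my notes.txt")
without_space = fuzzy_match(strip_whitespace("my notes"), "docs/mynotes.txt")
```

In this sketch `literal` is false because the path contains no spaces, `stripped` is true, and both `with_space` and `without_space` are true once whitespace is discarded.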

@lukepighetti
Author

lukepighetti commented Apr 28, 2022

I think there are two ways of looking at this.

  1. make it work the way vim/kakoune/emacs folks expect it to work
  2. make it work the way VSCode/JetBrains folks expect it to work

The reason I'm here personally is because helix has been the easiest CLI editor for me to onboard to from VSCode with all the fantastic built-in tooling. My personal desire is to see it work in a way that is intuitive to all, not just intuitive to CLI folks. But that's just my personal context.

As a side note, never once have I seen whitespace match in a search-as-you-type feature in a mobile or web app. It's usually used to split tokens.

@the-mikedavis
Member

The fuzzy-finding matcher is not token-based, so a character to split tokens is unnecessary. To fix this case we'd have to discard the whitespace, which would reduce functionality: you could no longer use whitespace to match whitespace.

@BadBastion
Contributor

BadBastion commented Apr 29, 2022

Using a space to find a file will likely occur a great deal less often than users will be confused by the current behavior of <space>. As far as I am aware, fzf has become the most common fuzzy finder in the vim ecosystem, and its behavior is more in line with popular editors such as VSCode and IntelliJ. Beyond that, every terminal user will already be familiar with using \ for literal spaces.

Using ~, / and <space> semantically instead of literally seems to have become the general preference among users of other editors. As it stands, I don't think the behavior of the fuzzy finder follows the principle of least surprise.

4 participants