feat: fast exact name search by default #66
This changes the default behavior of the `name` filter from "case-insensitive, partial match" to "case-sensitive, exact (full) match" and adds a `match` parameter that lets callers opt in to the `fuzzy` strategy only when performance is not a concern. The rationale is that fuzzy search is a bad default: it does not scale well, and in most cases users and developers expect an exact match. We are moving away from slow defaults in go-ipfs, and since the `ipfs pin local` API will be inspired by this spec, we don't want to reintroduce bad choices into the IPFS ecosystem. Developers who need fuzzy search can always opt in to the slower but more flexible text matching strategy by passing the `match` parameter, but the default should be the fastest option available, which in this case is the exact match.
I agree, this seems much more sensible from an ecosystem perspective.
ipfs-pinning-service.yaml
default: exact
enum:
- exact # full match, case-sensitive (the implicit default)
- fuzzy # partial match, case-insensitive
🚲 + 🏚️ :
"fuzzy" I like that it's short, but not so happy that it's not quite precise in what it does. I'm fine with fuzzy, but putting out the call for better names before we decide we're just going with this one.
I am totally fine with renaming `fuzzy` to something better.
Food for thought: `partial`, `relaxed`, `inexact`, `loose`
+1 for `partial` unless you specifically need to indicate it's case-insensitive too (if the latter, would suggest something like `full-sensitive` and `partial-insensitive`, but that gets a little too bikesheddy, probably)
+1 for partial
Renamed to `partial` in 91299cc (#66) and added `match=exact|iexact|partial|ipartial` in f96383a to fuel discussion.
We need to figure out if that's enough, or whether we want something fancier, like the suggestion from #66 (comment)
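To make the four proposed values concrete, here is a minimal Python sketch of how they could be interpreted. The helper name `name_matches` is hypothetical, and the reading of `iexact` and `partial` (case-insensitive full match and case-sensitive substring match) is an assumption inferred from the names used in this thread, not something the spec text here confirms.

```python
def name_matches(pin_name: str, query: str, match: str = "exact") -> bool:
    """Hypothetical helper: test one pin name against a query string."""
    if match == "exact":
        # Full match, case-sensitive (the proposed default).
        return pin_name == query
    if match == "iexact":
        # Full match, case-insensitive (assumed meaning).
        return pin_name.casefold() == query.casefold()
    if match == "partial":
        # Substring match, case-sensitive (assumed meaning).
        return query in pin_name
    if match == "ipartial":
        # Substring match, case-insensitive (the old "fuzzy" behavior).
        return query.casefold() in pin_name.casefold()
    raise ValueError(f"unknown match strategy: {match!r}")
```

With these semantics, a query for `backup` would match a pin named `my-Backup-2020` only under `ipartial`, which is the kind of surprise the stricter default is meant to avoid.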
ipfs-pinning-service.yaml
default: exact
enum:
- exact # full match, case-sensitive (the implicit default)
- fuzzy # partial match, case-insensitive
+1 for partial
Do we need a |
@achingbrain I'd argue |
PR for implementation on the ruby pinning api server over here: ipfs-shipyard/rb-pinning-service-api#5
This PR looks great to me. The changes will be helpful in keeping things performant by default while also allowing for some flexibility.
tldr
This PR changes the default behavior of the `name` filter from case-insensitive, partial match to the much faster case-sensitive, exact (full) match, and adds a `match` parameter that lets callers opt in to an alternative strategy when performance is not a concern.
PREVIEW: https://ipfs.github.io/pinning-services-api-spec/#specUrl=https://raw.githubusercontent.com/ipfs/pinning-services-api-spec/fix/default-to-fast-name-match/ipfs-pinning-service.yaml
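From a client's point of view, the change amounts to one optional query parameter. The sketch below builds such a request URL in Python. The `/pins` path, the host name, and the helper `pins_query` are assumptions made for illustration; only the `name` and `match` parameter names come from this PR.

```python
from typing import Optional
from urllib.parse import urlencode


def pins_query(base_url: str, name: str, match: Optional[str] = None) -> str:
    """Hypothetical helper: build a pin-listing URL filtered by name."""
    params = {"name": name}
    if match is not None:
        # Omitting `match` leaves the service on its default (exact).
        params["match"] = match
    return f"{base_url}/pins?{urlencode(params)}"


# Default: fast, case-sensitive, exact match on the name.
print(pins_query("https://pinning.example.com", "my backup"))
# Explicit opt-in to the slower case-insensitive partial match.
print(pins_query("https://pinning.example.com", "my backup", "ipartial"))
```

Leaving `match` out of the request is what makes the fast path the default: a client has to ask for the slower strategy explicitly.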
Rationale for this change
Case-insensitive partial ("fuzzy") search (`ipartial`) makes a bad default, performance-wise.
Database indexes and B-trees do not fully solve the performance problem.
Exact match is always faster and less expensive.
Fuzzy search may produce bugs.
This PR introduces the `match` query parameter, so developers who need fuzzy search can still opt in to a slower but more flexible text matching strategy.
We are moving away from slow defaults in the IPFS ecosystem. Better to do this now than later.
One example of a slow default is `ipfs pin ls`, which is abysmally slow when one has >1k pins. Nearly all pins in real life are recursive, and `ipfs pin ls --type=recursive` executes instantly, but it is rarely used because it is not the default.
The `ipfs pin remote` and `ipfs pin local` APIs will be inspired by this spec and will have a `name` attribute; we don't want to reintroduce bad choices into the IPFS ecosystem. The default should be the fastest option available, and in this case it is the exact `name` match.
Implementation notes and migration plan
This change does not change the wire format and does not impact ongoing MVP integration in go-ipfs and ipfs-webui (Epic: Pinning service integration ipfs-gui#91, Add support for remote Pinning Services kubo#7559), but we want to include it in this spec to ensure MVP does the right thing.
Existing Pinning Services already implement the more difficult fuzzy (`ipartial`) strategy. Adding support for the `match` filter and implementing the much simpler `exact` strategy should be a small task, but will save everyone a headache in the long run.
@aschmahmann @jacobheun @petar @gammazero @obo20 @andrew @GregTheGreek @priom @jsign @sanderpick @andrewxhill @ipfs/wg-pinning-services
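As a rough illustration of why `exact` is the cheaper strategy to serve, here is a toy in-memory store in Python. Everything in it is hypothetical (the `PinStore` class, the placeholder CID strings, and the choice to support only `exact` and `ipartial`); a real pinning service would rely on its database's own indexes rather than a dictionary.

```python
from collections import defaultdict


class PinStore:
    """Toy store: exact lookups use an index, ipartial scans every pin."""

    def __init__(self):
        self._pins = []                    # (name, cid) in insertion order
        self._by_name = defaultdict(list)  # exact name -> list of cids

    def add(self, name, cid):
        self._pins.append((name, cid))
        self._by_name[name].append(cid)

    def find(self, name, match="exact"):
        if match == "exact":
            # One hash lookup, independent of how many pins exist.
            return list(self._by_name.get(name, []))
        if match == "ipartial":
            # Must visit every pin and normalize case on each comparison.
            needle = name.casefold()
            return [cid for n, cid in self._pins if needle in n.casefold()]
        raise ValueError(f"unsupported match strategy: {match!r}")


store = PinStore()
store.add("backup", "cid-1")      # placeholder identifiers, not real CIDs
store.add("Backup-old", "cid-2")
print(store.find("backup"))              # only the pin named exactly "backup"
print(store.find("backup", "ipartial"))  # also picks up "Backup-old"
```

The second call returns more than the caller may have intended, which is one way the looser default can lead to bugs such as acting on the wrong pin.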