Full-text list filters (aka. "simple search") #319

Closed
Tracked by #2692
molomby opened this issue Sep 17, 2018 · 4 comments

Comments

@molomby
Member

molomby commented Sep 17, 2018

Search is a common use case both from within the admin UI and for other consumers of the API.

This issue describes a simple form of full-text search, similar to that implemented in KS4 -- a single set of fields specified for a list, which presents in GraphQL as a boolean filter (match or no-match) alongside the filters contributed by the fields. This functionality does not provide any ranking information or matching snippets from the documents matched.

In KS5, this could again be added as a set of fields in the list config. The effect would be the creation (DB schema management allowing) of a full-text index for these fields and the addition of an argument for all multi-node instances of the list in the GraphQL schema.

Eg, for the top level query fields this looks like:

query {
  allUsers (search: "will") { id name }
}

There's a simple implementation of this in the current KS 5 (mongoose adapter) but it's hard coded to match against the name field (which may not exist) and it uses a case-insensitive regex filter (which cannot be indexed in Mongo).
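For illustration, a case-insensitive regex filter of that kind looks roughly like the sketch below. The helper names (`escapeRegex`, `buildRegexFilter`) are hypothetical and this is not the actual adapter code; `$regex` and `$options` are the standard Mongo query operators. The field is passed in rather than hard coded, and the user input is escaped so it is matched literally.

```typescript
// Hypothetical helpers for illustration only, not the KS5 mongoose adapter code.

// Escape regex metacharacters so the search string is matched literally.
function escapeRegex(input: string): string {
  return input.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

interface RegexCondition {
  $regex: string;
  $options: string;
}

// Build a Mongo-style filter that matches `search` anywhere in `field`, ignoring case.
function buildRegexFilter(field: string, search: string): { [key: string]: RegexCondition } {
  const result: { [key: string]: RegexCondition } = {};
  result[field] = { $regex: escapeRegex(search), $options: "i" };
  return result;
}

const nameFilter = buildRegexFilter("name", "will");
console.log(JSON.stringify(nameFilter));
// prints {"name":{"$regex":"will","$options":"i"}}
```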

Prior art: KS 4 does a decent job of list search. It uses the search fields configured for the list to maintain its own full-text indexes. The logic is complicated by the fact that Mongo only allows a single full-text index per collection, so if one already exists, the list search operations revert to the case-insensitive regex strategy.
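The fallback decision described above can be sketched as follows. This is a rough reading of the behaviour as described here, not KS4's actual code; the `IndexInfo` shape and the index name used in the example are assumptions made up for the sketch.

```typescript
// Hypothetical, simplified description of an index on a collection.
interface IndexInfo {
  name: string;
  isText: boolean;
}

type SearchStrategy = "text-index" | "regex-fallback";

// Mongo allows only one text index per collection. If a text index that
// Keystone does not manage is already present, fall back to regex matching.
function chooseSearchStrategy(existing: IndexInfo[], ownIndexName: string): SearchStrategy {
  const foreignTextIndex = existing.some(
    (idx) => idx.isText && idx.name !== ownIndexName
  );
  return foreignTextIndex ? "regex-fallback" : "text-index";
}

// "keystone_search" is an assumed name for the index Keystone would manage.
console.log(chooseSearchStrategy([{ name: "_id_", isText: false }], "keystone_search"));
// prints text-index
console.log(chooseSearchStrategy([{ name: "custom_text", isText: true }], "keystone_search"));
// prints regex-fallback
```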

@molomby
Member Author

molomby commented Sep 18, 2018

@JedWatson is gunning for an even simpler version of this initially that just always does a case-insensitive regex search. It doesn't scale but for smaller projects it's ok.

@jesstelford
Contributor

See also this as a possible initial implementation: keystonejs/keystone-5#352 (comment)

@MadeByMike MadeByMike modified the milestones: M2: MVP, MVP Jul 23, 2019
@MadeByMike MadeByMike removed this from the Beta milestone Sep 16, 2019
@molomby
Member Author

molomby commented Sep 23, 2019

Expanding on previous notes -- any solution to this problem that doesn't leverage full-text indexes isn't going to scale well at all.

In Mongo (for example) you end up applying $regex filters against multiple fields. For this kind of multi-field filtering, you usually want to treat word boundaries as "OR" (so "the moon landing was faked" is applied as "the" OR "moon" OR "landing" OR "was" OR "faked"). So, if you're performing this search against 100,000 posts and your "search" configuration checks 4 fields (eg. authorName, title, content and summary), you end up processing 100k * 5 terms * 4 fields = 2 million regex comparisons. Note, this operation cannot be indexed. Also note that, depending on how your interface works, you might be re-running this search with each keystroke. I've seen outages caused by far less.
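To make the arithmetic concrete, here's a minimal sketch of how such a filter fans out. The function names are hypothetical; the clauses it produces would be wrapped in a Mongo `$or`. The comparison count is an upper bound (a document that matches early can short-circuit), which is the case that matters for documents that don't match at all.

```typescript
// Hypothetical helpers for illustration; not an actual Keystone implementation.

// Escape regex metacharacters so each term is matched literally.
function escapeTerm(term: string): string {
  return term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

interface RegexClause {
  [field: string]: { $regex: string; $options: string };
}

// One clause per (term, field) pair; the result is meant to be used as { $or: clauses }.
function buildSearchClauses(search: string, fields: string[]): RegexClause[] {
  const terms = search.split(/\s+/).filter((t) => t.length > 0);
  const clauses: RegexClause[] = [];
  terms.forEach((term) => {
    fields.forEach((field) => {
      const clause: RegexClause = {};
      clause[field] = { $regex: escapeTerm(term), $options: "i" };
      clauses.push(clause);
    });
  });
  return clauses;
}

// Upper bound on regex evaluations for one search over an unindexed collection.
function maxRegexComparisons(docCount: number, termCount: number, fieldCount: number): number {
  return docCount * termCount * fieldCount;
}

const searchFields = ["authorName", "title", "content", "summary"];
const clauses = buildSearchClauses("the moon landing was faked", searchFields);
console.log(clauses.length); // prints 20 (5 terms * 4 fields)
console.log(maxRegexComparisons(100000, 5, searchFields.length)); // prints 2000000
```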

For these reasons I believe the suggested initial implementation isn't a great approach, even in the short term. It'll work but as the number of items and fields you're searching grows, it'll very quickly start causing problems.

That's the bad news. The good news is this is an old, well understood and largely solved problem! The answer is full-text indexes; both Mongo and most relational DBs we care about support them. In addition to viable performance, indexes like this usually also give you support for "exact phrase" searching, stemming and other features we probably want.

In terms of configurability, full-text indexes do have some limitations that will influence our design. Specifically, although you can index multiple fields, Mongo will only allow a single text index per collection. This is fine but it does push us towards a similar pattern in Keystone, ie:

  • Lists are configurable with a set of "search fields"
  • Setting this config on the list:
    • Creates a full-text index for those fields (or rather, asks the DB adapter to)
    • Enables the search argument for the list's multi-node GraphQL fields (eg. all${listKeyPlural} but also "to many" relationships that reference the list)
  • Possibly adds a search bar to the list view in the Admin UI?
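
As a rough sketch of the pattern above for the Mongo case: the `ListConfig` shape and its `searchFields` option are hypothetical (nothing like this exists in Keystone yet), as are the helper names. The index spec is the document you'd pass to Mongo's `createIndex`, and `$text` / `$search` are the standard Mongo operators for querying a text index.

```typescript
// Hypothetical list config; `searchFields` is the proposed option, not an existing one.
interface ListConfig {
  key: string;
  searchFields: string[];
}

// Build the single compound text index spec for a list,
// eg. for use as collection.createIndex(spec) in the Mongo adapter.
function textIndexSpec(list: ListConfig): { [field: string]: "text" } {
  const spec: { [field: string]: "text" } = {};
  list.searchFields.forEach((field) => {
    spec[field] = "text";
  });
  return spec;
}

// Build the filter that the GraphQL `search` argument would translate to.
// Wrapping part of the string in double quotes asks Mongo for an exact phrase.
function textSearchFilter(search: string): { $text: { $search: string } } {
  return { $text: { $search: search } };
}

const postList: ListConfig = {
  key: "Post",
  searchFields: ["authorName", "title", "content", "summary"],
};

console.log(JSON.stringify(textIndexSpec(postList)));
// prints {"authorName":"text","title":"text","content":"text","summary":"text"}
console.log(JSON.stringify(textSearchFilter("moon landing")));
// prints {"$text":{"$search":"moon landing"}}
```

Other adapters would need their own equivalent of both functions, which is where the DB adapter interface comes in.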

Fyi, this is very similar to how we addressed the same problem in KS4. It'll be interesting to see how all this interacts with the field type and DB adapter interfaces though...

@stale

stale bot commented Jan 21, 2020

It looks like there hasn't been any activity here in over 6 months. Sorry about that! We've flagged this issue for special attention. It will be manually reviewed by maintainers, not automatically closed. If you have any additional information please leave us a comment. It really helps! Thank you for your contribution. :)


4 participants