Can't search query correctly with non-ASCII chars #2622

Closed
kyoshidajp opened this issue Jun 23, 2018 · 8 comments · Fixed by #3908

Comments

@kyoshidajp
Member

Issue Summary

Can't search query correctly with non-ASCII chars.

Steps to Reproduce

  1. Create a query whose name or description contains non-ASCII characters
  2. Search using those non-ASCII characters

e.g.

There is a query containing the non-ASCII characters ユーザ.

all_queries

When I search with ユーザ, there are no queries in the result.

search_query1

When I search with ユー, the query is found correctly.

search_query2

I guess that Query.search (in #2041) has changed this behavior, but I have no idea how we should fix it while keeping the full text search feature.

Technical details:

  • Redash Version: master
  • Browser/OS: Version 67.0.3396.87 (Official Build) (64-bit)
  • How did you install Redash: Docker
@RichardLitt

Possibly related to #2618.

@jezdez
Member

jezdez commented Sep 17, 2018

Yeah, the recently updated query full text search is based on Postgres' built-in textsearch extension, using the "simple" configuration (parsers, templates, dictionaries), which only applies lower case and removes stop words from the content body while searching.

Unfortunately by default it only comes with support for a few Indo-European languages and misses others such as Korean, Japanese and Chinese (and more).

To fix this, we'd need to add support for those languages, for example via PGroonga, which supports all languages but is a 3rd party extension. The tutorial gives an idea of how this would look, including for example the ability to just keep using ILIKE queries.

Alternatively we could move away the FTS from using Postgres altogether and switch to one of the many alternative search engines such as Elasticsearch, but that comes with a non-trivial amount of architectural changes.

@arikfr
Member

arikfr commented Oct 28, 2018

> Alternatively we could move away the FTS from using Postgres altogether and switch to one of the many alternative search engines such as Elasticsearch, but that comes with a non-trivial amount of architectural changes.

I wouldn't want to have ES as a mandatory dependency in Redash as it will make deployments harder. But maybe we can make this functionality pluggable:

  1. Have a hook for "index new content" (dashboard / query / other in the future) and "index updated content".
  2. Have an interface for performing a search.

By default both will use Postgres, but there could be additional implementations using ES, Algolia, or others.
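
A minimal sketch of what such a pluggable interface might look like, assuming two indexing hooks and one search method as listed above. The names `SearchBackend`, `index_new`, `index_updated` and `InMemoryBackend` are hypothetical and not part of Redash; the in-memory class only stands in for a real default backend.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SearchBackend(ABC):
    """Hypothetical interface; names are illustrative, not Redash's API."""

    @abstractmethod
    def index_new(self, kind: str, object_id: int, text: str) -> None:
        """Hook called when a query/dashboard/... is created."""

    @abstractmethod
    def index_updated(self, kind: str, object_id: int, text: str) -> None:
        """Hook called when existing content changes."""

    @abstractmethod
    def search(self, kind: str, term: str) -> List[int]:
        """Return the IDs of matching objects of the given kind."""


class InMemoryBackend(SearchBackend):
    """Stand-in default backend: case-insensitive substring match."""

    def __init__(self) -> None:
        # kind -> {object_id -> indexed text}
        self._docs: Dict[str, Dict[int, str]] = {}

    def index_new(self, kind: str, object_id: int, text: str) -> None:
        self._docs.setdefault(kind, {})[object_id] = text

    def index_updated(self, kind: str, object_id: int, text: str) -> None:
        # Re-indexing simply overwrites the stored text.
        self.index_new(kind, object_id, text)

    def search(self, kind: str, term: str) -> List[int]:
        needle = term.casefold()
        docs = self._docs.get(kind, {})
        return sorted(i for i, text in docs.items() if needle in text.casefold())
```

An ES or Algolia implementation would subclass the same interface, so the API handlers would not need to know which backend is configured.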

@jezdez
Member

jezdez commented Oct 29, 2018

It would complicate the list views a bit, since they are currently written to not distinguish between searching and just fetching the list of all items. I guess the API handlers can provide the interface to cater to that: ask the search backend for a list of item model IDs in the order of the search ranking, and then fetch the corresponding data model items from the database.

While I don't think it will be a huge deal, there is some overhead involved that we should probably be testing. E.g. support for pagination in the search backend would seem like a good idea.
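
A rough sketch of that flow, assuming the search backend returns IDs already sorted by rank and the database lookup returns rows keyed by ID. `paginate_ranked` and `fetch_by_ids` are hypothetical names used for illustration only.

```python
from typing import Callable, Dict, List, Sequence


def paginate_ranked(
    ranked_ids: Sequence[int],
    fetch_by_ids: Callable[[List[int]], Dict[int, dict]],
    page: int,
    page_size: int,
) -> List[dict]:
    """Return one page of rows, preserving the search ranking order."""
    start = (page - 1) * page_size
    page_ids = list(ranked_ids[start:start + page_size])
    # One database round trip for the whole page (hypothetical callable).
    rows = fetch_by_ids(page_ids)
    # IDs that are stale in the search index (missing from the DB) are skipped.
    return [rows[i] for i in page_ids if i in rows]
```

Paginating here, after ranking, means the search backend still returns every matching ID; pushing pagination into the backend itself would avoid that overhead.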

@jezdez
Member

jezdez commented Oct 29, 2018

BTW, would you consider making this something to be distributed in the Redash core, or as extensions?

@arikfr
Member

arikfr commented Oct 29, 2018

The list view is the least of the complications this will create :) I'm more worried about permissions and similar concerns, data sync (between search engine and database), and others.

And yes, this can be an extension.

@deecay
Contributor

deecay commented May 12, 2019

Hi @jezdez,

Could we bring back the naive and slow LIKE search as an option? Maybe an ENV variable LEGACY_FULL_TEXT_SEARCH or something to switch between two ways of searching?

For me, being able to search in multi-byte is far more critical than having faster and more modern tsvector textsearch.
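
A minimal sketch of such a switch, assuming the variable name LEGACY_FULL_TEXT_SEARCH proposed above. The function names are hypothetical, and `like_search` only emulates a `name ILIKE '%term%'` style match in plain Python (the term is treated literally, without SQL wildcards); it is not Redash's implementation.

```python
import os
from typing import Callable, Iterable, List, Mapping


def use_legacy_search(environ: Mapping[str, str] = os.environ) -> bool:
    """True when the (proposed) LEGACY_FULL_TEXT_SEARCH variable is enabled."""
    value = environ.get("LEGACY_FULL_TEXT_SEARCH", "")
    return value.strip().lower() in {"1", "true", "yes"}


def like_search(names: Iterable[str], term: str) -> List[str]:
    """Case-insensitive substring match; works on any script, but scans every name."""
    needle = term.lower()
    return [name for name in names if needle in name.lower()]


def search_names(
    names: Iterable[str],
    term: str,
    full_text_search: Callable[[Iterable[str], str], List[str]],
    environ: Mapping[str, str] = os.environ,
) -> List[str]:
    """Dispatch to the legacy substring search or the full text search."""
    if use_legacy_search(environ):
        return like_search(names, term)
    return full_text_search(names, term)
```

With the variable set, `search_names(["ユーザ一覧"], "ユーザ", fts)` goes through the substring path and matches regardless of how the full text search tokenizes the name.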

@arikfr
Member

arikfr commented May 12, 2019

@deecay adding an option to enable simpler search sounds good to me. Considering the global usage of Redash, I expect this to be popular enough to put in Organization Settings UI.
