Can't search query correctly with non-ASCII chars #2622
Comments
Possibly related to #2618.
Yeah, the recently updated full text search for queries is based on Postgres' built-in text search, which uses the "simple" configuration (parsers, templates, dictionaries) that only lowercases the content body and removes stop words while searching. Unfortunately, by default it only comes with support for a few Indo-European languages and misses others such as Korean, Japanese and Chinese (and more). To cover those, we'd need to add additional language support, for example PGroonga, which supports all languages but requires a 3rd party extension. The tutorial gives an idea of how this would look, including for example the ability to just keep using [...]. Alternatively, we could move the FTS away from Postgres altogether and switch to one of the many alternative search engines such as Elasticsearch, but that comes with a non-trivial amount of architectural changes.
I wouldn't want to have ES as a mandatory dependency in Redash as it will make deployments harder. But maybe we can make this functionality pluggable:
By default the two will use Postgres, but there could be additional implementations using ES, Algolia, or others.
It would complicate the list views a bit, since they are currently written not to distinguish between searching and just fetching the list of all items. I guess the API handlers can provide the interface to cater to that: ask the search backend for a list of model IDs in the order of the search ranking, and then fetch the corresponding data model items from the database. While I don't think it will be a huge deal, there is some overhead involved that we should probably test. E.g. support for pagination in the search backend would seem like a good idea.
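To make the idea above concrete, here is a minimal sketch of what such a pluggable interface could look like. Everything in it is hypothetical: `SearchBackend`, `SubstringBackend`, `fetch_in_rank_order` and the `Query` stand-in are illustrative names, not existing Redash code, and the toy backend ranks by match position only so that the example is self-contained.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Query:
    # Hypothetical stand-in for a Redash query model.
    id: int
    name: str


class SearchBackend(ABC):
    """Hypothetical interface: return matching model IDs, best match first."""

    @abstractmethod
    def search(self, term: str, offset: int = 0, limit: int = 20) -> List[int]:
        ...


class SubstringBackend(SearchBackend):
    """Toy in-memory backend; earlier match position means higher rank."""

    def __init__(self, queries: Iterable[Query]):
        self._queries = list(queries)

    def search(self, term: str, offset: int = 0, limit: int = 20) -> List[int]:
        needle = term.lower()
        hits = []
        for q in self._queries:
            pos = q.name.lower().find(needle)
            if pos >= 0:
                hits.append((pos, q.id))
        hits.sort()  # order by match position, then by ID
        ranked = [qid for _, qid in hits]
        # Pagination happens in the backend, on the ranked ID list.
        return ranked[offset:offset + limit]


def fetch_in_rank_order(
    ids: List[int], load_by_ids: Callable[[List[int]], Iterable[Query]]
) -> List[Query]:
    """Load the models for `ids` and restore the backend's ranking order."""
    by_id = {item.id: item for item in load_by_ids(ids)}
    # IDs that no longer exist in the database (stale index) are skipped.
    return [by_id[i] for i in ids if i in by_id]
```

An API handler would call `backend.search(term, offset, limit)` and pass the resulting IDs to `fetch_in_rank_order`, whose `load_by_ids` callback is free to return rows in any order. Skipping stale IDs only hides the data sync problem rather than solving it, and permission checks would still have to be applied to the loaded models.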
BTW, would you consider making this something to be distributed in the Redash core, or as extensions?
The list view is the least of the complications this will create :) I'm more worried about permissions and similar concerns, data sync (between search engine and database), and so on. And yes, this can be an extension.
Hi @jezdez, could we bring back the naive and slow [...]? For me, being able to search multi-byte text is far more critical than having the faster and more modern tsvector text search.
@deecay adding an option to enable simpler search sounds good to me. Considering the global usage of Redash, I expect this to be popular enough to put in
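As a rough illustration of what a "simpler search" option might do, the sketch below assumes it means plain substring matching with SQL `LIKE`; the thread does not confirm the exact mechanism. It uses an in-memory SQLite database as a stand-in for Postgres (where `ILIKE` would be the case-insensitive equivalent), and the `queries` table and the `naive_search` and `escape_like` helpers are hypothetical names.

```python
import sqlite3


def escape_like(term: str, escape: str = "\\") -> str:
    """Escape LIKE wildcards so the user's input is matched literally."""
    return (
        term.replace(escape, escape + escape)
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )


def naive_search(conn: sqlite3.Connection, term: str) -> list:
    """Return names of queries whose name contains `term` as a substring."""
    pattern = "%" + escape_like(term) + "%"
    rows = conn.execute(
        "SELECT name FROM queries WHERE name LIKE ? ESCAPE '\\' ORDER BY id",
        (pattern,),
    )
    return [name for (name,) in rows]


# Hypothetical sample data; SQLite stands in for the real database here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE queries (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO queries (name) VALUES (?)",
    [("ユーザ一覧",), ("100%_done",), ("sales report",)],
)

print(naive_search(conn, "ユーザ"))
print(naive_search(conn, "ユー"))
```

Substring matching does not depend on a language-specific parser, so both the full term and its prefix find the row. The trade-off is that a leading-wildcard `LIKE` generally cannot use an ordinary B-tree index and gives no relevance ranking, which is presumably why it was described as naive and slow.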
Issue Summary
Can't search query correctly with non-ASCII chars.
Steps to Reproduce
e.g.
There is a query which has the non-ASCII chars `ユーザ`.
Search with `ユーザ`, then no queries are in the result.
When I search with `ユー`, it hits correctly.
I guess that `Query.search` (in #2041) has changed this behavior. But I have no idea how we should fix it while keeping the full text search feature.
Technical details: