-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add full text search for queries based on the Postgres tsvector type. #2041
Conversation
2d70074
to
6164efa
Compare
migrations/versions/5ec5c84ba61e_.py
Outdated
def downgrade(): | ||
conn = op.get_bind() | ||
|
||
ss.drop_trigger(conn, 'article', 'search_vector') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
article
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yikes!
redash/cli/database.py
Outdated
@@ -7,6 +7,8 @@ | |||
def create_tables(): | |||
"""Create the database tables.""" | |||
from redash.models import db | |||
# make sure the additional search mappers are configured | |||
db.configure_mappers() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the search mappers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment is misleading, this basically makes sure that all mappers that have been registered with the SQLAlchemy ORM have been properly configured. Since SQLAlchemy-Searchable installs database triggers that have to act on when ORM model instances are created (e.g. to track updates), this call is needed.
Here's what the SA-Searchable docs say:
It is very important here that you either access your searchable class or call configure_mappers before the creation of tables. SA-Searchable adds DDL listeners on the configuration phase of models.
Since this code happens in the phase in which tables are created, we can't really access the searchable class, but have to call configure_mappers
. I'll update the comment to reflect this.
redash/models.py
Outdated
where |= Query.id == term | ||
|
||
where &= Query.is_archived == False | ||
def search_queries(cls, term, group_ids, include_drafts=False, limit=20): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why append _queries
? It's a method on the Query
class isn't it obvious that it searches for queries? And in the handlers code you still call .search
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh, yeah, this is just a leftover from figuring out how to use the search
method that the SearchQueryMixin
adds to the query_class
of the Query
model. Will fix.
redash/models.py
Outdated
from sqlalchemy.orm.exc import NoResultFound # noqa: F401 | ||
from sqlalchemy.types import TypeDecorator | ||
from functools import reduce | ||
from sqlalchemy_searchable import SearchQueryMixin, make_searchable, search | ||
from sqlalchemy_utils.types import TSVectorType |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that I mind much about adding sqlalchemy_utils
as a dependency, but doesn't sqlalchemy_searchable
provide something for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the SQLAlchemy-Utils package is from the same author as SQLAlchemy-Searchable and contains a whole bunch of things.
The code behind TSVectorType is trivial relatively speaking, so I guess we could also just ship our own. The only risk I see with this is that we could trade one dependency less for having to keep an eye on changes in SQLAlchemy-Searchable that require changes in the TSVectorType class as well.
On the other hand, although quite out of scope of this PR, is the question if there are things in SQLAlchemy-Utils that may be useful for Redash in general, given its swiss-army-knife-ness. I'm fine with whatever way you want to go with this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 ok, let's use it. No need to find more places to use it, we can gradually introduce it where applicable.
migrations/versions/5ec5c84ba61e_.py
Outdated
nullable=True)) | ||
|
||
ss.sync_trigger(conn, 'queries', 'search_vector', | ||
['name', 'description', 'query']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this also update all existing queries?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, looking at the docs it says: "Updates all rows for given search vector by running a column=column update query for given table."
I've also manually added a bunch of queries against Redash's own Postgres database in the master branch, then switch to this branch, did the Alembic upgrade and then was able to use the search feature right away.
conn = op.get_bind() | ||
|
||
ss.drop_trigger(conn, 'queries', 'search_vector') | ||
op.drop_index('ix_queries_search_vector', table_name='queries') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a byproduct of ss.sync_trigger
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, although weirdly it raised a ProgrammingError exception when I tried roll it back, will look into this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was an oversight when setting up SA-Searchable. The migration that I fixed in eb6966f now reflects adding an index for the search vector column. Another call to db.configure_mappers()
was needed to have the mapper processor of SA-Searchable to correctly add the index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Another call to db.configure_mappers() was needed
It seems that it's at the top level of redash.models
. So I guess the one in redash.cli.database
is redundant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True indeed, good catch, removed it in 36aeef7.
36aeef7
to
294cff0
Compare
@arikfr Just added a bunch of tests for the more advanced search queries as described on https://sqlalchemy-searchable.readthedocs.io/en/latest/search_query_parser.html. |
294cff0
to
a9cc685
Compare
@jezdez this is finalyl ready to be merged, but I think I saw some fixes you did to this code in Mozilla's fork. Are they present here already or this need an update? Thanks. |
@arikfr They aren't in this PR yet, I'll get them in today and let you know. Thanks for reaching out again! |
Super, thanks! |
f80e088
to
388ec5e
Compare
388ec5e
to
9f6b4fb
Compare
@arikfr Okay, I've updated the PR with the recent things in our fork and updated the design to match the query-list and my-queries page as well. Does this work? |
@arikfr BTW, apologies for taking so long :( |
@arikfr Any idea how we can retrigger the code climate check? It seems to be stuck. |
Thanks ! 👍 |
Add full text search for queries based on the Postgres tsvector type.
This needs some more tests with a prod like database and probably another
look for someone with more experience with custom alembic migrations.
This is related to #1400 of course and uses SQLAlchemy-searchable.