Change: Replace Peewee with SQLAlchemy/Alembic #1417
Exciting 😃
-return fn(*args, **kwargs)
-except DoesNotExist:
+rv = fn(*args, **kwargs)
+if rv is None:
SQLA will never raise an exception for missing rows?
Its .get() and .first() methods return None when there's no entry, so I believe that's correct.
Maybe we can get rid of this helper, as it seems that Flask-SQLAlchemy has its own: first_or_404 and get_or_404.
SQLA will never raise an exception for missing rows?
See also .one() and .one_or_none()
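For reference, the helper under discussion can be sketched without any ORM at all. Since the lookup returns None instead of raising, the wrapper only needs a None check. The names `NotFound`, `or_404`, `get_query` and the in-memory dict are hypothetical and are not taken from the branch; in a Flask view the error would be raised through `abort(404)`.

```python
import functools


class NotFound(Exception):
    """Hypothetical stand-in for the HTTP 404 error a Flask view would raise."""


def or_404(fn):
    """Wrap a lookup so that a None result becomes a NotFound error."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        rv = fn(*args, **kwargs)
        if rv is None:
            raise NotFound("no matching row")
        return rv
    return wrapper


# Hypothetical in-memory "table", used only to exercise the wrapper.
_QUERIES = {1: "select 1"}


@or_404
def get_query(query_id):
    # dict.get returns None for a missing key, much like .get() / .first().
    return _QUERIES.get(query_id)
```

If Flask-SQLAlchemy's first_or_404 and get_or_404 cover every call site, a wrapper like this becomes unnecessary.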
@@ -4,6 +4,7 @@ Flask-Admin==1.1.0
 Flask-RESTful==0.3.5
 Flask-Login==0.3.2
 Flask-OAuthLib==0.9.2
+Flask-SQLAlchemy==2.1
In another project I used alchy which also has Flask-Alchy which is a drop-in replacement for it. The benefit is that we won't need Flask session whenever we're using the DB. The downside is that alchy never seemed to gain mind share unlike Flask-SQLAlchemy.
TL;DR: we can keep Flask-SQLA and see if it adds too much boilerplate to the jobs code. If it doesn't, keep it. Otherwise consider alchy.
My suspicion is that it won't add boilerplate; calling create_app() should be most of it.
👍
Re: alchy, it seems the author's mostly moved onto https://github.com/dgilland/sqlservice#history
data_source_id = Column(db.Integer, db.ForeignKey("data_sources.id"), nullable=True)
data_source = db.relationship(DataSource)
latest_query_data_id = Column(db.Integer, db.ForeignKey("query_results.id"), nullable=True)
latest_query_data = db.relationship(QueryResult)
Why do we need both latest_query_data_id and latest_query_data? (This applies to all similar fields.) Doesn't SQLA have a convenience method to get the object id without loading the object itself?
SQLA is a bit more explicit about this stuff; the foo_id field is the actual db column, and foo is the attribute for the related ORM object.
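To make the distinction concrete, here is a plain-Python sketch of the idea, not SQLAlchemy's implementation. `QUERY_RESULTS`, `LOADS` and this `Query` class are hypothetical stand-ins: the `_id` attribute is a value already present on the row, while the related attribute triggers a lookup when accessed.

```python
# Hypothetical in-memory stand-in for the query_results table.
QUERY_RESULTS = {7: {"id": 7, "data": "[]"}}
LOADS = []  # records every simulated lookup against query_results


class Query:
    def __init__(self, latest_query_data_id=None):
        # Plain column value: stored on the row itself.
        self.latest_query_data_id = latest_query_data_id

    @property
    def latest_query_data(self):
        # Related object: fetched on access, using the foreign key value.
        if self.latest_query_data_id is None:
            return None
        LOADS.append(self.latest_query_data_id)
        return QUERY_RESULTS[self.latest_query_data_id]


q = Query(latest_query_data_id=7)
fk_only = q.latest_query_data_id   # reads the column, no lookup
loads_after_fk = len(LOADS)
related = q.latest_query_data      # one simulated lookup
loads_after_related = len(LOADS)
```

So code that only needs the id can read the foo_id column directly and skip loading the related object. Unlike this sketch, a real ORM relationship would typically cache the loaded object after the first access.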
👍
Thanks for working on this! I added a few comments, hopefully they're helpful... didn't have time to review it all.
init_db()
db.session.commit()
Why do you need the db.session.commit()?
I assume create_db or init_db calls Flask-SQLAlchemy's create_all(), which will implicitly call a commit:
http://stackoverflow.com/questions/34410091/flask-sqlalchemy-how-can-i-call-db-create-all-and-db-drop-all-without-trigg
class TimestampMixin(object):
    updated_at = Column(db.DateTime(True), default=db.func.now(),
                        onupdate=db.func.now(), nullable=False)
    created_at = Column(db.DateTime(True), default=db.func.now(),
Maybe this should use server_default?
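The practical difference shows up when rows are inserted outside the ORM. A minimal sketch using the standard-library sqlite3 module (the table names are hypothetical, and the project itself targets PostgreSQL): a default declared in the schema, which is what server_default emits, applies to every INSERT, while a client-side default only exists in application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DEFAULT lives in the schema, as a server-side default would.
conn.execute(
    "CREATE TABLE with_server_default ("
    " id INTEGER PRIMARY KEY,"
    " created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP)"
)
# No DEFAULT in the schema: the application has to supply the value,
# which is what a client-side default= does inside the ORM.
conn.execute(
    "CREATE TABLE with_client_default ("
    " id INTEGER PRIMARY KEY,"
    " created_at TIMESTAMP)"
)

# Raw INSERTs that bypass the ORM, e.g. from a migration or a SQL shell.
conn.execute("INSERT INTO with_server_default (id) VALUES (1)")
conn.execute("INSERT INTO with_client_default (id) VALUES (1)")

server_value = conn.execute(
    "SELECT created_at FROM with_server_default").fetchone()[0]
client_value = conn.execute(
    "SELECT created_at FROM with_client_default").fetchone()[0]
```

Here `server_value` holds a timestamp filled in by the database, and `client_value` is NULL because nothing in the schema supplied it.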
 class BelongsToOrgMixin(object):
     @classmethod
     def get_by_id_and_org(cls, object_id, org):
-        return cls.get(cls.id == object_id, cls.org == org)
+        return cls.query.filter(cls.id == object_id, cls.org == org).first()
Will this query ever return more than one result? If not, it should probably use one() or one_or_none().
     @classmethod
     def get_by_slug(cls, slug):
-        return cls.get(cls.slug == slug)
+        return cls.query.filter(cls.slug == slug).first()
I suspect this should also be a one() or one_or_none(), as I think you want errors if more than one result is returned for a given slug.
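The behavioural difference between the three methods can be mimicked over a plain list. This is only a sketch of the semantics, not SQLAlchemy code; the two exception classes are local stand-ins named after SQLAlchemy's NoResultFound and MultipleResultsFound.

```python
class NoResultFound(Exception):
    """Local stand-in named after SQLAlchemy's exception."""


class MultipleResultsFound(Exception):
    """Local stand-in named after SQLAlchemy's exception."""


def first(rows):
    # Takes the first row and silently ignores any others.
    rows = list(rows)
    return rows[0] if rows else None


def one_or_none(rows):
    # Zero rows is fine (None); more than one row is an error.
    rows = list(rows)
    if len(rows) > 1:
        raise MultipleResultsFound("expected at most one row")
    return rows[0] if rows else None


def one(rows):
    # Exactly one row is required.
    rows = list(rows)
    if not rows:
        raise NoResultFound("expected exactly one row")
    if len(rows) > 1:
        raise MultipleResultsFound("expected exactly one row")
    return rows[0]


# Two rows sharing a slug: first() hides the duplicate, one() would not.
duplicated_slug_rows = ["dashboard-a", "dashboard-b"]
picked = first(duplicated_slug_rows)
```

With a duplicated slug, `first` returns a row without complaint, whereas `one` and `one_or_none` raise, which is the error being asked for here.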
name = Column(db.String(100))
permissions = Column(postgresql.ARRAY(db.String(255)),
                     default=DEFAULT_PERMISSIONS)
created_at = Column(db.DateTime(True), default=db.func.now())
Probably want server_default()
Probably, but I don't want to change the schema until after I get things working as-is. (Hi Jeff! It's been a while! Never expected to see someone from SFSH commenting on my code :-)
Now I'm curious what SFSH is :-)
Fixes #1124
@washort I finished with my work on the frontend (for now) and want to give you a hand here. I rebased your branch on the latest master and fixed an issue with the settings/DATABASE_URL. Do you have some unpushed work, or can I do a force push with these changes?
I updated the test code and now we get real failures, but many tests still fail just because the database runs out of connections. I tried to compare how we manage the connection/session in […]. Calling engine#dispose (9f43542) seems to fix this, but is it the right usage? Why haven't I seen this in any other example?
Another SQLA question:
@classmethod
def get_by_id_and_org(cls, visualization_id, org):
    return cls.query.join(Query).filter(cls.id == visualization_id, Query.org == org).one()
With peewee I could pass to such a method either […] Any middle ground?
No, you have to match up the right value with the right attribute. This is only really an issue in unit tests though, I think, because as far as I can tell, in the rest of the code you should only be using object ids when they're in query parameters; the rest of the time you can just pass around objects.
Working on this branch today was a reminder of why I never liked SQLAlchemy in the first place :-\ It's very powerful, but why is the simple stuff so hard and verbose?
@washort But I had to force push the result over your branch... I hope you didn't have anything uncommitted.
Looks like we're experiencing test failures due to webpack not running in CircleCI.
I'm not 100% sure about this. How will it work? Any reference implementation/documentation?
All tests pass now (I changed the configuration to run Webpack) 💯 But I grepped the code for things like
That's why I fixed the tests first -- to see where it'd be profitable to write more tests :-) Re bridge tables - this is what I was looking at: http://docs.sqlalchemy.org/en/latest/_modules/examples/generic_associations/table_per_related.html
This will work for
    updated_at = Column(db.DateTime(True), default=db.func.now(),
                        onupdate=db.func.now(), nullable=False)
    created_at = Column(db.DateTime(True), default=db.func.now(),
                        nullable=False)


class ChangeTrackingMixin(object):
How about implementing this with a before_update or after_update event?
It seems that SQLA has the tools to determine if something was changed in the event. I tried to experiment with it, but couldn't get the event to trigger. :\
I found out why the event didn't trigger and implemented it: e8739b3
If you have no comments, I will push this change to your branch.
The part I'm not happy about is how we deduce the user who changed the object, but I'm not sure it's that bad. Eventually this code is only relevant for the API, which is Flask based... and I added some safeguards to make sure it doesn't cause harm outside of Flask context.
Another issue with the way I set who changed the object is that it can't be changed :-\ Not a huge deal, mainly an issue in tests at the moment but feels wrong.
I'll have a look at this next. What was the issue with the way it's done now?
Mainly the fact that you need to call record_changes, and the manual "calculation" of what changed.
But apparently doing it in the after_insert/after_update events is wrong (SQLA complains about using Session.add there), and using before_flush also introduces its own challenges.
If I don't find a solution for this today, I will revert to your version, apply record_changes where needed and revisit this in the future.
Ah. Yeah I didn't want to try to get too magic at this point, might be interesting to investigate later.
I always try to maintain balance between "magic" and "hassle" :-) At first it seemed like a good balance point here, but as this starts to become too complex, I think I will revert to the explicit version you had.
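For readers following along, the explicit approach can be sketched in plain Python. This is not the code from the branch: the `ChangeTracking` class, its field names and the "user-1" value are hypothetical, and only the method name record_changes comes from the discussion above. The object remembers the value each attribute had before its first modification, and the caller decides when to record the change and who made it.

```python
class ChangeTracking:
    """Hypothetical explicit tracker: remembers the original attribute values."""

    def __init__(self, **fields):
        # Bypass the tracking __setattr__ while setting the initial state.
        object.__setattr__(self, "_original", {})
        for name, value in fields.items():
            object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        # Remember only the value seen before the first modification.
        if name not in self._original and hasattr(self, name):
            self._original[name] = getattr(self, name)
        object.__setattr__(self, name, value)

    def record_changes(self, changed_by):
        # Compare remembered values with current ones, then reset the tracker.
        changes = {
            name: {"previous": old, "current": getattr(self, name)}
            for name, old in self._original.items()
            if old != getattr(self, name)
        }
        self._original.clear()
        return {"changed_by": changed_by, "changes": changes}


query = ChangeTracking(name="old name", description="unchanged")
query.name = "new name"
query.name = "newer name"
recorded = query.record_changes(changed_by="user-1")
```

The cost is the "hassle" mentioned above: every code path that modifies an object has to remember to call record_changes. The benefit is that the user is passed in explicitly rather than deduced from the Flask context.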
Otherwise we were running out of connections.
@washort I did some updates to the CLI tests:
There are still some (4?) tests failing because the CLI creates its own app_context/db session. I'm not sure how to solve this :-( One option is to create our own
Also moved old migrations to old_migrations folder (before deleting them entirely).
Added Alembic (with Flask-Migrate). Updated the tasks to reflect integration status.
 from redash import models
 from redash.query_runner import query_runners
 from redash.query_runner import get_configuration_schema_for_query_runner_type
 from redash.utils.configuration import ConfigurationContainer

-manager = click.Group(help="Data sources management commands.")
+manager = AppGroup(help="Data sources management commands.")
Oh nice, I didn't see that.
Aside from more fixes to broken functionality (if there is anything left) the only thing I want to add in this branch before merging is "Replace MeteredModel with SQLAlchemy timing events". All the rest can be a follow up IMO.
Never mind, I was looking at
Let's keep the functionality of overall timing per request and # of queries executed. Until now this is the only information I actually used.
On Fri, 9 Dec 2016 at 20:52 Allen Short wrote:
Had a look at the query-timing stuff available in SQLAlchemy - the main
difficulty with replicating the current behavior is that by the time
queries are executed, the only information available is the query text and
its parameters; model class, method name etc. aren't accessible.
Any thoughts on how you want to handle this? Obviously we can parse the
SQL to retrieve table names and operation type, if that's the path you
prefer.
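A rough sketch of the reduced requirement (total time and number of queries per request), plus the crude "parse the SQL" option for the operation type. It uses the standard-library sqlite3 module so it runs on its own; `QueryStats` and `timed_execute` are hypothetical names. In the real application the recording would presumably hang off SQLAlchemy's before_cursor_execute and after_cursor_execute engine events, which likewise receive only the statement text and its parameters.

```python
import sqlite3
import time


class QueryStats:
    """Hypothetical per-request counters: number of queries and total time."""

    def __init__(self):
        self.count = 0
        self.total_seconds = 0.0
        self.by_operation = {}

    def record(self, statement, seconds):
        self.count += 1
        self.total_seconds += seconds
        # Crude parse: the first keyword of the SQL text is the operation type.
        operation = statement.lstrip().split(None, 1)[0].upper()
        self.by_operation[operation] = self.by_operation.get(operation, 0) + 1


def timed_execute(conn, stats, statement, parameters=()):
    """Run one statement and record its duration in stats."""
    started = time.perf_counter()
    try:
        return conn.execute(statement, parameters)
    finally:
        stats.record(statement, time.perf_counter() - started)


conn = sqlite3.connect(":memory:")
stats = QueryStats()
timed_execute(conn, stats,
              "CREATE TABLE queries (id INTEGER PRIMARY KEY, name TEXT)")
timed_execute(conn, stats, "INSERT INTO queries (name) VALUES (?)", ("q1",))
timed_execute(conn, stats, "SELECT name FROM queries WHERE id = ?", (1,))
```

After these three statements `stats.count` is 3 and `stats.by_operation` has one entry each for CREATE, INSERT and SELECT. Model class and method name are not recoverable this way, which matches the limitation described above.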
In that case I think we're done. I still want to change the schema a bit but that can happen in a new branch.
I've added some more metrics and .... it's merged! :)
Major congrats you two, this was a lot of work! PS: @washort indeed surprised to see you on here. Hope you and fam are well. This is kinda like the inverse of when an IRL coworker told me he was googling something and found what he needed on SO, then realized I'd written the answer. @arikfr SFSH is a mailing list of a loosely affiliated group of folks, many (but not all) of whom attend gracepres.com.
[…] is_draft one)
[…] old_migrations (might just delete it)
[…] alembic stamp head when creating the DB for the first time.
Replace MeteredModel with SQLAlchemy timing events