Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

data migrations will fail if subsequent versions add columns to back-referenced tables #1500

Closed
cfm opened this issue May 27, 2022 · 1 comment · Fixed by #1517
Closed

data migrations will fail if subsequent versions add columns to back-referenced tables #1500

cfm opened this issue May 27, 2022 · 1 comment · Fixed by #1517
Assignees
Labels
bug Something isn't working database SQLite, SQLAlchemy, and data model

Comments

@cfm
Copy link
Member

cfm commented May 27, 2022

Description

If an Alembic version A has an associated data migration (or test) that queries a back-referenced table T, and a subsequent Alembic version B adds a column T.C, A's data migration (or test) will start to fail.

Steps to Reproduce

  1. Apply the diff:
diff --git a/alembic/versions/3b39e60f71b6_test.py b/alembic/versions/3b39e60f71b6_test.py
new file mode 100644
index 0000000..2a16cc2
--- /dev/null
+++ b/alembic/versions/3b39e60f71b6_test.py
@@ -0,0 +1,25 @@
+"""test
+
+Revision ID: 3b39e60f71b6
+Revises: 9ba8d7524871
+Create Date: 2022-05-26 17:20:59.611862
+
+"""
+from alembic import op
+import sqlalchemy as sa
+
+
+# revision identifiers, used by Alembic.
+revision = '3b39e60f71b6'
+down_revision = '9ba8d7524871'
+branch_labels = None
+depends_on = None
+
+
+def upgrade():
+    op.add_column('messages', sa.Column('test', sa.Boolean(), nullable=True))
+
+
+def downgrade():
+    with op.batch_alter_table("messages", schema=None) as batch_op:
+        batch_op.drop_column('test')
diff --git a/securedrop_client/db.py b/securedrop_client/db.py
index a3c213a..1c880e7 100644
--- a/securedrop_client/db.py
+++ b/securedrop_client/db.py
@@ -212,6 +212,7 @@ class Message(Base):
     file_counter = Column(Integer, nullable=False)
     size = Column(Integer, nullable=False)
     download_url = Column(String(255), nullable=False)
+    test = Column(Boolean)
 
     # This is whether the submission has been downloaded in the local database.
     is_downloaded = Column(Boolean(name="is_downloaded"), nullable=False, server_default=text("0"))
  1. TESTS=tests/test_alembic.py make test

Expected Behavior

The test suite passes.

Actual Behavior

Data-migration tests fail:

[...]
self = <sqlalchemy.dialects.sqlite.pysqlite.SQLiteDialect_pysqlite object at 0x7dc05eb7d978>
cursor = <sqlite3.Cursor object at 0x7dc05e87ff80>
statement = 'SELECT messages.id AS messages_id, messages.uuid AS messages_uuid, messages.filename AS messages_filename, messages.f...id, messages.last_updated AS messages_last_updated \nFROM messages \nWHERE ? = messages.source_id ORDER BY messages.id'
parameters = (1,)
context = <sqlalchemy.dialects.sqlite.base.SQLiteExecutionContext object at 0x7dc05e8d18d0>

    def do_execute(self, cursor, statement, parameters, context=None):
>       cursor.execute(statement, parameters)
E       sqlite3.OperationalError: no such column: messages.test

.venv/lib/python3.7/site-packages/sqlalchemy/engine/default.py:552: OperationalError

The above exception was the direct cause of the following exception:

alembic_config = '/tmp/sdc-o2looq6b/alembic.ini', config = '/tmp/sdc-o2looq6b/config.json'
migration = 'bd57477f19a2', homedir = '/tmp/sdc-o2looq6b'

    @pytest.mark.parametrize("migration", DATA_MIGRATIONS)
    def test_alembic_migration_downgrade_with_data(alembic_config, config, migration, homedir):
        """
        Upgrade to the target migration, load data, then downgrade in order to test that the downgrade
        is successful when there is data.
        """
        upgrade(alembic_config, migration)
        mod_name = "tests.migrations.test_{}".format(migration)
        mod = __import__(mod_name, fromlist=["DowngradeTester"])
        downgrade_tester = mod.DowngradeTester(homedir)
>       downgrade_tester.load_data()

tests/test_alembic.py:181: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
tests/migrations/test_bd57477f19a2.py:167: in load_data
    add_file(self.session, source_id)
tests/migrations/utils.py:100: in add_file
    file_counter = len(source.collection) + 1
securedrop_client/db.py:119: in collection
    collection.extend(self.messages)
[...]
FAILED tests/test_alembic.py::test_alembic_migration_upgrade_with_data[bd57477f19a2] - sqlalchemy.exc...
FAILED tests/test_alembic.py::test_alembic_migration_upgrade_with_data[a4bf1f58ce69] - sqlalchemy.exc...
FAILED tests/test_alembic.py::test_alembic_migration_downgrade_with_data[bd57477f19a2] - sqlalchemy.e...
[...]

Comments

When A's data migrations run, B's new column T.C is defined in the SQLAlchemy model and therefore included in SQLAlchemy's queries. But because A's data migrations run (by definition) before B has been applied, C is not present in T. In other words, at data-migration time, the model definition (from Git head) is ahead of the table schema (from Alembic head).

@cfm cfm added bug Something isn't working database SQLite, SQLAlchemy, and data model labels May 27, 2022
@cfm
Copy link
Member Author

cfm commented Jun 7, 2022

Thanks to @legoktm for confirming that it's an anti-pattern to use SQLAlchemy models in Alembic data migrations. In discussion today, however, we discovered that we can safely squash all our migrations to date, and eliminate the data migrations (and their fixtures) at issue here, in the version we'll release for Qubes 4.1, for which a full reinstallation is required. Going forward, we can document that Alembic data migrations MUST NOT refer to SQLAlchemy models.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working database SQLite, SQLAlchemy, and data model
Projects
None yet
1 participant