Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

V52 database migration breaks existing Marquez installations #2298

Closed
collado-mike opened this issue Dec 8, 2022 · 2 comments · Fixed by #2308
Closed

V52 database migration breaks existing Marquez installations #2298

collado-mike opened this issue Dec 8, 2022 · 2 comments · Fixed by #2308

Comments

@collado-mike
Copy link
Collaborator

The migration introduced in #2217 is not backward compatible with existing marquez installations. The following exception is thrown during the flyway migration.
Database migrations can't change the types of columns in existing views in order to change the type, it is necessary to drop the existing view, then alter the columns, and recreate the view.

Marquez is missing a way of testing these migrations against existing installations. We should figure out a way to validate these scripts before merging to main.

{"exception":"org.postgresql.util.PSQLException: ERROR: cannot alter type of a column used by a view or rule
  Detail: rule _RETURN on view datasets_view depends on column \"name\"
	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2676)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2366)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:356)
	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:496)
	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:413)
	at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:333)
	at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:319)
	at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:295)
	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:290)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.apache.tomcat.jdbc.pool.StatementFacade$StatementProxy.invoke(StatementFacade.java:118)
	at jdk.proxy2/jdk.proxy2.$Proxy47.execute(Unknown Source)
	at org.flywaydb.core.internal.jdbc.JdbcTemplate.executeStatement(JdbcTemplate.java:201)
	at org.flywaydb.core.internal.sqlscript.ParsedSqlStatement.execute(ParsedSqlStatement.java:95)
	at org.flywaydb.core.internal.sqlscript.DefaultSqlScriptExecutor.executeStatement(DefaultSqlScriptExecutor.java:210)
	... 28 common frames omitted
Wrapped by: org.flywaydb.core.internal.sqlscript.FlywaySqlScriptException: Migration V52__alter_dataset_symlinks.sql failed
------------------------------------------------
SQL State  : 0A000
Error Code : 0
Message    : ERROR: cannot alter type of a column used by a view or rule
  Detail: rule _RETURN on view datasets_view depends on column \"name\"
Location   : marquez/db/migration/V52__alter_dataset_symlinks.sql (/usr/src/app/file:/usr/src/app/datakin-api.jar!/marquez/db/migration/V52__alter_dataset_symlinks.sql)
Line       : 2
Statement  : /* SPDX-License-Identifier: Apache-2.0 */
ALTER TABLE dataset_symlinks ALTER COLUMN name TYPE VARCHAR

	at org.flywaydb.core.internal.sqlscript.DefaultSqlScriptExecutor.handleException(DefaultSqlScriptExecutor.java:275)
	at org.flywaydb.core.internal.sqlscript.DefaultSqlScriptExecutor.executeStatement(DefaultSqlScriptExecutor.java:222)
	at org.flywaydb.core.internal.sqlscript.DefaultSqlScriptExecutor.execute(DefaultSqlScriptExecutor.java:126)
	at org.flywaydb.core.internal.resolver.sql.SqlMigrationExecutor.executeOnce(SqlMigrationExecutor.java:69)
	at org.flywaydb.core.internal.resolver.sql.SqlMigrationExecutor.lambda$execute$0(SqlMigrationExecutor.java:58)
	at org.flywaydb.core.internal.database.DefaultExecutionStrategy.execute(DefaultExecutionStrategy.java:27)
	at org.flywaydb.core.internal.resolver.sql.SqlMigrationExecutor.execute(SqlMigrationExecutor.java:57)
	at org.flywaydb.core.internal.command.DbMigrate.doMigrateGroup(DbMigrate.java:377)
	... 22 common frames omitted
Wrapped by: org.flywaydb.core.internal.command.DbMigrate$FlywayMigrateException: Migration V52__alter_dataset_symlinks.sql failed
------------------------------------------------
SQL State  : 0A000
Error Code : 0
Message    : ERROR: cannot alter type of a column used by a view or rule
  Detail: rule _RETURN on view datasets_view depends on column \"name\"
Location   : marquez/db/migration/V52__alter_dataset_symlinks.sql (/usr/src/app/file:/usr/src/app/datakin-api.jar!/marquez/db/migration/V52__alter_dataset_symlinks.sql)
Line       : 2
Statement  : /* SPDX-License-Identifier: Apache-2.0 */
ALTER TABLE dataset_symlinks ALTER COLUMN name TYPE VARCHAR

	at org.flywaydb.core.internal.command.DbMigrate.doMigrateGroup(DbMigrate.java:385)
	at org.flywaydb.core.internal.command.DbMigrate.lambda$applyMigrations$1(DbMigrate.java:275)
	at org.flywaydb.core.internal.jdbc.TransactionalExecutionTemplate.execute(TransactionalExecutionTemplate.java:55)
	at org.flywaydb.core.internal.command.DbMigrate.applyMigrations(DbMigrate.java:274)
	at org.flywaydb.core.internal.command.DbMigrate.migrateGroup(DbMigrate.java:247)
	at org.flywaydb.core.internal.command.DbMigrate.lambda$migrateAll$0(DbMigrate.java:141)
	at org.flywaydb.core.internal.database.postgresql.PostgreSQLAdvisoryLockTemplate.execute(PostgreSQLAdvisoryLockTemplate.java:69)
	at org.flywaydb.core.internal.database.postgresql.PostgreSQLConnection.lock(PostgreSQLConnection.java:99)
	at org.flywaydb.core.internal.schemahistory.JdbcTableSchemaHistory.lock(JdbcTableSchemaHistory.java:139)
	at org.flywaydb.core.internal.command.DbMigrate.migrateAll(DbMigrate.java:141)
	at org.flywaydb.core.internal.command.DbMigrate.migrate(DbMigrate.java:98)
	at org.flywaydb.core.Flyway$1.execute(Flyway.java:172)
	at org.flywaydb.core.Flyway$1.execute(Flyway.java:124)
	at org.flywaydb.core.FlywayExecutor.execute(FlywayExecutor.java:205)
	at org.flywaydb.core.Flyway.migrate(Flyway.java:124)
	at marquez.db.DbMigration.migrateDbOrError(DbMigration.java:38)
	at com.datakin.DatakinApp.run(DatakinApp.java:216)
	at com.datakin.DatakinApp.run(DatakinApp.java:90)
	at io.dropwizard.cli.EnvironmentCommand.run(EnvironmentCommand.java:67)
	at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:98)
	at io.dropwizard.cli.Cli.run(Cli.java:78)
	at io.dropwizard.Application.run(Application.java:94)
	at com.datakin.DatakinApp.main(DatakinApp.java:109)
","thread":"main","message":"Stopping app...","level":"ERROR","timestamp":1670462767756,"logger":"com.datakin.DatakinApp"}
@pawel-big-lebowski
Copy link
Collaborator

pawel-big-lebowski commented Dec 8, 2022

The problem here are the repeatable migrations https://flywaydb.org/documentation/concepts/migrations.html#repeatable-migrations which according to the docs:

Within a single migration run, repeatable migrations are always applied last, after all pending versioned migrations have been executed.

This in fact means that we do not test it properly as our clean docker environments do not run repeatable migrations before the versioned migrations. In other words, our tests never experienced scenario with a view being present which is a cause of this bug.

Solution to this could be:

  • include drop view datasets_view at the beginning of V52__alter_dataset_symlinks.sql,
  • drop within the script all the other views to avoid such errors in future

I agree we need a more extensive tests that test migration at different states.

@collado-mike
Copy link
Collaborator Author

A possible approach would be to spin up an existing marquez docker image (like the last release or something), then try to apply flyway migration to the database afterward. That should catch at least structural incompatibilities (it won't catch issues in the data).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants