Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Native JSON type instead of LONGTEXT for MariaDB #5100

Closed
wants to merge 4 commits into from

Conversation

alexander-schranz
Copy link

Q A
Type bug/feature/improvement
BC Break yes/no
Fixed issues fixes #3202

Summary

Use JSON type instead of LONGTEXT for MariaDB. See #3202

@@ -19,7 +19,7 @@ class MariaDb1027Platform extends MySQLPlatform
*/
public function getJsonTypeDeclarationSQL(array $column): string
{
return 'LONGTEXT';
return 'JSON';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a BC-break, but maybe you just wanted to validate that tests stay green with this change? If yes, please set the PR status to draft until it is finished.

As advised, I think a platform option would allow people to switch to JSON while retaining backwards-compatibility for others. What do you think?

Copy link
Author

@alexander-schranz alexander-schranz Dec 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

platform option

Do you have an example for a platform option and how they could be used here? How are platform options configured? Are the queries params in the dsn?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a BC-break […]

In this regard, the mappings are internal implementation details of the DBAL, so whether it's a BC break, depends on whether the DBAL can migrate the schema from the old mapping to the new one seamlessly.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change seems to have no effect for doctrine:schema:update or bin/console doctrine:migrations:diff. As for the diff:

`data` longtext COLLATE utf8mb4_unicode_ci NOT NULL COMMENT '(DC2Type:json)',

is the same as the new schema with:

`data` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL CHECK (json_valid(`new_data`)),

So this would be no BC Break. If somebody want to upgrade from (DC2Type:json) to native mariadb JSON with json_valid check they need to do it manually or drop the column first.

Copy link
Member

@morozov morozov Dec 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change seems to have no effect for doctrine:schema:update or bin/console doctrine:migrations:diff.

@alexander-schranz could you provide some more details on how you used these commands? This way, we could reproduce your scenario and make sure that the necessary cases have been tested.

Copy link
Author

@alexander-schranz alexander-schranz Dec 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@morozov I don't see this behaviour as an issue.

I have an exist database with a table which looks like this (output of: SHOW CREATE TABLE app_activity):

CREATE TABLE `app_activity` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `data` longtext COLLATE utf8mb4_unicode_ci NOT NULL COMMENT '(DC2Type:json)',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

And entity which looks like this:

/**
 * @ORM\Entity
 * @ORM\Table(name="app_activity")
 */
class Activity
{
    /**
     * @ORM\Id
     * @ORM\Column(type="integer")
     * @ORM\GeneratedValue
     */
    private int $id;
    
    /**
     * @ORM\Column(type="json")
     *
     * @var mixed[]
     */
    private array $$data;
}

And when then use this current branch of doctrine/dbal the bin/console doctrine:schema:update --dump-sql or bin/console doctrine:migrations:diff does not change anything about this database. As from DBAL perspective the columns is already json (this is because of (DC2Type:json)).

Only when dropping the table/database and creating the schema again it will create it with the as native json type:

CREATE TABLE app_activity (
   id INT AUTO_INCREMENT NOT NULL,
   data JSON NOT NULL,
   PRIMARY KEY(id)
) DEFAULT CHARACTER SET utf8mb4 COLLATE `utf8mb4_unicode_ci` ENGINE = InnoDB;

Output of: SHOW CREATE TABLE app_activity is here correctly with the json_valid method then the JSON is still a longtext here:

CREATE TABLE `app_activity` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `data` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL CHECK (json_valid(`data`)),
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

As written above I don't see this as an issue. It will just create future json fields with json_valid method. And so it will not be breaking change as it will not effect exist tables.

Copy link
Member

@morozov morozov Dec 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexander-schranz I may have said it wrong, I didn't imply that there is an issue.

From what you described, the current behavior looks right: as long as the new DBAL version doesn't produce an invalid diff while comparing the local schema with the one deployed by the old version, it should be fine.

I assume that the tools you used don't leverage the schema comparison improvement implemented in #4746. This way, the comparator compares only the column types and considers them equal.

I suspect that once provided a database platform, the comparator will detect a diff. The question is, will the SQL the target platform builds for this diff be valid. I can check that myself.

In the meantime, please rebase against 3.3.x since it's an improvement, not a bug fix.

Copy link
Member

@morozov morozov Dec 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The migration works well using both the old and the new comparison logic. But the introspection of the new column doesn't work properly (tested on MariaDB 10.3). Consider this:

$column = new Column('data', new JsonType(), []);
$localTable = new Table('test', [$column]);

$sm = $connection->createSchemaManager();
$sm->dropAndCreateTable($localTable);
$deployedTable = $sm->createSchema()
    ->getTable('test');
$comparator = $sm->createComparator();
$diff = $comparator->diffTable($deployedTable, $localTable);
$sql = $sm->getDatabasePlatform()->getAlterTableSQL($diff);
var_dump($sql);
// ALTER TABLE test CHANGE data data JSON NOT NULL

Since JSON is just an alias for LONGTEXT, the column is introspected as LONGTEXT which causes a false-positive diff.

@alexander-schranz alexander-schranz marked this pull request as draft December 7, 2021 12:40
@alexander-schranz alexander-schranz changed the title Use JSON type instead of LONGTEXT for MariaDB Use Native JSON type instead of LONGTEXT for MariaDB Dec 8, 2021
@alexander-schranz alexander-schranz marked this pull request as ready for review December 8, 2021 03:14
@derrabus derrabus changed the base branch from 3.2.x to 3.3.x January 18, 2022 00:42
@morozov
Copy link
Member

morozov commented Apr 13, 2022

Closing due to the lack of progress.

@morozov morozov closed this Apr 13, 2022
cgknx added a commit to cgknx/dbal that referenced this pull request Feb 9, 2023
MariaDb aliases columns specified as JSON to LONGTEXT. Since version 10.4.3,
MariaDb adds a CHECK constraint to JSON columns to ensure they contain valid
JSON and sets the column collation to utf8mb4_bin.

Simply setting JSON as the column type in the platform results in introspection
failures as the database reports LONGTEXT (not JSON) with changed collation.

Therefore, inverse the aliasing where the relevant check constraint exists and
ignore collation changes for JSON columns in the relevant database versions.

The modified methods are MariaDBPlatform->getListTableColumnsSQL() and
MySQLSchemaManager->selectTableColumns()

The changes have been implemented in the MariaDBPlatform (guarded by a check
for JSON as column type so they do not affect MariaDB1027Platform) and
MySQLSchemaManager (guarded by a platform test). It ought to be possible to
push most of the platform changes into the MySQLPlatform given the SQL query
changes are guarded by MariaDB specific executable comments and so should be a
no op for MySQL. Doing so would allow getDatabaseNameSQL() to be kept private.
However, because the changes are entirely MariaDb specific they have been made
in the MariaDBPlatform (this is also consistent with the plan to separate
MariaDb and MySQL platforms).

MariaDb1043Platform extends MariaDb1027Platform rather than MariaDBPlatform to
ensure the platform checks in MySQLSchemaManager (based on MariaDb1027Platform)
continue to work as expected.

The alternative approach of adding the CHECK constraint to the SQL for the
column is not robust as MariaDb syntax seems to require the CHECK constraint to
be the final part of the column declaration and it was not obvious how to
guarantee this (specifically, appending the CHECK constraint to the output of
getJsonTypeDeclarationSQL() did not work).

New public methods:
MariaDBPlatform->getInverseJsonToLongtextAliasSQL() generates SQL snippets to
inverse the JSON to LONGTEXT aliasing for JSON-specified columns. It is marked
public because it is used in MySQLSchemaManager.

Overridden methods:
MariaDb1043Platform->getJsonTypeDeclarationSQL(). To return JSON.

MariaDBPlatform->getListTableColumnsSQL(). To reverse the JSON to LONGTEXT
                                           aliasing carried out by MariaDb.

MariaDBPlatform->MariaDBColumnToArray(). To unset collation and charset in
                                         column array (so parent needed to be
                                         marked protected).

New test cases:
1. Types/JsonTest. Test storage and retrieval of json data.
2. Platforms/MariaDb1043PlatformTest. A clone of MariaDb1027Platform test.
3. Schema/MySQLSchemaManagerTest->testColumnIntrospection(). Ensures
   introspected table matches original table. Tests all doctrine types not
   just json. Based on failure highlighted in pull request doctrine#5100.

For previous discussion on this topic, see:
doctrine#5100
doctrine#3202

Further background at:
https://mariadb.com/kb/en/json-data-type/
https://mariadb.com/kb/en/information-schema-check_constraints-table/
https://jira.mariadb.org/browse/MDEV-13916
cgknx added a commit to cgknx/dbal that referenced this pull request Feb 15, 2023
MariaDb aliases columns specified as JSON to LONGTEXT. Since version 10.4.3,
MariaDb adds a CHECK constraint to JSON columns to ensure they contain valid
JSON and sets the column collation to utf8mb4_bin.

Simply setting JSON as the column type in the platform results in introspection
failures as the database reports LONGTEXT (not JSON) with changed collation.

Therefore, inverse the aliasing where the relevant check constraint exists and
ignore collation changes for JSON columns in the relevant database versions.

The modified methods are MariaDBPlatform->getListTableColumnsSQL() and
MySQLSchemaManager->selectTableColumns()

The changes have been implemented in the MariaDBPlatform (guarded by a check
for JSON as column type so they do not affect MariaDB1027Platform) and
MySQLSchemaManager (guarded by a platform test). It ought to be possible to
push most of the platform changes into the MySQLPlatform given the SQL query
changes are guarded by MariaDB specific executable comments and so should be a
no op for MySQL. Doing so would allow getDatabaseNameSQL() to be kept private.
However, because the changes are entirely MariaDb specific they have been made
in the MariaDBPlatform (this is also consistent with the plan to separate
MariaDb and MySQL platforms).

MariaDb1043Platform extends MariaDb1027Platform rather than MariaDBPlatform to
ensure the platform checks in MySQLSchemaManager (based on MariaDb1027Platform)
continue to work as expected.

The alternative approach of adding the CHECK constraint to the SQL for the
column is not robust as MariaDb syntax seems to require the CHECK constraint to
be the final part of the column declaration and it was not obvious how to
guarantee this (specifically, appending the CHECK constraint to the output of
getJsonTypeDeclarationSQL() did not work).

New public methods:
MariaDBPlatform->getInverseJsonToLongtextAliasSQL() generates SQL snippets to
inverse the JSON to LONGTEXT aliasing for JSON-specified columns. It is marked
public because it is used in MySQLSchemaManager.

Overridden methods:
MariaDb1043Platform->getJsonTypeDeclarationSQL(). To return JSON.

MariaDBPlatform->getListTableColumnsSQL(). To reverse the JSON to LONGTEXT
                                           aliasing carried out by MariaDb.

MariaDBPlatform->MariaDBColumnToArray(). To unset collation and charset in
                                         column array (so parent needed to be
                                         marked protected).

New test cases:
1. Types/JsonTest. Test storage and retrieval of json data.
2. Platforms/MariaDb1043PlatformTest. A clone of MariaDb1027Platform test.
3. Schema/MySQLSchemaManagerTest->testColumnIntrospection(). Ensures
   introspected table matches original table. Tests all doctrine types not
   just json. Based on failure highlighted in pull request doctrine#5100.

For previous discussion on this topic, see:
doctrine#5100
doctrine#3202

Further background at:
https://mariadb.com/kb/en/json-data-type/
https://mariadb.com/kb/en/information-schema-check_constraints-table/
https://jira.mariadb.org/browse/MDEV-13916
cgknx added a commit to cgknx/dbal that referenced this pull request Feb 15, 2023
MariaDb aliases columns specified as JSON to LONGTEXT. Since version 10.4.3,
MariaDb adds a CHECK constraint to JSON columns to ensure they contain valid
JSON and sets the column collation to utf8mb4_bin.

Simply setting JSON as the column type in the platform results in introspection
failures as the database reports LONGTEXT (not JSON) with changed collation.

Therefore, inverse the aliasing where the relevant check constraint exists and
ignore collation changes for JSON columns in the relevant database versions.

The modified methods are MariaDBPlatform->getListTableColumnsSQL() and
MySQLSchemaManager->selectTableColumns()

The changes have been implemented in the MariaDBPlatform (guarded by a check
for JSON as column type so they do not affect MariaDB1027Platform) and
MySQLSchemaManager (guarded by a platform test). It ought to be possible to
push most of the platform changes into the MySQLPlatform given the SQL query
changes are guarded by MariaDB specific executable comments and so should be a
no op for MySQL. Doing so would allow getDatabaseNameSQL() to be kept private.
However, because the changes are entirely MariaDb specific they have been made
in the MariaDBPlatform (this is also consistent with the plan to separate
MariaDb and MySQL platforms).

MariaDb1043Platform extends MariaDb1027Platform rather than MariaDBPlatform to
ensure the platform checks in MySQLSchemaManager (based on MariaDb1027Platform)
continue to work as expected.

The alternative approach of adding the CHECK constraint to the SQL for the
column is not robust as MariaDb syntax seems to require the CHECK constraint to
be the final part of the column declaration and it was not obvious how to
guarantee this (specifically, appending the CHECK constraint to the output of
getJsonTypeDeclarationSQL() did not work).

New public methods:
MariaDBPlatform->getInverseJsonToLongtextAliasSQL() generates SQL snippets to
inverse the JSON to LONGTEXT aliasing for JSON-specified columns. It is marked
public because it is used in MySQLSchemaManager.

Overridden methods:
MariaDb1043Platform->getJsonTypeDeclarationSQL(). To return JSON.

MariaDBPlatform->getListTableColumnsSQL(). To reverse the JSON to LONGTEXT
                                           aliasing carried out by MariaDb.

MariaDBPlatform->MariaDBColumnToArray(). To unset collation and charset in
                                         column array (so parent needed to be
                                         marked protected).

New test cases:
1. Types/JsonTest. Test storage and retrieval of json data.
2. Platforms/MariaDb1043PlatformTest. A clone of MariaDb1027Platform test.
3. Schema/MySQLSchemaManagerTest->testColumnIntrospection(). Ensures
   introspected table matches original table. Tests all doctrine types not
   just json. Based on failure highlighted in pull request doctrine#5100.

For previous discussion on this topic, see:
doctrine#5100
doctrine#3202

Further background at:
https://mariadb.com/kb/en/json-data-type/
https://mariadb.com/kb/en/information-schema-check_constraints-table/
https://jira.mariadb.org/browse/MDEV-13916

v2 of JsonTest.php:
1. Simplify array shapes of json structures.

2. Deal with json returned as a stream not string.

   It seems Oracle 11 returns json as a stream so convert streams to
   strings.
cgknx added a commit to cgknx/dbal that referenced this pull request Feb 16, 2023
MariaDb aliases columns specified as JSON to LONGTEXT. Since version 10.4.3,
MariaDb adds a CHECK constraint to JSON columns to ensure they contain valid
JSON and sets the column collation to utf8mb4_bin.

Simply setting JSON as the column type in the platform results in introspection
failures as the database reports LONGTEXT (not JSON) with changed collation.

Therefore, inverse the aliasing where the relevant check constraint exists and
ignore collation changes for JSON columns in the relevant database versions.

The modified methods are MariaDBPlatform->getListTableColumnsSQL() and
MySQLSchemaManager->selectTableColumns()

The changes have been implemented in the MariaDBPlatform (guarded by a check
for JSON as column type so they do not affect MariaDB1027Platform) and
MySQLSchemaManager (guarded by a platform test). It ought to be possible to
push most of the platform changes into the MySQLPlatform given the SQL query
changes are guarded by MariaDB specific executable comments and so should be a
no op for MySQL. Doing so would allow getDatabaseNameSQL() to be kept private.
However, because the changes are entirely MariaDb specific they have been made
in the MariaDBPlatform (this is also consistent with the plan to separate
MariaDb and MySQL platforms).

MariaDb1043Platform extends MariaDb1027Platform rather than MariaDBPlatform to
ensure the platform checks in MySQLSchemaManager (based on MariaDb1027Platform)
continue to work as expected.

The alternative approach of adding the CHECK constraint to the SQL for the
column is not robust as MariaDb syntax seems to require the CHECK constraint to
be the final part of the column declaration and it was not obvious how to
guarantee this (specifically, appending the CHECK constraint to the output of
getJsonTypeDeclarationSQL() did not work).

New public methods:
MariaDBPlatform->getInverseJsonToLongtextAliasSQL() generates SQL snippets to
inverse the JSON to LONGTEXT aliasing for JSON-specified columns. It is marked
public because it is used in MySQLSchemaManager.

Overridden methods:
MariaDb1043Platform->getJsonTypeDeclarationSQL(). To return JSON.

MariaDBPlatform->getListTableColumnsSQL(). To reverse the JSON to LONGTEXT
                                           aliasing carried out by MariaDb.

MariaDBPlatform->MariaDBColumnToArray(). To unset collation and charset in
                                         column array (so parent needed to be
                                         marked protected).

New test cases:
1. Types/JsonTest. Test storage and retrieval of json data.
2. Platforms/MariaDb1043PlatformTest. A clone of MariaDb1027Platform test.
3. Schema/MySQLSchemaManagerTest->testColumnIntrospection(). Ensures
   introspected table matches original table. Tests all doctrine types not
   just json. Based on failure highlighted in pull request doctrine#5100.

For previous discussion on this topic, see:
doctrine#5100
doctrine#3202

Further background at:
https://mariadb.com/kb/en/json-data-type/
https://mariadb.com/kb/en/information-schema-check_constraints-table/
https://jira.mariadb.org/browse/MDEV-13916

v2
JsonTest.php:
1. Simplify array shapes of json structures.

2. Deal with json returned as a stream not string.

   It seems Oracle 11 returns json as a stream so convert streams to
   strings.

v3
JsonTest.php:

1. 'json_table' is a reserved word in MySQL (and a function name in
   Oracle SQL). Use json_test_table instead.

2. Some databases e.g. MySQL reorder json arrays so the insert order
   will not necessarily match the select order. Resort the arrays
   before testing for equality.

MySQLSchemaManager.php:

Remove 'instanceof MariaDb1027Platform' test as follows:

a. Add getColumnTypeSQLSnippets() method to AbstractMySQLPlatform which
   returns the default snippets for all databases except MariaDb.

b. Override the method in MariDBPlatform to return the SQL snippets
   which inverse the JSON to LONGTEXT aliasing.

Note that the getColumnTypeSQLSnippets() method in MariaDBPlatform was
previously called getInverseJsonToLongtextAliasSQL() and was renamed to
make more sense for MySQL platforms in general.
cgknx added a commit to cgknx/dbal that referenced this pull request Feb 20, 2023
MariaDb aliases columns specified as JSON to LONGTEXT. Since version 10.4.3,
MariaDb adds a CHECK constraint to JSON columns to ensure they contain valid
JSON and sets the column collation to utf8mb4_bin.

Simply setting JSON as the column type in the platform results in introspection
failures as the database reports LONGTEXT (not JSON) with changed collation.

Therefore, inverse the aliasing where the relevant check constraint exists and
ignore collation changes for JSON columns in the relevant database versions.

The modified methods are MariaDBPlatform->getListTableColumnsSQL() and
MySQLSchemaManager->selectTableColumns()

The changes have been implemented in the MariaDBPlatform (guarded by a check
for JSON as column type so they do not affect MariaDB1027Platform) and
MySQLSchemaManager (guarded by a platform test). It ought to be possible to
push most of the platform changes into the MySQLPlatform given the SQL query
changes are guarded by MariaDB specific executable comments and so should be a
no op for MySQL. Doing so would allow getDatabaseNameSQL() to be kept private.
However, because the changes are entirely MariaDb specific they have been made
in the MariaDBPlatform (this is also consistent with the plan to separate
MariaDb and MySQL platforms).

MariaDb1043Platform extends MariaDb1027Platform rather than MariaDBPlatform to
ensure the platform checks in MySQLSchemaManager (based on MariaDb1027Platform)
continue to work as expected.

The alternative approach of adding the CHECK constraint to the SQL for the
column is not robust as MariaDb syntax seems to require the CHECK constraint to
be the final part of the column declaration and it was not obvious how to
guarantee this (specifically, appending the CHECK constraint to the output of
getJsonTypeDeclarationSQL() did not work).

New public methods:
MariaDBPlatform->getInverseJsonToLongtextAliasSQL() generates SQL snippets to
inverse the JSON to LONGTEXT aliasing for JSON-specified columns. It is marked
public because it is used in MySQLSchemaManager.

Overridden methods:
MariaDb1043Platform->getJsonTypeDeclarationSQL(). To return JSON.

MariaDBPlatform->getListTableColumnsSQL(). To reverse the JSON to LONGTEXT
                                           aliasing carried out by MariaDb.

MariaDBPlatform->MariaDBColumnToArray(). To unset collation and charset in
                                         column array (so parent needed to be
                                         marked protected).

New test cases:
1. Types/JsonTest. Test storage and retrieval of json data.
2. Platforms/MariaDb1043PlatformTest. A clone of MariaDb1027Platform test.
3. Schema/MySQLSchemaManagerTest->testColumnIntrospection(). Ensures
   introspected table matches original table. Tests all doctrine types not
   just json. Based on failure highlighted in pull request doctrine#5100.

For previous discussion on this topic, see:
doctrine#5100
doctrine#3202

Further background at:
https://mariadb.com/kb/en/json-data-type/
https://mariadb.com/kb/en/information-schema-check_constraints-table/
https://jira.mariadb.org/browse/MDEV-13916

v2
JsonTest.php:
1. Simplify array shapes of json structures.

2. Deal with json returned as a stream not string.

   It seems Oracle 11 returns json as a stream so convert streams to
   strings.

v3
JsonTest.php:

1. 'json_table' is a reserved word in MySQL (and a function name in
   Oracle SQL). Use json_test_table instead.

2. Some databases e.g. MySQL reorder json arrays so the insert order
   will not necessarily match the select order. Resort the arrays
   before testing for equality.

MySQLSchemaManager.php:

Remove 'instanceof MariaDb1027Platform' test as follows:

a. Add getColumnTypeSQLSnippets() method to AbstractMySQLPlatform which
   returns the default snippets for all databases except MariaDb.

b. Override the method in MariDBPlatform to return the SQL snippets
   which inverse the JSON to LONGTEXT aliasing.

Note that the getColumnTypeSQLSnippets() method in MariaDBPlatform was
previously called getInverseJsonToLongtextAliasSQL() and was renamed to
make more sense for MySQL platforms in general.
cgknx added a commit to cgknx/dbal that referenced this pull request Mar 4, 2023
MariaDb aliases columns specified as JSON to LONGTEXT. Since version 10.4.3,
MariaDb adds a CHECK constraint to JSON columns to ensure they contain valid
JSON and sets the column collation to utf8mb4_bin.

Simply setting JSON as the column type in the platform results in introspection
failures as the database reports LONGTEXT (not JSON) with changed collation.

Therefore, inverse the aliasing where the relevant check constraint exists and
ignore collation changes for JSON columns in the relevant database versions.

The modified methods are MariaDBPlatform->getListTableColumnsSQL() and
MySQLSchemaManager->selectTableColumns()

The changes have been implemented in the MariaDBPlatform (guarded by a check
for JSON as column type so they do not affect MariaDB1027Platform) and
MySQLSchemaManager (guarded by a platform test). It ought to be possible to
push most of the platform changes into the MySQLPlatform given the SQL query
changes are guarded by MariaDB specific executable comments and so should be a
no op for MySQL. Doing so would allow getDatabaseNameSQL() to be kept private.
However, because the changes are entirely MariaDb specific they have been made
in the MariaDBPlatform (this is also consistent with the plan to separate
MariaDb and MySQL platforms).

MariaDb1043Platform extends MariaDb1027Platform rather than MariaDBPlatform to
ensure the platform checks in MySQLSchemaManager (based on MariaDb1027Platform)
continue to work as expected.

The alternative approach of adding the CHECK constraint to the SQL for the
column is not robust as MariaDb syntax seems to require the CHECK constraint to
be the final part of the column declaration and it was not obvious how to
guarantee this (specifically, appending the CHECK constraint to the output of
getJsonTypeDeclarationSQL() did not work).

New public methods:
MariaDBPlatform->getInverseJsonToLongtextAliasSQL() generates SQL snippets to
inverse the JSON to LONGTEXT aliasing for JSON-specified columns. It is marked
public because it is used in MySQLSchemaManager.

Overridden methods:
MariaDb1043Platform->getJsonTypeDeclarationSQL(). To return JSON.

MariaDBPlatform->getListTableColumnsSQL(). To reverse the JSON to LONGTEXT
                                           aliasing carried out by MariaDb.

MariaDBPlatform->MariaDBColumnToArray(). To unset collation and charset in
                                         column array (so parent needed to be
                                         marked protected).

New test cases:
1. Types/JsonTest. Test storage and retrieval of json data.
2. Platforms/MariaDb1043PlatformTest. A clone of MariaDb1027Platform test.
3. Schema/MySQLSchemaManagerTest->testColumnIntrospection(). Ensures
   introspected table matches original table. Tests all doctrine types not
   just json. Based on failure highlighted in pull request doctrine#5100.

For previous discussion on this topic, see:
doctrine#5100
doctrine#3202

Further background at:
https://mariadb.com/kb/en/json-data-type/
https://mariadb.com/kb/en/information-schema-check_constraints-table/
https://jira.mariadb.org/browse/MDEV-13916

v2
JsonTest.php:
1. Simplify array shapes of json structures.

2. Deal with json returned as a stream not string.

   It seems Oracle 11 returns json as a stream so convert streams to
   strings.

v3
JsonTest.php:

1. 'json_table' is a reserved word in MySQL (and a function name in
   Oracle SQL). Use json_test_table instead.

2. Some databases e.g. MySQL reorder json arrays so the insert order
   will not necessarily match the select order. Resort the arrays
   before testing for equality.

MySQLSchemaManager.php:

Remove 'instanceof MariaDb1027Platform' test as follows:

a. Add getColumnTypeSQLSnippets() method to AbstractMySQLPlatform which
   returns the default snippets for all databases except MariaDb.

b. Override the method in MariDBPlatform to return the SQL snippets
   which inverse the JSON to LONGTEXT aliasing.

Note that the getColumnTypeSQLSnippets() method in MariaDBPlatform was
previously called getInverseJsonToLongtextAliasSQL() and was renamed to
make more sense for MySQL platforms in general.
cgknx added a commit to cgknx/dbal that referenced this pull request Mar 4, 2023
MariaDb aliases columns specified as JSON to LONGTEXT. Since version 10.4.3,
MariaDb adds a CHECK constraint to JSON columns to ensure they contain valid
JSON and sets the column collation to utf8mb4_bin.

Simply setting JSON as the column type in the platform results in introspection
failures as the database reports LONGTEXT (not JSON) with changed collation.

Therefore, inverse the aliasing where the relevant check constraint exists and
ignore collation changes for JSON columns in the relevant database versions.

The modified methods are MariaDBPlatform->getListTableColumnsSQL() and
MySQLSchemaManager->selectTableColumns()

The changes have been implemented in the MariaDBPlatform (guarded by a check
for JSON as column type so they do not affect MariaDB1027Platform) and
MySQLSchemaManager (guarded by a platform test). It ought to be possible to
push most of the platform changes into the MySQLPlatform given the SQL query
changes are guarded by MariaDB specific executable comments and so should be a
no op for MySQL. Doing so would allow getDatabaseNameSQL() to be kept private.
However, because the changes are entirely MariaDb specific they have been made
in the MariaDBPlatform (this is also consistent with the plan to separate
MariaDb and MySQL platforms).

MariaDb1043Platform extends MariaDb1027Platform rather than MariaDBPlatform to
ensure the platform checks in MySQLSchemaManager (based on MariaDb1027Platform)
continue to work as expected.

The alternative approach of adding the CHECK constraint to the SQL for the
column is not robust as MariaDb syntax seems to require the CHECK constraint to
be the final part of the column declaration and it was not obvious how to
guarantee this (specifically, appending the CHECK constraint to the output of
getJsonTypeDeclarationSQL() did not work).

New public methods:
MariaDBPlatform->getInverseJsonToLongtextAliasSQL() generates SQL snippets to
inverse the JSON to LONGTEXT aliasing for JSON-specified columns. It is marked
public because it is used in MySQLSchemaManager.

Overridden methods:
MariaDb1043Platform->getJsonTypeDeclarationSQL(). To return JSON.

MariaDBPlatform->getListTableColumnsSQL(). To reverse the JSON to LONGTEXT
                                           aliasing carried out by MariaDb.

MariaDBPlatform->MariaDBColumnToArray(). To unset collation and charset in
                                         column array (so parent needed to be
                                         marked protected).

New test cases:
1. Types/JsonTest. Test storage and retrieval of json data.
2. Platforms/MariaDb1043PlatformTest. A clone of MariaDb1027Platform test.
3. Schema/MySQLSchemaManagerTest->testColumnIntrospection(). Ensures
   introspected table matches original table. Tests all doctrine types not
   just json. Based on failure highlighted in pull request doctrine#5100.

For previous discussion on this topic, see:
doctrine#5100
doctrine#3202

Further background at:
https://mariadb.com/kb/en/json-data-type/
https://mariadb.com/kb/en/information-schema-check_constraints-table/
https://jira.mariadb.org/browse/MDEV-13916

v2
JsonTest.php:
1. Simplify array shapes of json structures.

2. Deal with json returned as a stream not string.

   It seems Oracle 11 returns json as a stream so convert streams to
   strings.

v3
JsonTest.php:

1. 'json_table' is a reserved word in MySQL (and a function name in
   Oracle SQL). Use json_test_table instead.

2. Some databases e.g. MySQL reorder json arrays so the insert order
   will not necessarily match the select order. Resort the arrays
   before testing for equality.

MySQLSchemaManager.php:

Remove 'instanceof MariaDb1027Platform' test as follows:

a. Add getColumnTypeSQLSnippets() method to AbstractMySQLPlatform which
   returns the default snippets for all databases except MariaDb.

b. Override the method in MariDBPlatform to return the SQL snippets
   which inverse the JSON to LONGTEXT aliasing.

Note that the getColumnTypeSQLSnippets() method in MariaDBPlatform was
previously called getInverseJsonToLongtextAliasSQL() and was renamed to
make more sense for MySQL platforms in general.
cgknx added a commit to cgknx/dbal that referenced this pull request Mar 5, 2023
MariaDb aliases columns specified as JSON to LONGTEXT. Since version 10.4.3,
MariaDb adds a CHECK constraint to JSON columns to ensure they contain valid
JSON and sets the column collation to utf8mb4_bin.

Simply setting JSON as the column type in the platform results in introspection
failures as the database reports LONGTEXT (not JSON) with changed collation.

Therefore, inverse the aliasing where the relevant check constraint exists and
ignore collation changes for JSON columns in the relevant database versions.

The main functional changes are in:
 - MariaDB1043Platform->getListTableColumnsSQL() and
 - MySQLSchemaManager->selectTableColumns()

MariaDb1043Platform extends MariaDb1027Platform rather than MariaDBPlatform
to ensure the platform checks in MySQLSchemaManager (based on
MariaDb1027Platform) continue to work as expected.

The alternative approach of adding the CHECK constraint to the SQL for the
column is not robust as MariaDb syntax seems to require the CHECK constraint to
be the final part of the column declaration and it was not obvious how to
guarantee this (specifically, appending the CHECK constraint to the output of
getJsonTypeDeclarationSQL() did not work).

New public methods:
MariaDBPlatform->getColumnTypeSQLSnippets() generates SQL snippets to
reverse the JSON to LONGTEXT aliasing for JSON-specified columns. It is also
used in MySQLSchemaManager. It still checks that json columns are JSON data
type so that switching the getJsonTypeDeclarationSQL to LONGTEXT is all that
is necessary to revert to old behaviour which is helpful for testing.

Overridden methods in MariaDb1043Platform:
MariaDb1043Platform->getJsonTypeDeclarationSQL(). To return JSON.

MariaDBPlatform->getListTableColumnsSQL(). To reverse the JSON to LONGTEXT
                                           aliasing carried out by MariaDb.

MariaDBPlatform->getColumnDeclarationSQL(). To unset collation and charset
                                            (used by the comparator).

New test cases:
1. Types/JsonTest. Test storage and retrieval of json data.
2. Platforms/MariaDb1043PlatformTest. A clone of MariaDb1027Platform test.
3. Schema/MySQLSchemaManagerTest->testColumnIntrospection(). Ensures
   introspected table matches original table. Tests all doctrine types not
   just json. Based on failure highlighted in pull request doctrine#5100.

For previous discussion on this topic, see:
doctrine#5100
doctrine#3202

Further background at:
https://mariadb.com/kb/en/json-data-type/
https://mariadb.com/kb/en/information-schema-check_constraints-table/
https://jira.mariadb.org/browse/MDEV-13916

Notes regarding JsonTest.php:
1. Some platforms return json as a stream so convert streams to strings.

2. 'json_table' is a reserved word in MySQL (and a function name in
   Oracle SQL). Use json_test_table instead.

3. Some platforms reorder json arrays so the insert order will not
   necessarily match the select order. Resort the arrays before testing
   for equality.
cgknx added a commit to cgknx/dbal that referenced this pull request Mar 6, 2023
MariaDb aliases columns specified as JSON to LONGTEXT. Since version 10.4.3,
MariaDb adds a CHECK constraint to JSON columns to ensure they contain valid
JSON and sets the column collation to utf8mb4_bin.

Simply setting JSON as the column type in the platform results in introspection
failures as the database reports LONGTEXT (not JSON) with changed collation.

Therefore, inverse the aliasing where the relevant check constraint exists and
ignore collation changes for JSON columns in the relevant database versions.

The main functional changes are in:
 - MariaDB1043Platform->getListTableColumnsSQL() and
 - MySQLSchemaManager->selectTableColumns()

MariaDb1043Platform extends MariaDb1027Platform rather than MariaDBPlatform
to ensure the platform checks in MySQLSchemaManager (based on
MariaDb1027Platform) continue to work as expected.

The alternative approach of adding the CHECK constraint to the SQL for the
column is not robust as MariaDb syntax seems to require the CHECK constraint to
be the final part of the column declaration and it was not obvious how to
guarantee this (specifically, appending the CHECK constraint to the output of
getJsonTypeDeclarationSQL() did not work).

New public methods:
MariaDBPlatform->getColumnTypeSQLSnippets() generates SQL snippets to
reverse the JSON to LONGTEXT aliasing for JSON-specified columns. It is also
used in MySQLSchemaManager. It still checks that json columns are JSON data
type so that switching the getJsonTypeDeclarationSQL to LONGTEXT is all that
is necessary to revert to old behaviour which is helpful for testing.

Overridden methods in MariaDb1043Platform:
MariaDb1043Platform->getJsonTypeDeclarationSQL(). To return JSON.

MariaDBPlatform->getListTableColumnsSQL(). To reverse the JSON to LONGTEXT
                                           aliasing carried out by MariaDb.

MariaDBPlatform->getColumnDeclarationSQL(). To unset collation and charset
                                            (used by the comparator).

New test cases:
1. Types/JsonTest. Test storage and retrieval of json data.
2. Platforms/MariaDb1043PlatformTest. A clone of MariaDb1027Platform test.
3. Schema/MySQLSchemaManagerTest->testColumnIntrospection(). Ensures
   introspected table matches original table. Tests all doctrine types not
   just json. Based on failure highlighted in pull request doctrine#5100.

For previous discussion on this topic, see:
doctrine#5100
doctrine#3202

Further background at:
https://mariadb.com/kb/en/json-data-type/
https://mariadb.com/kb/en/information-schema-check_constraints-table/
https://jira.mariadb.org/browse/MDEV-13916

Notes regarding JsonTest.php:
1. Some platforms return json as a stream so convert streams to strings.

2. 'json_table' is a reserved word in MySQL (and a function name in
   Oracle SQL). Use json_test_table instead.

3. Some platforms reorder json arrays so the insert order will not
   necessarily match the select order. Resort the arrays before testing
   for equality.
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 14, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MariaDB 10.3 Native JSON Support
3 participants