Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[9.x] Address confusion between PostgreSQL concepts "schema" and "search_path" #35463

Merged
merged 5 commits into from
Dec 4, 2020

Conversation

cbj4074
Copy link
Contributor

@cbj4074 cbj4074 commented Dec 2, 2020

This PR is intended to fulfill the objectives described in laravel/ideas#918 , as they apply to this repository (related changes may be needed in laravel/laravel).

In short, this PR replaces the PostgreSQL term "schema" with "search_path". While related and conceptually similar, there is a crucial distinction to be made between the two, which I describe exhaustively in the above-cited post.

Beyond the terminology corrections, we have also sought to fix a significant limitation whereby a search_path that contains more than one schema cannot be specified as a string, e.g., via the effective environment (i.e., an .env file or similar) because the implementation does not parse and quote the specified value correctly.

Finally, we have added several new tests to ensure complete consistency in how the search_path value is passed on to PostgreSQL, regardless of whether the value is specified as:

  • a string that contains only one schema (with or without double-quotes around the schema)
  • a comma-separated string that contains more than one schema (with or without double-quotes around each schema)
  • an array that contains any number of schemas (with or without double-quotes around each schema)
  • any of the above variations, but with variable notation, e.g., the schema name begins with a $ sign

At this juncture, I am unaware of any other valid search_path syntax that could possibly need to be supported.

Also, per laravel/ideas#918 , @mfn had expressed an interest in completely expunging the term "schema" from Laravel's lexicon in relation to this issue, with which I more or less agree. However, I left the reference in the configureSearchPath() method so as to provide backwards compatibility, which makes sense only if this can be merged into the 8.x branch. If the changes are already breaking, then I agree that we should completely expunge "schema" and target the next major release, at which time we should also update the default configuration array in https://github.com/laravel/laravel/blob/8.x/config/database.php (by replacing schema with search_path), and update any affected documentation, if any actually exists (I have yet to see any description of PostgreSQL's schema/search_path directive, or what format it should take when specified in Laravel's database configuration).

(UPDATE: This is breaking, so I expunged the incorrect terminology everywhere, with no concern for preserving BC with regard to the database connection configuration key name.)

I'm looking for some guidance from @taylorotwell or whomever else may be qualified in this regard. (UPDATE: driesvints already addressed this.)

Finally, @mfn , you had noted one other method that may be affected, in #22414 (comment) , and I'll take a look at that, as well as address any of your other lingering concerns from the previous PR, before taking this out of Draft mode.

Thank you!

@driesvints
Copy link
Member

Haven't had time yet to read the comment above but changing the method names on a stable branch is a breaking change. Please either revert those changes or send this to master.

@driesvints
Copy link
Member

Also: Taylor doesn't reviews draft PRs. Please mark this as ready once you want it reviewed.

@cbj4074
Copy link
Contributor Author

cbj4074 commented Dec 3, 2020

@mfn Let's circle-back to the method you cited in the previous (and since closed) PR:

* Parse the table name and extract the schema and table.
*
* @param string $table
* @return array
*/
protected function parseSchemaAndTable($table)
{
$table = explode('.', $table);
if (is_array($schema = $this->connection->getConfig('schema'))) {
if (in_array($table[0], $schema)) {
return [array_shift($table), implode('.', $table)];
}
$schema = head($schema);
}
return [$schema ?: 'public', implode('.', $table)];
}

Firstly, that entire class has no test coverage, so, it'll be difficult to determine what might break if anything is modified (unless I/we write new tests, of course).

EDIT: Actually, upon further review, I deleted much of this comment, because this method isn't fundamentally related to the problem that this PR seeks to address.

More specifically, this method is only called inside its own class, and its sole purpose is to parse a table name inside hasTable($table) or getColumnListing($table) so that $schema and $table can be passed as bindings when querying PostgreSQL's information_schema. The point here is that this is basically completely unrelated to the search_path as it's set on the underlying database connection. This method just happens to fetch its value when determining which schema to use in the WHERE clause.

That said, I'm fairly convinced that the logic in that method is not quite right, but this PR certainly won't make it any worse, and it seems best not to modify it any further without adding appropriate tests.

@driesvints
Copy link
Member

driesvints commented Dec 3, 2020

Again I haven't read your entire comment but

Firstly, that entire class has no test coverage, so, it'll be difficult to determine what might break if anything is modified (unless I/we write new tests, of course).

The breaking change is that if an extending class in userland uses these methods it will break because these methods won't be present anymore.

@cbj4074
Copy link
Contributor Author

cbj4074 commented Dec 3, 2020

Thanks @driesvints . I'm with you; this is definitely a breaking change, even if only for the first reason you mentioned (renaming a method).

I'll direct this to the master branch when I push-up the changes I propose to address that parseSchemaAndTable() method.

@GrahamCampbell GrahamCampbell changed the title Address confusion between PostgreSQL concepts "schema" and "search_path" [8.x] Address confusion between PostgreSQL concepts "schema" and "search_path" Dec 4, 2020
poppabear8883 and others added 4 commits December 3, 2020 22:06
More specifically, fix issue whereby individual schema paths in an array were quoted twice.

Also, update arguments in tests to verify schema name parsing with vs. without quotes, for both string and array notations.
@cbj4074 cbj4074 changed the base branch from 8.x to master December 4, 2020 03:10
@cbj4074
Copy link
Contributor Author

cbj4074 commented Dec 4, 2020

@driesvints Okay, I changed the base to master, given the breaking change you noted.

As noted in the OP, this will require that

'schema' => 'public',

on https://github.com/laravel/laravel/blob/0717bb0291a51ab63dd220ce4db8b7fa82e23787/config/database.php#L77 is changed to the following in the release:

'search_path' => 'public',

@cbj4074 cbj4074 marked this pull request as ready for review December 4, 2020 03:51
@driesvints driesvints changed the title [8.x] Address confusion between PostgreSQL concepts "schema" and "search_path" [9.x] Address confusion between PostgreSQL concepts "schema" and "search_path" Dec 4, 2020
@taylorotwell taylorotwell merged commit bd17679 into laravel:master Dec 4, 2020
@taylorotwell
Copy link
Member

@cbj4074 we recently had a commit on 8.x that referenced the Postgres schema (7be50a5)

We need your help to make sure this is working correctly for master and generally tell us how to handle these things in the future.

@cbj4074
Copy link
Contributor Author

cbj4074 commented Jan 22, 2021

@taylorotwell Sure, no problem at all. I'll take a look at it sometime in the next 24 hours. 👍

@cbj4074
Copy link
Contributor Author

cbj4074 commented Jan 23, 2021

@taylorotwell I left some initial thoughts in a comment on the commit you referenced, but until I can test it in a more realistic environment, with multiple schemas, I can't be 100% sure as to its viability. I'll be able to do that on Monday, at the latest.

I will also rough-out the conceptual aspects of a PR over the next couple of days that explains what would be required to make the dump and restore tools work correctly in a multi-schema and multi-database environment, which requires and builds upon the previous fixes that I added in 9.x.

@cbj4074
Copy link
Contributor Author

cbj4074 commented Jan 25, 2021

Okay, regarding the schema:dump issue, I think I have a simple, elegant solution for this, which will eliminate the need for schema/search-path logic altogether and apply equally to 8.x and 9.x I'll post a PR in a couple hours.

@cbj4074
Copy link
Contributor Author

cbj4074 commented Jan 26, 2021

@taylorotwell #36046 , which is made against 8.x, should fix the dumping behavior in all scenarios (the reloading is another matter; see note at bottom of PR). Given that we can eliminate the search_path handling in this context altogether, the dump() method can be made exactly the same in 9.x.

With regard to how we handle this and similar scenarios in the future, something to keep in mind going forward is that the search_path can take any of the following forms in config/database.php:

'search_path' => 'public',
'search_path' => 'public,laravel',
'search_path' => ['public', '"laravel"', "'foobar'", '$bat'],
'search_path' => '\'public\', "laravel", "\'foobar\'", \'$bat\'',
'search_path' => '"$user", public',

For this reason, we should only ever use the parseSearchPath() method to retrieve the schema list as an array, and then do any imploding or similar from there. There are too many wacky scenarios for which the aforementioned method already accounts to risk recreating that logic anywhere else, ad-hoc.

That said, there are essentially duplicate copies of that method, which, ideally, would be consolidated into a single method:

protected function parseSearchPath($searchPath)

protected function parseSearchPath($searchPath)

Beyond that general guidance, the search_path is a tricky beast, and each situation demands careful scrutiny if we are to get it right. A perfect example is this schema:dump behavior; the best approach here was not to use it at all. 😆

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants