Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DPLT-1049 Provision separate DB per user #132

Merged
merged 20 commits into from
Jul 21, 2023
Merged

Conversation

morgsmccauley
Copy link
Collaborator

@morgsmccauley morgsmccauley commented Jul 17, 2023

This PR updates the Lambda Runner provisioning step to create a separate Postgres database per user. This provides greater security, preventing users from manipulating other users data within the provided SQL.

I've added the high level steps, as well as some notes on the changes to them, below:

1. Create database/user in Postgres

It's possible to run SQL against Postgres via Hasura, but the provided statements are run within a transaction block. As CREATE DATABASE can not be run within a transaction, we need a direct connection to Postgres to create new databases.

Crypto random passwords are generated for each PG user, these will be stored by Hasura in plain text within its metadata table. Exposing the password/connection string via an environment variable isn't really feasible as this would require: creating a secret, updating the Cloud Run revision, and restarting the instance. We can look at more secure options for this in future.

2. Add database to Hasura

Database connections are added as 'data sources' to Hasura, this makes up the majority of changes to Hasura Client - making source configurable. As mentioned above, these connections are stored in the Hasura metadata tables and are therefore persisted across restarts.

A database will be created per account, we must therefore check if the datasource exists before provisioning to avoid attempting to recreate it when the account creates their second indexer.

3. Run user provided SQL

There are two potential conflicts when provisioning an endpoint:

  1. Postgres - a user could theoretically create two different functions with the same tables, to avoid conflicting table names a new schema is created per function. Since this is scoped within the accounts DB, we can name this after just the function, rather than a concatenation of both account/function like we have currently.
  2. GraphQL - Hasura exposes all tables from all databases, with the above, we could end up with conflicts when two different accounts create the same named function and table(s). To avoid this conflict each database has it's own namespace, which scopes all top-level fields to that namespace.

The end result is a query like to the following:

query {
  account_name {
    function_name_table_name {
      column
    }
  }
}

Instead of namespaces, we could create a DB schema named after both account ID/function, which would result in the top-level fields we have currently, i.e. account_name_function_name_table_name. But I believe the namespace reduces the noise and makes GraphQL operations more readable.

4. Track tables, foreign key relationships, and add permissions

These steps are mostly unchanged, but have been slightly refactored to take in to account execution against different databases/sources.

Other Considerations

By default, all users are able to connect to the template databases, which are used as 'templates' for new databases. To prevent users from modifying these templates we should remove PUBLIC access from them:

REVOKE CONNECT ON DATABASE template0 FROM PUBLIC
REVOKE CONNECT ON DATABASE template1 FROM PUBLIC

@morgsmccauley morgsmccauley force-pushed the DPLT-1049-db-per-user branch 3 times, most recently from d5403f0 to 62ff02b Compare July 17, 2023 23:19
@morgsmccauley morgsmccauley marked this pull request as ready for review July 18, 2023 09:08
@morgsmccauley morgsmccauley requested a review from a team as a code owner July 18, 2023 09:08
@morgsmccauley

This comment was marked as resolved.

@morgsmccauley morgsmccauley removed the request for review from a team July 19, 2023 00:11
@morgsmccauley morgsmccauley requested a review from a team July 19, 2023 00:11
@morgsmccauley morgsmccauley merged commit 8e18566 into main Jul 21, 2023
1 check passed
@morgsmccauley morgsmccauley deleted the DPLT-1049-db-per-user branch July 21, 2023 01:19
morgsmccauley added a commit that referenced this pull request Jul 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants