Add serializable isolation db connection for postgres #6591
Summary
SQLite and Postgres have different default transaction isolation levels. SQLite defaults to the strictest option, "serializable", whereas Postgres defaults to "read committed". Under "read committed", if a transaction reads some data, a parallel transaction then commits a write, and the first transaction reads more data, the two sets of data read by the first transaction may be inconsistent with each other, i.e. the transaction does not see a reliable "snapshot" of the database at a single point in time (cf. "non-repeatable reads" and "phantom reads").
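To make the "snapshot" behaviour concrete, here is a minimal sketch using Python's standard `sqlite3` module. It is an illustration only, not code from Morango: the `store` table and the `snapshot_demo` function are hypothetical, and WAL journal mode is an assumption chosen so that the writer is not blocked by the open read transaction.

```python
import os
import sqlite3
import tempfile

COUNT_SQL = "SELECT COUNT(*) FROM store"


def snapshot_demo():
    """Return the row counts a reader sees before, during, and after a parallel write."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sqlite3")
        # isolation_level=None puts the sqlite3 module in autocommit mode, so
        # transactions start only where BEGIN is issued explicitly below.
        reader = sqlite3.connect(path, isolation_level=None)
        reader.execute("PRAGMA journal_mode=WAL").fetchall()
        reader.execute("CREATE TABLE store (id TEXT PRIMARY KEY)")
        reader.execute("INSERT INTO store VALUES ('a')")
        writer = sqlite3.connect(path, isolation_level=None)
        try:
            reader.execute("BEGIN")
            first = reader.execute(COUNT_SQL).fetchall()[0][0]
            # A second connection commits a new row while the reader's
            # transaction is still open.
            writer.execute("INSERT INTO store VALUES ('b')")
            second = reader.execute(COUNT_SQL).fetchall()[0][0]
            reader.execute("COMMIT")
            # A fresh read after the transaction ends sees the new row.
            after = reader.execute(COUNT_SQL).fetchall()[0][0]
        finally:
            writer.close()
            reader.close()
    return first, second, after
```

Both reads inside the transaction return the same count, and the new row only becomes visible once the transaction has ended. Under Postgres's default "read committed" level, the second read in the equivalent sequence would already include the newly committed row.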
At the time we built Morango, Kolibri only supported SQLite, which is what we used for our more intensive manual concurrency testing. When we added support for Postgres, we assumed that it would be at least as robust on this front as SQLite. This type of edge-case mass concurrency testing is very difficult to capture in automated tests, so we don't currently have coverage.
On KDP, this led to some issues where data wasn't being deserialized from the underlying datastore into the Kolibri app models. No data was lost, but it wasn't showing up in the app layer. This was because the deserialization process loops over all the models and then, at the very end, clears their "dirty" bits. If new records are written into the datastore by an incoming sync in parallel, they may accidentally have their "dirty" bits cleared as well, and hence never be deserialized in future batches.
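The race described above can be simulated in plain Python. This is a sketch under stated assumptions, not Morango's actual deserialization code: `StoreRecord`, `deserialize`, `incoming_sync`, and `stranded` are hypothetical names, and the `snapshot` flag stands in for whether the final "clear dirty bits" step sees rows committed after the first read.

```python
from dataclasses import dataclass


@dataclass
class StoreRecord:
    id: str
    dirty: bool = True
    deserialized: bool = False


def deserialize(store, concurrent_write=None, snapshot=False):
    """Deserialize dirty records, then clear dirty bits at the very end."""
    batch = [r for r in store if r.dirty]  # first read
    for record in batch:
        record.deserialized = True
    if concurrent_write is not None:
        concurrent_write(store)  # a parallel sync commits mid-run
    # Without a stable snapshot, the final clear also touches rows that
    # were committed after the first read.
    to_clear = batch if snapshot else list(store)
    for record in to_clear:
        record.dirty = False


def incoming_sync(store):
    """Hypothetical parallel writer adding one new dirty record."""
    store.append(StoreRecord("late"))


def stranded(store):
    """Ids of records that are no longer dirty but were never deserialized."""
    return [r.id for r in store if not r.dirty and not r.deserialized]
```

With `snapshot=False`, the late record ends up with its dirty bit cleared without ever being deserialized, which mirrors the symptom seen on KDP. With `snapshot=True`, it stays dirty and is picked up by the next run.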
This PR switches the Postgres connection that Morango uses for its large snapshot transactions to the "SERIALIZABLE" isolation level, which matches SQLite's default.
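As a rough illustration of what running a transaction at this isolation level involves, the sketch below wraps a generic DB-API connection. It is not the implementation in this PR: `serializable_transaction` is a hypothetical helper, and it assumes a connection that is not in autocommit mode (the psycopg2 default), so that the `SET TRANSACTION` statement is the first statement of a new transaction.

```python
from contextlib import contextmanager


@contextmanager
def serializable_transaction(connection):
    """Run the enclosed block in one SERIALIZABLE transaction (hypothetical helper)."""
    cursor = connection.cursor()
    try:
        # In Postgres this must be issued before any query in the transaction.
        cursor.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
        yield cursor
        connection.commit()
    except Exception:
        # Roll back on any error, including serialization failures.
        connection.rollback()
        raise
    finally:
        cursor.close()
```

One consequence of this isolation level is that Postgres may abort a transaction with a serialization failure (SQLSTATE 40001) instead of letting it observe an inconsistent state, so callers generally need to be prepared to retry the whole transaction.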
Reviewer guidance
…
References
…
Contributor Checklist
PR process:
Testing:
Reviewer Checklist