-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use lambdas for transaction scoping; Don't reload objects after commit #552
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The way transactions were previously handled made it impossible to compose multiple transactional operations into larger tasks. `PersistenceManager#currentTransaction()#begin()` fails when the transaction is already active. This in turn opens the door to data inconsistencies, when parts of a task are committed but later operations fail, with no way to roll back to the original state before the task was started. It also encouraged commits in high frequencies, which can take a toll on the database's write-ahead-log for high volume processing use-cases. There was no handling of rollbacks, such that a failed commit could have "poisoned" an entire `PersistenceManager`, as no further persistence operations can be performed until after a rollback was executed. This change introduces the `Transaction` class, which coordinates transaction begin, commit, and rollback. It is also capable of handling nested transactions, by joining an existing active one if required. This aspect can be controlled via propagation settings. It also removes the need to reload objects from the database after each commit. This behavior has historically added quite a big overhead, and apparently was needed because DataNucleus transitions objects into "hollow" state (all fields except the primary key are unloaded). The unloading of fields is now disabled globally by setting the `DataNucleus.RetainValues` property to `true`. The `ScopedCustomization` class was added to allow for scoped changes to a `PersistenceManager`, without impacting later operations on it. Usually things like adding fetch groups via `pm.getFetchPlan().addGroup("foo")` are intended to only affect one or a handful of queries, but in reality it affects all queries for the lifetime of the `PersistenceManager`. This has historically led to over-fetching. Now, it's easier to scope such things more tightly. Further, this PR makes an effort to ensure that all `Query` objects are closed once consumed. Not closing query objects unnecessary leaks resources in cases where the owning `PersistenceManager` is **not** short-lived. References: * https://www.datanucleus.org/products/accessplatform_6_0/jdo/persistence.html#lifecycle * https://www.datanucleus.org/products/accessplatform_6_0/jdo/query.html#_closing_a_query * https://www.datanucleus.org/products/accessplatform_6_0/jdo/persistence.html#transaction_local * https://github.com/search?q=repo%3AGoogleCloudPlatform%2Fdatanucleus-appengine%20retainvalues&type=code Signed-off-by: nscuro <nscuro@protonmail.com>
nscuro
added a commit
to nscuro/dependency-track
that referenced
this pull request
Apr 20, 2024
The biggest impact is that some tests now require eviction of the L1 cache to see changes made in the functionality they're testing. stevespringett/Alpine#552 Signed-off-by: nscuro <nscuro@protonmail.com>
nscuro
added a commit
to nscuro/Alpine
that referenced
this pull request
May 7, 2024
The `getCount` methods were refactored to not modify the original `Query` object in stevespringett#552. Neither JDO nor DataNucleus offer a mechanism to copy a `Query` object 1:1, hence a new object was created instead. This however broke `getCount` for queries that make use of subqueries (and potentially other advanced features). This change reverts it back to temporarily modify the query object again. Signed-off-by: nscuro <nscuro@protonmail.com>
nscuro
added a commit
to nscuro/dependency-track
that referenced
this pull request
May 10, 2024
This is to properly support nested transactions as introduced in stevespringett/Alpine#552. Signed-off-by: nscuro <nscuro@protonmail.com>
This was referenced May 15, 2024
MM-msr
pushed a commit
to MM-msr/dependency-track
that referenced
this pull request
Jun 18, 2024
This is to properly support nested transactions as introduced in stevespringett/Alpine#552. Signed-off-by: nscuro <nscuro@protonmail.com>
nscuro
added a commit
to DependencyTrack/hyades-apiserver
that referenced
this pull request
Jul 31, 2024
This is to properly support nested transactions as introduced in stevespringett/Alpine#552 Ports DependencyTrack/dependency-track#3711 from Dependency-Track v4.12.0-SNAPSHOT. Signed-off-by: nscuro <nscuro@protonmail.com>
Merged
2 tasks
nscuro
added a commit
to DependencyTrack/hyades-apiserver
that referenced
this pull request
Jul 31, 2024
This is to properly support nested transactions as introduced in stevespringett/Alpine#552 Ports DependencyTrack/dependency-track#3711 from Dependency-Track v4.12.0-SNAPSHOT. Signed-off-by: nscuro <nscuro@protonmail.com>
netomi
pushed a commit
to netomi/dependency-track
that referenced
this pull request
Aug 8, 2024
This is to properly support nested transactions as introduced in stevespringett/Alpine#552. Signed-off-by: nscuro <nscuro@protonmail.com>
nscuro
added a commit
to nscuro/dependency-track
that referenced
this pull request
Dec 8, 2024
Fields are no longer unloaded when a transaction commits (`DataNucleus.RetainValues` is enabled globally), as of stevespringett/Alpine#552. Signed-off-by: nscuro <nscuro@protonmail.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The way transactions were previously handled made it impossible to compose multiple transactional operations into larger tasks.
PersistenceManager#currentTransaction()#begin()
fails when the transaction is already active. This in turn opens the door to data inconsistencies, when parts of a task are committed but later operations fail, with no way to roll back to the original state before the task was started.It also encouraged commits in high frequencies, which can take a toll on the database's write-ahead-log for high volume processing use-cases.
There was no handling of rollbacks, such that a failed commit could have "poisoned" an entire
PersistenceManager
, as no further persistence operations can be performed until after a rollback was executed.This change introduces the
Transaction
class, which coordinates transaction begin, commit, and rollback. It is also capable of handling nested transactions, by joining an existing active one if required. This aspect can be controlled via propagation settings.It also removes the need to reload objects from the database after each commit. This behavior has historically added quite a big overhead, and apparently was needed because DataNucleus transitions objects into "hollow" state (all fields except the primary key are unloaded). The unloading of fields is now disabled globally by setting the
DataNucleus.RetainValues
property totrue
.The
ScopedCustomization
class was added to allow for scoped changes to aPersistenceManager
, without impacting later operations on it. Usually things like adding fetch groups viapm.getFetchPlan().addGroup("foo")
are intended to only affect one or a handful of queries, but in reality it affects all queries for the lifetime of thePersistenceManager
. This has historically led to over-fetching. Now, it's easier to scope such things more tightly.Further, this PR makes an effort to ensure that all
Query
objects are closed once consumed. Not closing query objects unnecessary leaks resources in cases where the owningPersistenceManager
is not short-lived.References: