Fix handling of PendingRollbackError when processing large datasets #307

Merged
merged 1 commit into from
May 15, 2024

@kjsanger kjsanger commented May 14, 2024

This exception is raised when the database disconnects mid-operation and has to be handled outside the ORM.

This change adjusts the DB engine init args to make bad/stale connections less likely. It also changes the main work loop to operate on batches of paths, using a fresh DB session for each batch, and adds handling for the remainder of a batch should an exception occur partway through it. SQLAlchemy errors are handled specially: they are propagated up to where they can trigger a rollback and allow work to continue with a new DB session.
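
For readers who want a picture of the batch loop described above, here is a minimal, stdlib-only sketch. It is not the code from this PR: `FakeDBError` stands in for a SQLAlchemy error, `FakeSession` stands in for an ORM session, and `process_paths` and `handle_path` are hypothetical names. It also assumes one plausible reading of "handling for the remainder of a batch": ordinary errors are recorded per path and the rest of the batch carries on, while DB errors abandon the batch, roll back, and let the next batch start with a new session.

```python
from itertools import islice


class FakeDBError(Exception):
    """Hypothetical stand-in for a SQLAlchemy error such as PendingRollbackError."""


class FakeSession:
    """Hypothetical stand-in for an ORM session; it only records what was done."""

    def __init__(self):
        self.added = []
        self.committed = False
        self.rolled_back = False

    def add(self, item):
        self.added.append(item)

    def commit(self):
        self.committed = True

    def rollback(self):
        self.rolled_back = True

    def close(self):
        pass


def process_paths(paths, session_factory, handle_path, batch_size=3):
    """Process paths in batches, using a fresh session for each batch."""
    succeeded, failed = [], []
    it = iter(paths)
    # On Python 3.12+, itertools.batched(paths, batch_size) yields the same tuples.
    while batch := tuple(islice(it, batch_size)):
        session = session_factory()
        ok, bad = [], []
        try:
            for path in batch:
                try:
                    handle_path(session, path)
                    ok.append(path)
                except FakeDBError:
                    raise  # propagate DB errors up to the batch level
                except Exception:
                    bad.append(path)  # ordinary error: continue with the remainder
            session.commit()
            succeeded.extend(ok)
            failed.extend(bad)
        except FakeDBError:
            # The session is unusable: roll back and count the whole batch as failed.
            session.rollback()
            failed.extend(batch)
        finally:
            session.close()
    return succeeded, failed


def handle_path(session, path):
    """Hypothetical per-path work that fails in two different ways."""
    if path == "p4":
        raise FakeDBError("connection lost")
    if path == "p1":
        raise ValueError("bad path")
    session.add(path)


sessions = []


def session_factory():
    session = FakeSession()
    sessions.append(session)
    return session


paths = [f"p{i}" for i in range(7)]
succeeded, failed = process_paths(paths, session_factory, handle_path, batch_size=3)
```

In this sketch a DB error costs at most one batch of work, and the batch that follows is unaffected because it never sees the broken session.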

This updates the Python version requirement to 3.12 (to get itertools.batched()).
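
As a quick illustration of what `itertools.batched()` provides, the sketch below splits a list of made-up paths into tuples of at most two items. The `batched_fallback` helper is an assumption added here so the example also runs on older interpreters; it is modelled on the "roughly equivalent" recipe in the standard library documentation and is not part of this PR.

```python
import sys
from itertools import islice


def batched_fallback(iterable, n):
    """Yield tuples of up to n items, roughly as itertools.batched() does."""
    if n < 1:
        raise ValueError("n must be at least one")
    it = iter(iterable)
    while batch := tuple(islice(it, n)):
        yield batch


if sys.version_info >= (3, 12):
    from itertools import batched
else:
    batched = batched_fallback  # illustration only, for Python < 3.12

paths = [f"/example/path/{i}" for i in range(5)]  # hypothetical paths
batches = list(batched(paths, 2))
```

The final batch is simply shorter when the input does not divide evenly, so no path is dropped.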

@kjsanger kjsanger added the bug Something isn't working label May 14, 2024
@kjsanger kjsanger changed the title Fix handling of PendingRollbackError whne processing large datasets Fix handling of PendingRollbackError when processing large datasets May 14, 2024
@kjsanger kjsanger force-pushed the bug/mlwh-session-management branch 2 times, most recently from c102831 to b39a81d Compare May 14, 2024 14:41
@kjsanger kjsanger force-pushed the bug/mlwh-session-management branch from b39a81d to 044aeed Compare May 14, 2024 15:39
@mksanger mksanger merged commit 930503d into wtsi-npg:devel May 15, 2024
5 checks passed
@kjsanger kjsanger deleted the bug/mlwh-session-management branch May 16, 2024 09:15