Prevent checkpoints during snapshots #477

Merged

hifi
Collaborator

@hifi hifi commented May 7, 2023

Use a mutex that is held for the duration of a snapshot, and only try to lock it (without blocking) during checkpoints.

If the database is under high write pressure and a snapshot takes a long time, automatic checkpointing can escalate from PASSIVE to RESTART.
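As a rough illustration of how such an escalation might happen, here is a minimal sketch that assumes the checkpointer picks its mode from the current WAL size. The function name and both thresholds are hypothetical and chosen only for illustration; they are not taken from this change.

```go
package main

import "fmt"

// Hypothetical thresholds in WAL pages (assumptions for this sketch only).
const (
	minCheckpointPages = 1000  // at or above this, try a non-blocking PASSIVE checkpoint
	maxCheckpointPages = 10000 // at or above this, escalate to a blocking RESTART checkpoint
)

// checkpointMode returns the checkpoint mode this sketch would choose for a
// WAL of the given size, or "" when no checkpoint is needed yet.
func checkpointMode(walPages int) string {
	switch {
	case walPages >= maxCheckpointPages:
		return "RESTART"
	case walPages >= minCheckpointPages:
		return "PASSIVE"
	default:
		return ""
	}
}

func main() {
	// While a long snapshot pins the WAL, PASSIVE checkpoints cannot shrink it,
	// so sustained writes keep growing the page count toward the upper threshold.
	for _, n := range []int{500, 2000, 20000} {
		fmt.Printf("%d pages -> %q\n", n, checkpointMode(n))
	}
}
```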

When a RESTART checkpoint is issued, SQLite blocks new read transactions and waits for the existing ones to finish. During a snapshot, however, Litestream keeps a read transaction open until the snapshot completes. This effectively deadlocks the checkpointer, and because the RESTART checkpoint blocks writers while it waits, writes stay blocked for as long as the snapshot is being written out.

This failure condition does not break anything permanently, but it leaves the application unable to write until the snapshot finishes.
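The locking scheme described above can be sketched roughly as follows. This is a minimal illustration, not the code in this change: the `DB` type, the `chkMu` field, and the method signatures are assumptions made for the example, and it relies on `sync.Mutex.TryLock` (available since Go 1.18).

```go
package main

import (
	"fmt"
	"sync"
)

// DB is a hypothetical stand-in for the replicated database handle.
type DB struct {
	chkMu sync.Mutex // held for the whole duration of a snapshot
}

// Snapshot holds chkMu while the snapshot is written out by write.
func (db *DB) Snapshot(write func()) {
	db.chkMu.Lock()
	defer db.chkMu.Unlock()
	write()
}

// Checkpoint runs the checkpoint only if no snapshot is in progress.
// It reports whether the checkpoint actually ran.
func (db *DB) Checkpoint(run func()) bool {
	if !db.chkMu.TryLock() {
		return false // snapshot in progress: skip instead of blocking
	}
	defer db.chkMu.Unlock()
	run()
	return true
}

func main() {
	db := &DB{}

	var during bool
	db.Snapshot(func() {
		// A checkpoint attempted while the snapshot holds the mutex is skipped.
		during = db.Checkpoint(func() {})
	})
	after := db.Checkpoint(func() {})

	fmt.Println(during, after) // prints "false true"
}
```

Because the checkpoint only tries the lock, it never waits on a running snapshot; in this sketch a skipped checkpoint is simply left for a later attempt once the snapshot has finished.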

@hifi
Collaborator Author

hifi commented May 11, 2023

This actually fixes #237 as we hit the same issue.
