Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cleanup(state): Run possible minor database format version upgrades before deleting the disk version #8761

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

arya2
Copy link
Contributor

@arya2 arya2 commented Aug 14, 2024

Motivation

Users could manually change the end of support time in code, use a version of Zebra published without an end-of-support halt, or have an outdated state cache produced by a version of Zebra that has reached its end-of-support but hasn’t been run in a long time (perhaps on Testnet).

We may also publish minor version upgrades in the future shortly before a major version upgrade that can be restored from a previous database format.

Solution

  • Run minor database format upgrades before deleting the older disk version

Related changes:

  • Delete any old cache directories without checking if they're restorable after any restorable version should have been renamed

Tests

Needs manual testing:

  • starting a sync in a new cache directory with an old version of Zebra and checking that minor db format upgrades are applied
  • starting a sync with the most recent release of Zebra and checking that it skips applying minor format upgrades

PR Author's Checklist

  • The PR name will make sense to users.
  • The PR provides a CHANGELOG summary.
  • The solution is tested.
  • The documentation is up to date.
  • The PR has a priority label.

PR Reviewer's Checklist

  • The PR Author's checklist is complete.
  • The PR resolves the issue.

@arya2 arya2 added C-cleanup Category: This is a cleanup I-usability Zebra is hard to understand or use A-state Area: State / database changes P-Low ❄️ labels Aug 14, 2024
@arya2 arya2 self-assigned this Aug 14, 2024
@github-actions github-actions bot added the C-trivial Category: A trivial change that is not worth mentioning in the CHANGELOG label Aug 14, 2024
@mpguerra
Copy link
Contributor

How important is this to merge soon? I would like @upbqdn to review it but he's not going to be back for a couple of weeks.

@arya2
Copy link
Contributor Author

arya2 commented Aug 15, 2024

How important is this to merge soon? I would like @upbqdn to review it but he's not going to be back for a couple of weeks.

It's not very important to merge soon. We can wait.

DbFormatChange::Upgrade {
older_disk_version,
newer_running_version,
} if !debug_skip_format_upgrades
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a correctness note about not running if read_only so there's no corruption if multiple Zebra instances try to read and write the same files.

Copy link
Member

@upbqdn upbqdn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we open an issue and do a refactor like this with any upcoming database upgrade instead? The upgrade will give us an opportunity to do thorough manual testing. I can open the issue.

@arya2
Copy link
Contributor Author

arya2 commented Aug 26, 2024

Could we open an issue and do a refactor like this with any upcoming database upgrade instead? The upgrade will give us an opportunity to do thorough manual testing. I can open the issue.

I'd prefer to do thorough manual testing sooner to make outdated caches more usable, but we could split out everything except checking if !debug_skip_format_upgrades before try_reusing_previous_db_after_major_upgrade().

zebra-state/src/config.rs Outdated Show resolved Hide resolved
@arya2 arya2 added do-not-merge Tells Mergify not to merge this PR and removed do-not-merge Tells Mergify not to merge this PR labels Aug 27, 2024
zebra-state/src/service/finalized_state/zebra_db.rs Outdated Show resolved Hide resolved
// It's fine to mark the database format as upgraded later if the database is empty
if finalized_tip.is_some() {
warn!("applying minor database format upgrades ahead of major format version bump, this may take a few minutes");
let (_cancel_tx, cancel_rx) = std::sync::mpsc::channel();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think not using the sender will make Zebra irresponsive to shutdowns when the upgrade is running.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like:

std::thread::spawn(move || {
    while !is_shutting_down() {
        if cancel_tx.is_closed() {
            return;
        }

        std::thread::sleep(Duration::from_millis(1));
    }
    cancel_tx.send(CancelFormatChange);
});

might work here, but there's no is_closed() method on this channel, so we'd need to replace it with a watch channel first.

&& RESTORABLE_DB_VERSIONS.contains(&newer_running_version.major) =>
{
warn!("upgrading database format to the next major version");
let db = new_db(Version::new(older_disk_version.major, u64::MAX, u64::MAX));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's document why we're using u64::MAX.

@oxarbitrage oxarbitrage added the do-not-merge Tells Mergify not to merge this PR label Oct 10, 2024
Co-authored-by: Marek <mail@marek.onl>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-state Area: State / database changes C-cleanup Category: This is a cleanup C-trivial Category: A trivial change that is not worth mentioning in the CHANGELOG do-not-merge Tells Mergify not to merge this PR I-usability Zebra is hard to understand or use P-Low ❄️
Projects
Status: In progress
Development

Successfully merging this pull request may close these issues.

4 participants