feat: expose nodeDB's DeleteVersionsFrom method #952

Merged

pirtleshell
Contributor

@pirtleshell pirtleshell commented May 24, 2024

DeleteVersionsFrom is the counterpart to DeleteVersionsTo. It writes deletions of a MutableTree to disk for all versions >= the provided fromVersion.

It exposes the method of the same name from the underlying nodeDB.
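The difference between the two deletion directions can be sketched with a toy in-memory model. Everything here (`versionStore`, its methods, the `root-N` strings) is hypothetical and is not the iavl API; it only assumes that `DeleteVersionsTo` removes versions up to and including its argument, and that `DeleteVersionsFrom` removes all versions >= `fromVersion` as described above.

```go
package main

import (
	"fmt"
	"sort"
)

// versionStore is a hypothetical in-memory stand-in for a versioned tree.
// It is NOT the iavl API; it only models which versions survive a deletion.
type versionStore struct {
	versions map[int64]string // version -> illustrative root hash
}

// newVersionStore creates a store holding versions 1..latest.
func newVersionStore(latest int64) *versionStore {
	s := &versionStore{versions: make(map[int64]string)}
	for v := int64(1); v <= latest; v++ {
		s.versions[v] = fmt.Sprintf("root-%d", v)
	}
	return s
}

// deleteVersionsTo removes every version <= toVersion (assumed inclusive).
func (s *versionStore) deleteVersionsTo(toVersion int64) {
	for v := range s.versions {
		if v <= toVersion {
			delete(s.versions, v)
		}
	}
}

// deleteVersionsFrom removes every version >= fromVersion.
func (s *versionStore) deleteVersionsFrom(fromVersion int64) {
	for v := range s.versions {
		if v >= fromVersion {
			delete(s.versions, v)
		}
	}
}

// available returns the surviving versions in ascending order.
func (s *versionStore) available() []int64 {
	out := make([]int64, 0, len(s.versions))
	for v := range s.versions {
		out = append(out, v)
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	s := newVersionStore(10)
	s.deleteVersionsFrom(8)
	fmt.Println(s.available()) // prints [1 2 3 4 5 6 7]
}
```

In this model, calling `deleteVersionsFrom(1)` empties the store entirely, which corresponds to the "complete removal of a KVStore" use case.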

This method is useful for things like:

  • rolling back an upgrade that adds a module (allows for the complete removal of a KVStore from disk)
  • custom implementations of sharded data drives (allows one to pare down a data drive to a specific set of versions)

Please let me know if this should first be proposed as a Feature Request via the repo issues.

Summary by CodeRabbit

  • New Features

    • Introduced the ability to delete data versions from a specified point with the new DeleteVersionsFrom API.
  • Tests

    • Updated and added tests to ensure the integrity and functionality of the new version deletion feature.

@pirtleshell pirtleshell requested a review from a team as a code owner May 24, 2024 18:09

coderabbitai bot commented May 24, 2024

Walkthrough

This update introduces the DeleteVersionsFrom API for managing versions in the MutableTree data structure. It enhances version control by allowing deletion from a specified version onward. Key changes include the addition of new methods in mutable_tree.go, related tests in mutable_tree_test.go, and documentation updates in CHANGELOG.md. This feature complements the existing functionalities like async pruning of legacy nodes and the new SaveChangeSet API.

Changes

Files | Change Summaries
CHANGELOG.md | Summarized the addition of DeleteVersionsFrom(int64) API and its integration in PR #952.
mutable_tree.go | Added DeleteVersionsFrom method to MutableTree struct; modified existing method.
mutable_tree_test.go | Renamed TestDelete to TestDeleteVersionsTo and added TestDeleteVersionsFrom for new functionality.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant MutableTree
    Client->>MutableTree: DeleteVersionsFrom(version int64)
    MutableTree-->>Client: Success/Error Response

Poem

In a forest of code, trees grow so tall,
A rabbit hops lightly to answer the call.
With DeleteVersionsFrom, old leaves fall away,
New branches can flourish, clear skies on display.
Cheers to the code, where changes are spun,
Version control under the warm, digital sun. 🌳🐇


@cool-develope
Collaborator

@pirtleshell thanks for the contribution. It is used in LoadVersionForOverwriting to roll back the state, as you mentioned. Could you please provide more context on why this specific API is needed?

@pirtleshell
Contributor Author

@cool-develope sure thing.

rollback case

First, suppose you are adding a new sdk module during an upgrade. Suppose there is a bug in that module addition such that the root hash of the new module is non-deterministic & results in an app hash mismatch when the upgrade is run in a network. The goal is to roll back the bad version & re-run the upgrade with a patched binary.

The SDK's rollback cmd calls LoadVersionForOverwriting, which works by loading the previous version of the data. If an upgrade handler adds a module via StoreUpgrades, the rollback of the upgrade block will panic with an error that looks like:

Error: failed to rollback to version: version does not exist

This occurs when LoadVersionForOverwriting is called on the newly added module's store with the previous height, which does not exist because the module was just added.

AFAIK, to get around this and successfully roll back, you have two options:

  • use the previous binary to roll back: it doesn't know about the new module, so it won't attempt to load it
  • manually de-register the module from the root multistore so LoadVersionForOverwriting is never called for it

However, either of those options still leaves the upgrade version of the added module's store on disk.

When the upgrade is re-attempted, the new data will not be committed because of the version that already exists on disk. The relevant sdk code is here:
https://github.com/cosmos/cosmos-sdk/blob/c4d9a495052b78a02723ed1df1d336efe83a4e8d/store/rootmulti/store.go#L1188-L1197

The code is necessary to recover from interruptions, but means that new data is never committed & written to disk for the patched binary's upgrade block. If it were to manage calling commit, the IAVL Commit() method would likely error here due to mismatched data (the patched binary's version will differ from the bad one that was committed in the failed upgrade).

To truly roll back & re-run the upgrade, the data must be removed from disk.
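A rough sketch of why the stale version blocks the retry, using a deliberately simplified model. `moduleStore`, `commit`, and `errRootMismatch` are hypothetical names invented for illustration; this is neither the SDK's commit path nor IAVL's actual behavior, just an approximation of the failure mode described above.

```go
package main

import (
	"errors"
	"fmt"
)

// errRootMismatch models a commit being rejected because a different root is
// already stored for the same version.
var errRootMismatch = errors.New("version exists with a different root")

// moduleStore is a hypothetical stand-in for one module's on-disk store.
type moduleStore struct {
	roots map[int64]string // version -> root hash
}

// commit stores root at version. An existing, identical root is treated as an
// already-completed commit; a differing one is rejected.
func (m *moduleStore) commit(version int64, root string) error {
	if existing, ok := m.roots[version]; ok {
		if existing != root {
			return errRootMismatch
		}
		return nil
	}
	m.roots[version] = root
	return nil
}

// deleteVersionsFrom removes every version >= fromVersion.
func (m *moduleStore) deleteVersionsFrom(fromVersion int64) {
	for v := range m.roots {
		if v >= fromVersion {
			delete(m.roots, v)
		}
	}
}

func main() {
	const upgradeHeight = 100
	// State left behind by the failed upgrade: the new module's first version.
	store := &moduleStore{roots: map[int64]string{upgradeHeight: "bad-root"}}

	// Re-running with the patched binary is rejected while stale data remains.
	fmt.Println(store.commit(upgradeHeight, "patched-root"))

	// Purging from the upgrade height empties the new module's store...
	store.deleteVersionsFrom(upgradeHeight)
	// ...so the patched upgrade can commit cleanly.
	fmt.Println(store.commit(upgradeHeight, "patched-root"))
}
```

Because the module was added at the upgrade height, deleting from that height leaves nothing behind for it, which is the state a fresh run of the upgrade expects.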

sharding case

Less important with the awesome space-saving y'all have been doing with iavl v1 & v2, but DeleteVersionsFrom could also be used to create dbs for shards; i.e., one server has blocks 1 to 2.5M, another has blocks 2.5M to 5M. You could create these dbs from a full archive node using DeleteVersionsTo and DeleteVersionsFrom.
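As a minimal sketch of the sharding idea, the function below models pruning a copy of an archive's version list down to one shard. `carveShard` is a hypothetical helper, not part of iavl; it assumes `DeleteVersionsTo` is inclusive of its argument and `DeleteVersionsFrom` removes versions >= its argument.

```go
package main

import "fmt"

// carveShard models copying an archive and pruning the copy down to the
// inclusive version range [first, last]. The two conditions mirror the assumed
// semantics of DeleteVersionsTo(first-1) and DeleteVersionsFrom(last+1).
func carveShard(archive []int64, first, last int64) []int64 {
	shard := make([]int64, 0, len(archive))
	for _, v := range archive {
		prunedByTo := v <= first-1  // removed by "delete versions to"
		prunedByFrom := v >= last+1 // removed by "delete versions from"
		if !prunedByTo && !prunedByFrom {
			shard = append(shard, v)
		}
	}
	return shard
}

func main() {
	archive := []int64{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
	fmt.Println(carveShard(archive, 1, 5))  // first shard
	fmt.Println(carveShard(archive, 6, 10)) // second shard
}
```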


I recognize that much of what I describe about rollback could/should be accounted for higher up in the sdk (I've opened an issue: cosmos/cosmos-sdk#20472). Exposing DeleteVersionsFrom is just a simple non-breaking change that gives developers more control over the disk's data. It helped me resolve an app hash mismatch by rolling back an upgrade that added modules, so I figured it might help others as well, and is worth adding to this package.

Happy to make any changes, explain anything further, or open an issue/discussion.

@cool-develope
Collaborator

makes sense, thanks for the detailed explanation!

@cool-develope
Collaborator

@pirtleshell do you think we also need to update cosmos-sdk for the upgrade rollback case?

Collaborator

@cool-develope cool-develope left a comment

LGTM!

@pirtleshell
Contributor Author

@pirtleshell do you think we also need to update cosmos-sdk for the upgrade rollback case?

yes, i wanted to get this into iavl first but i do have a rough implementation of an update to the rollback command that fully removes the KVStore on disk for listed modules. i linked it in the issue i opened on cosmos/cosmos-sdk#20472

probably needs cleanup but it is here: Kava-Labs/cosmos-sdk#546
currently that code uses a forked iavl v1.2 that has the patch from this PR applied.

thinking through it right now, the cosmos-sdk versions that use iavl v0.20 can & probably should use DeleteVersionRange to accomplish the same.

@pirtleshell pirtleshell force-pushed the pirtleshell/delete-versions-from branch from ffd5362 to 7908810 Compare May 29, 2024 19:06

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

CHANGELOG.md (outdated, resolved)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Outside diff range and nitpick comments (1)
CHANGELOG.md (1)

Line range hint 167-167: Consider adding a comma after "v1.17" for better readability.

- Use go v1.17  For previous changelogs visit: <https://github.com/cosmos/iavl/blob/v0.18.0/CHANGELOG.md>
+ Use go v1.17, For previous changelogs visit: <https://github.com/cosmos/iavl/blob/v0.18.0/CHANGELOG.md>

@pirtleshell
Contributor Author

@cool-develope anything else needed for this PR? i think i may have confused the CI by rebasing on master to fix CHANGELOG conflict. 🙂

@pirtleshell
Contributor Author

bump @cool-develope @tac0turtle
could someone trigger the Test CI so this can be auto-merged?
thanks!

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between c628e70 and 148f43f.

Files selected for processing (1)
  • mutable_tree.go (1 hunks)
Additional comments not posted (1)
mutable_tree.go (1)

872-880: New method DeleteVersionsFrom added to MutableTree.

The implementation of DeleteVersionsFrom method is straightforward and matches the intended functionality described in the PR. This method complements the existing DeleteVersionsTo by providing a way to delete versions from a specified version upwards. It's important to ensure that this method does not introduce any regressions or affect the existing functionality adversely.

  • Please ensure that there are sufficient tests covering scenarios where versions just above and just below the fromVersion are correctly handled.
  • Additionally, verify that no data corruption occurs when this method is used in conjunction with other tree operations.
Verification successful

Test coverage for DeleteVersionsFrom method is adequate.

The TestDeleteVersionsFrom function in mutable_tree_test.go verifies that versions below the specified fromVersion remain unaffected while versions above are correctly deleted. This ensures that the DeleteVersionsFrom method works as intended without causing data corruption or regression.

  • The test sets up a MutableTree, saves multiple versions, and deletes versions from a specified version upwards.
  • It then checks for the existence of versions below the fromVersion and confirms the non-existence of versions above it.
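The boundary check described above can be sketched as a small helper. `checkBoundary` and the `exists` callback are hypothetical names for illustration, not code from `mutable_tree_test.go`; the callback stands in for whatever version-existence lookup the tree offers.

```go
package main

import "fmt"

// checkBoundary reports the first version whose presence contradicts a
// deletion from fromVersion: versions in [1, fromVersion) must exist, and
// versions in [fromVersion, latest] must not.
// exists is a hypothetical lookup supplied by the caller.
func checkBoundary(exists func(int64) bool, fromVersion, latest int64) error {
	for v := int64(1); v <= latest; v++ {
		want := v < fromVersion
		if got := exists(v); got != want {
			return fmt.Errorf("version %d: exists=%t, want %t", v, got, want)
		}
	}
	return nil
}

func main() {
	// Simulated state after deleting from version 8 out of 10 saved versions.
	remaining := map[int64]bool{1: true, 2: true, 3: true, 4: true, 5: true, 6: true, 7: true}
	exists := func(v int64) bool { return remaining[v] }
	fmt.Println(checkBoundary(exists, 8, 10)) // prints <nil>
}
```

Walking every version from 1 to the latest, rather than sampling a few, covers the versions just below and just at `fromVersion`, which is where an off-by-one would show up.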
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify that `DeleteVersionsFrom` does not affect versions below `fromVersion`.

# Test: Search for test cases for `DeleteVersionsFrom`.
rg --type go "TestDeleteVersionsFrom" mutable_tree_test.go

Length of output: 102



Script:

#!/bin/bash
# Extract the TestDeleteVersionsFrom function code from mutable_tree_test.go for review.
ast-grep --lang go --pattern $'func TestDeleteVersionsFrom(t *testing.T) {
  $$$
}' mutable_tree_test.go

Length of output: 1598

@pirtleshell
Contributor Author

bump @cool-develope @tac0turtle

is anything further needed for this to be included?

@tac0turtle
Member

@pirtleshell could you merge main into your branch? Then we can merge it.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 148f43f and 7b226b3.

Files selected for processing (3)
  • CHANGELOG.md (1 hunks)
  • mutable_tree.go (1 hunks)
  • mutable_tree_test.go (2 hunks)
Additional comments not posted (4)
CHANGELOG.md (1)

7-7: Validate the changelog entries for new APIs.

The entries for the new APIs DeleteVersionsFrom and GetLatestVersion are correctly documented under the "Improvements" section. This aligns with the changes made in the PR.

mutable_tree.go (1)

877-885: Review the implementation of DeleteVersionsFrom.

The method DeleteVersionsFrom is correctly implemented to queue the deletion of versions from a specified version upwards. The method defers the execution and does not block the SaveVersion() call, which is in line with the described functionality in the PR.

mutable_tree_test.go (2)

Line range hint 115-134: Review of TestDeleteVersionsTo Function

This function tests the deletion of versions up to a specified version. The logic appears correct, and the use of require ensures that the test will halt on failure, which is appropriate for unit tests. However, the function name has been changed from a more general TestDelete to a more specific TestDeleteVersionsTo. This is a positive change as it clarifies the intent of the test.

  • Correctness: The test correctly sets up a tree, saves versions, and then attempts to delete versions up to a specified point. It then checks if the versions have been deleted as expected.
  • Performance: The test is efficient in terms of operations performed.
  • Best Practices: Using require for immediate test failure upon a condition failure is a good practice in unit tests.

136-161: Review of TestDeleteVersionsFrom Function

This newly added function tests the deletion of versions from a specified version onwards. The structure and logic of the test are consistent with the existing testing patterns in the file.

  • Correctness: The function correctly tests the new DeleteVersionsFrom method by setting a tree, saving versions, and then deleting from a specified version onwards. It checks the existence of versions before and after the specified version to ensure proper deletion.
  • Error Handling: Proper error handling is used with require.NoError to ensure that any error in the setup or deletion process halts the test.
  • Performance: The operations are minimal and focused, which is suitable for unit testing.
  • Best Practices: The test is well-structured and follows good practices in testing expected and edge cases.

Comment on lines +3 to +4
## Unreleased

Ensure proper formatting in the changelog.

It's good practice to maintain consistent formatting in changelog entries. Ensure that there are blank lines surrounding major sections for better readability.

## Unreleased
+

Committable suggestion was skipped due to low confidence.

@pirtleshell
Contributor Author

bumping again, hoping to get this merged @tac0turtle @cool-develope 🙏

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 7b226b3 and 1b94ee0.

Files selected for processing (2)
  • mutable_tree.go (1 hunks)
  • mutable_tree_test.go (2 hunks)
Additional comments not posted (3)
mutable_tree.go (1)

876-884: LGTM! But verify the function usage in the codebase.

The code changes are approved.

However, ensure that all function calls to DeleteVersionsFrom are correctly implemented.

Verification successful

Function usage is correctly implemented and tested.

The function DeleteVersionsFrom is used and tested appropriately in the codebase.

  • mutable_tree_test.go: The function is tested to ensure it works as expected.
  • nodedb_test.go: The function is also tested here to ensure no deadlocks occur.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all function calls to `DeleteVersionsFrom` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type go -A 5 $'DeleteVersionsFrom'

Length of output: 2584

mutable_tree_test.go (2)

Line range hint 114-134:
LGTM!

The code changes are approved.


135-160: LGTM!

The code changes are approved.

@cool-develope cool-develope enabled auto-merge (squash) July 26, 2024 12:55
@cool-develope cool-develope merged commit 558a18c into cosmos:master Jul 26, 2024
8 checks passed
@pirtleshell pirtleshell deleted the pirtleshell/delete-versions-from branch July 30, 2024 19:32