fix: in-place-testnet edgecases #19516
Walkthrough
This update focuses on enhancing the metrics collection by adopting a default configuration for instrumentation and refining the logic related to block and state heights. It addresses scenarios such as node stoppages and discrepancies in blockstore heights, ensuring more accurate and reliable metrics and state management within the system.
Review Status
Actionable comments generated: 0
Configuration used: .coderabbit.yml
Files selected for processing (1)
- server/start.go (2 hunks)
Additional comments: 6
server/start.go (6)
- 783-783: The code initializes default metrics for the node, which is a good practice for monitoring and debugging. However, it's important to ensure that the metrics being collected are relevant and useful for the application's performance and health monitoring.
- 804-815: The logic here handles the case where the state's last blockstore height does not match the app and blockstore height, likely due to stopping with the halt height flag. This is a critical piece of logic for ensuring data consistency. However, it's essential to ensure that this logic is thoroughly tested, especially for edge cases where the application might not shut down gracefully.
- 817-817: This section handles the case where the blockStore height is greater than the state's last block height, typically occurring when the node is gracefully stopped. Deleting the latest block to align the heights is a crucial operation for maintaining state consistency. It's important to ensure that this operation does not inadvertently remove necessary data and that there are safeguards against data corruption.
- 783-783: The use of `node.DefaultMetricsProvider` to initialize metrics with default configuration is a good practice for ensuring that the application has the necessary instrumentation for monitoring and observability. This aligns with the PR's objective to handle the Prometheus metrics provider configuration more gracefully.
- 804-815: The handling of the state's last block height and app hash in the case of a mismatch between the application and blockstore heights is crucial for ensuring the consistency of the blockchain state. This logic appears to be well-thought-out, but it's essential to ensure that it is covered by unit tests, especially since it deals with critical state management.
- 817-817: Deleting the latest block when the blockstore height is greater than the state's last block height is a sensitive operation that must be handled with care to avoid data loss. It's important to ensure that there are adequate checks and balances in place to prevent accidental deletion of critical data.
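The height cases called out in the comments above can be sketched as a small decision function. This is a minimal illustration, not the code in server/start.go: the `action` type, its constants, and `reconcile` are hypothetical names, and the heights are plain integers rather than values read from the app, the block store, and the state.

```go
package main

import "fmt"

// action names what a reconciliation step would do. All names in this
// file are hypothetical and exist only for this sketch.
type action string

const (
	alignState        action = "align state height and app hash with the app"
	deleteLatestBlock action = "delete the latest block from the block store"
	loadBlock         action = "load the latest block unchanged"
)

// reconcile picks an action from the three heights discussed in the review.
func reconcile(appHeight, blockStoreHeight, stateHeight int64) action {
	switch {
	case appHeight == blockStoreHeight && stateHeight != appHeight:
		// App and block store agree but state does not:
		// likely a stop via the halt height flag.
		return alignState
	case blockStoreHeight > stateHeight:
		// Block store is ahead of state: likely a graceful stop.
		return deleteLatestBlock
	default:
		// Any other combination: nothing to repair in this sketch.
		return loadBlock
	}
}

func main() {
	fmt.Println(reconcile(100, 100, 99))  // halt-height style mismatch
	fmt.Println(reconcile(99, 100, 99))   // block store ahead of state
	fmt.Println(reconcile(100, 100, 100)) // everything already aligned
}
```

The order of the cases matters in this sketch: the halt-height check runs first, so a state that lags an otherwise consistent app and block store is aligned rather than having a block deleted.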
(cherry picked from commit 35fe6c4)
…0.50.5
* fix: in-place-testnet edgecases (backport cosmos#19516) (cosmos#19526)
  Co-authored-by: Adam Tucker <adam@osmosis.team>
* fix(simapp): typo in GetStoreKeys (cosmos#19544)
* build(deps): Bump cosmossdk.io/math from 1.2.0 to 1.3.0 (cosmos#19562)
* fix(depinject): Authtx was not accepting custom signers (backport cosmos#19549) (cosmos#19551)
  Co-authored-by: Devon Bear <itsdevbear@berachain.com>
  Co-authored-by: Qt <golang.chen@gmail.com>
  Co-authored-by: Julien Robert <julien@rbrt.fr>
* build(deps): Bump github.com/cosmos/cosmos-db from 1.0.0 to 1.0.2 (cosmos#19566)
  Signed-off-by: dependabot[bot] <support@github.com>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
  Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
  Co-authored-by: Julien Robert <julien@rbrt.fr>
* Merge pull request from GHSA-86h5-xcpx-cfqc
* fix slashing hole with test
* release notes + changelog
* word
* date
* updates
---------
  Co-authored-by: Julien Robert <julien@rbrt.fr>
* ci: run test pipeline on merge v0.50 branch (cosmos#19582)
* fix(staking): fix impossible conditions (cosmos#19621)
* docs: add section on creating a testnets from mainnet exports (backport cosmos#19475) (cosmos#19648)
  Co-authored-by: Marko <marbar3778@yahoo.com>
* build(deps): Bump cosmossdk.io/x/tx from 0.13.0 to 0.13.1 (cosmos#19665)
  Signed-off-by: dependabot[bot] <support@github.com>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
  Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
* feat(client/v2): marshal enum as string (cosmos#19653)
* refactor(x/auth): allow empty public keys for GetSignBytesAdapter (backport cosmos#19651) (cosmos#19675)
  Co-authored-by: mmsqe <mavis@crypto.com>
  Co-authored-by: Julien Robert <julien@rbrt.fr>
* fix(types): check for HasABCIGenesis in CoreAppModuleBasicAdaptor (cosmos#19709)
* build(deps): Bump deps (backport cosmos#19655) (cosmos#19711)
  Co-authored-by: Julien Robert <julien@rbrt.fr>
* Merge pull request from GHSA-95rx-m9m5-m94v
* validate ExtendedCommit against LastCommit test cases
* account for core.comet types
* logging
* linting
* cherry-pick staking fix
* nits
* linting fix
* run tests
---------
  Co-authored-by: Marko <marbar3778@yahoo.com>
  Co-authored-by: Marko Baricevic <markobaricevic3778@gmail.com>
* feat(baseapp): add option to disable block gas meter (cosmos#19626)
* feat(x/distribution): add rewards-by-validator autocli config (backport cosmos#19707) (cosmos#19714)
  Co-authored-by: Julien Robert <julien@rbrt.fr>
* fix(x/gov): grpc query tally for failed proposal (backport cosmos#19725) (cosmos#19727)
  Co-authored-by: David Tumcharoen <david@alleslabs.com>
  Co-authored-by: Julien Robert <julien@rbrt.fr>
* chore: prepare v0.50.5 (cosmos#19715)
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Adam Tucker <adam@osmosis.team>
Co-authored-by: yihuang <huang@crypto.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Devon Bear <itsdevbear@berachain.com>
Co-authored-by: Qt <golang.chen@gmail.com>
Co-authored-by: Julien Robert <julien@rbrt.fr>
Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: khanh <50263489+catShaark@users.noreply.github.com>
Co-authored-by: Tom <54514587+GAtom22@users.noreply.github.com>
Co-authored-by: Marko <marbar3778@yahoo.com>
Co-authored-by: mmsqe <mavis@crypto.com>
Co-authored-by: Nikhil Vasan <97126437+nivasan1@users.noreply.github.com>
Co-authored-by: Marko Baricevic <markobaricevic3778@gmail.com>
Co-authored-by: David Tumcharoen <david@alleslabs.com>
Description
Closes: #XXXX
Killing a node, setting a halt height, and gracefully shutting down cause varying states of the block height across app/blockStore/state. This PR handles these states properly.
Additionally, there was an edge case where, if Prometheus was enabled, testnetify would panic because the metrics provider was set a second time. The metrics provider in testnetify does not need to match what the user configured, since it is only used temporarily while setting up the application, so we use the default instrumentation config to prevent this panic.
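A toy model of the Prometheus edge case, using only the standard library, may help illustrate why the default config avoids the panic. Everything here is hypothetical (`instrumentationConfig`, `registry`, `metricsProvider`, `panics`), and it rests on two assumptions that the description does not spell out: that the panic comes from registering the same collector twice, and that the default instrumentation config leaves Prometheus disabled.

```go
package main

import "fmt"

// instrumentationConfig is a hypothetical stand-in for a node's
// instrumentation settings.
type instrumentationConfig struct{ Prometheus bool }

// defaultInstrumentationConfig assumes Prometheus is off by default.
func defaultInstrumentationConfig() instrumentationConfig {
	return instrumentationConfig{Prometheus: false}
}

// registry is a toy collector registry that panics on duplicate names.
type registry struct{ names map[string]bool }

func (r *registry) mustRegister(name string) {
	if r.names[name] {
		panic("duplicate metrics collector registration: " + name)
	}
	r.names[name] = true
}

// metricsProvider returns a function that sets up metrics for a chain.
// With Prometheus disabled it registers nothing and returns "nop".
func metricsProvider(cfg instrumentationConfig, reg *registry) func(chainID string) string {
	return func(chainID string) string {
		if !cfg.Prometheus {
			return "nop"
		}
		reg.mustRegister("consensus_" + chainID)
		return "prometheus"
	}
}

// panics reports whether f panicked.
func panics(f func()) (didPanic bool) {
	defer func() {
		if recover() != nil {
			didPanic = true
		}
	}()
	f()
	return false
}

func main() {
	reg := &registry{names: map[string]bool{}}
	userCfg := instrumentationConfig{Prometheus: true}

	// First setup with the user's config registers the collector.
	fmt.Println(metricsProvider(userCfg, reg)("testnet-1"))

	// A second setup with the same config hits the duplicate and panics.
	fmt.Println(panics(func() { metricsProvider(userCfg, reg)("testnet-1") }))

	// The temporary setup with the default config registers nothing.
	fmt.Println(panics(func() { metricsProvider(defaultInstrumentationConfig(), reg)("testnet-1") }))
}
```

In this model the temporary setup path never touches the registry, so it cannot collide with the provider that the node itself configures later.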