Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

try-state: should indicate which pallet is failing #136

Closed
liamaharon opened this issue Aug 10, 2023 · 1 comment · Fixed by #2019
Closed

try-state: should indicate which pallet is failing #136

liamaharon opened this issue Aug 10, 2023 · 1 comment · Fixed by #2019
Labels
I5-enhancement An additional feature request. T1-FRAME This PR/Issue is related to core FRAME, the framework.

Comments

@liamaharon
Copy link
Contributor

liamaharon commented Aug 10, 2023

Description

try_state error handling more or less seems to follow this pattern:

ensure!(
    invariant_holds,
    "some explanation of the invariant not holding"
)

If !invariant_holds, the try_state call will return a DispatchError::Other(message) to Executive, who calls it like this:

		// run the try-state checks of all pallets, ensuring they don't alter any state.
		let _guard = frame_support::StorageNoopGuard::default();
		<AllPalletsWithSystem as frame_support::traits::TryState<
			BlockNumberFor<System>,
		>>::try_state(*header.number(), select)
		.map_err(|e| {
			frame_support::log::error!(target: LOG_TARGET, "failure: {:?}", e);
			e
		})?;
		drop(_guard);

The issue here is that Executive does not log which pallet the error emitted from, which is vital information when there are multiple instances of the same pallet in the runtime.

I encountered this issue when investigating a collective pallet error in polkadot which has two instances. I needed to add additional logging to try_state and replace the ensure! with a conditional to learn which instance the error is coming from:

        fn do_try_state() -> Result<(), TryRuntimeError> {
+               use frame_support::traits::PalletInfo;
+               let pallet_error_log = "try-state invariant failed for pallet ".to_owned() +
+                       T::PalletInfo::name::<Pallet<T, I>>().unwrap_or("unknown_pallet") +
+                       ".";
+
                Self::proposals()
                        .into_iter()
                        .try_for_each(|proposal| -> Result<(), TryRuntimeError> {
@@ -1041,10 +1046,10 @@ impl<T: Config<I>, I: 'static> Pallet<T, I> {
                        "The member count is greater than `MaxMembers`."
                );

-               ensure!(
-                       Self::members().windows(2).all(|members| members[0] <= members[1]),
-                       "The members are not sorted by value."
-               );
+               if !Self::members().windows(2).all(|members| members[0] <= members[1]) {
+                       log::error!("{}", &pallet_error_log);
+                       return Err("The members are not sorted by value.".into())
+               };		

Solutions?

Replace ensure with conditionals and additional logging

The simplest way to resolve this is likely to simply replace all the ensure usage with conditionals, and an extra log containing the pallet name (optionally also index) like I did when investigating the collectives pallet issue.

This solution kind of sucks, making the code verbose.

Change try_state to return a new error type

I may be wrong but I'm not sure if it even makes sense for try_state to return a DispatchError, because it only ever checks storage and never actually dispatches anything?

Something very roughly like

// If you want P to be something that can be converted into a String or has some relation to String
#[cfg(feature = "try-runtime")]
#[derive(Eq, Clone, Copy, Encode, Debug, PartialEq)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
pub struct TryStateError<P: PalletInfo>(pub P, pub &'static str);

This way, Executive will have access to P as well as the error message and be able to log which pallet it came from.

@liamaharon liamaharon changed the title try-state: unclear errors try-state: improve offending pallet in error Aug 10, 2023
@liamaharon liamaharon changed the title try-state: improve offending pallet in error try-state: include pallet in error Aug 10, 2023
@liamaharon liamaharon changed the title try-state: include pallet in error try-state: include some pallet identifier in error Aug 10, 2023
@liamaharon
Copy link
Contributor Author

@kianenigma I think you are likely best qualified to provide valuable insights about this :)

@liamaharon liamaharon changed the title try-state: include some pallet identifier in error try-state: include offending pallet identifier in error Aug 10, 2023
@liamaharon liamaharon changed the title try-state: include offending pallet identifier in error try-state: should indicate which pallet is failing Aug 10, 2023
@juangirini juangirini transferred this issue from paritytech/substrate Aug 24, 2023
@the-right-joyce the-right-joyce added I5-enhancement An additional feature request. T1-FRAME This PR/Issue is related to core FRAME, the framework. and removed J0-enhancement labels Aug 25, 2023
liamaharon added a commit that referenced this issue Oct 30, 2023
Making some devex improvements as I audit our chains adherence to
try-state invariants, in preparation for automated try-state checks and
alerting.

Note to reviewer: while you're here, if you have time would be great to
get your eyes on #1297
also since it touches a similar file and I'd like to avoid merge
conflicts :P

## Devex Improvements

- Changes the log level of logs informing the user that try-state checks
are being run for a pallet from debug to info
- Improves how errors are communicated
- Errors are logged when they are encountered, rather than after
everything has been executed
- Exact pallet the error originated from is included with the error log
  - Clearly see all errors and how many there are, rather than only one
  - Closes #136 

### Example of new logs

<img width="1185" alt="Screenshot 2023-10-25 at 15 44 44"
src="https://github.com/paritytech/polkadot-sdk/assets/16665596/b75588a2-1c64-45df-bbc8-bcb8bf8b0fe0">

### Same but with old logs (run with RUST_LOG=debug)

Notice only informed of one of the errors, and it's unclear which pallet
it originated

<img width="1185" alt="Screenshot 2023-10-25 at 15 39 01"
src="https://github.com/paritytech/polkadot-sdk/assets/16665596/e3429cb1-489e-430a-9716-77c052e5dae6">
 

## Bug fix

When dry-running migrations and `checks.try_state()` is `true`, only run
`try_state` checks after migrations have been executed. Otherwise,
`try_state` checks that expect state to be in at a HIGHER storage
version than is on-chain could incorrectly fail.

---------

Co-authored-by: command-bot <>
@github-project-automation github-project-automation bot moved this from Backlog to Done in Runtime / FRAME Oct 30, 2023
bgallois pushed a commit to duniter/duniter-polkadot-sdk that referenced this issue Mar 25, 2024
Making some devex improvements as I audit our chains adherence to
try-state invariants, in preparation for automated try-state checks and
alerting.

Note to reviewer: while you're here, if you have time would be great to
get your eyes on paritytech#1297
also since it touches a similar file and I'd like to avoid merge
conflicts :P

## Devex Improvements

- Changes the log level of logs informing the user that try-state checks
are being run for a pallet from debug to info
- Improves how errors are communicated
- Errors are logged when they are encountered, rather than after
everything has been executed
- Exact pallet the error originated from is included with the error log
  - Clearly see all errors and how many there are, rather than only one
  - Closes paritytech#136 

### Example of new logs

<img width="1185" alt="Screenshot 2023-10-25 at 15 44 44"
src="https://github.com/paritytech/polkadot-sdk/assets/16665596/b75588a2-1c64-45df-bbc8-bcb8bf8b0fe0">

### Same but with old logs (run with RUST_LOG=debug)

Notice only informed of one of the errors, and it's unclear which pallet
it originated

<img width="1185" alt="Screenshot 2023-10-25 at 15 39 01"
src="https://github.com/paritytech/polkadot-sdk/assets/16665596/e3429cb1-489e-430a-9716-77c052e5dae6">
 

## Bug fix

When dry-running migrations and `checks.try_state()` is `true`, only run
`try_state` checks after migrations have been executed. Otherwise,
`try_state` checks that expect state to be in at a HIGHER storage
version than is on-chain could incorrectly fail.

---------

Co-authored-by: command-bot <>
lexnv added a commit that referenced this issue Apr 3, 2024
chainHead/follow: Provide multiple finalized block hashes by the `Initialized` event
serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 8, 2024
* runtime benchmarks: start

* merge tests + benchmarks infrastructure

* fix compilation

* Fix compilation issues with runtime-benchmark feature flag

Mainly involved pulling in correct dependencies and adding some functions
which were called but didn't yet exist.

* Fix broken compilation for tests

* Move header signing methods into trait

* Move signing related test helpers to own module

* Remove comment about feature flag

* Add constants to tests

* Add top level comment for testing utilities

Co-authored-by: Hernando Castano <castano.ha@gmail.com>
serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 8, 2024
* runtime benchmarks: start

* merge tests + benchmarks infrastructure

* fix compilation

* Fix compilation issues with runtime-benchmark feature flag

Mainly involved pulling in correct dependencies and adding some functions
which were called but didn't yet exist.

* Fix broken compilation for tests

* Move header signing methods into trait

* Move signing related test helpers to own module

* Remove comment about feature flag

* Add constants to tests

* Add top level comment for testing utilities

Co-authored-by: Hernando Castano <castano.ha@gmail.com>
serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 9, 2024
* runtime benchmarks: start

* merge tests + benchmarks infrastructure

* fix compilation

* Fix compilation issues with runtime-benchmark feature flag

Mainly involved pulling in correct dependencies and adding some functions
which were called but didn't yet exist.

* Fix broken compilation for tests

* Move header signing methods into trait

* Move signing related test helpers to own module

* Remove comment about feature flag

* Add constants to tests

* Add top level comment for testing utilities

Co-authored-by: Hernando Castano <castano.ha@gmail.com>
serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 9, 2024
* runtime benchmarks: start

* merge tests + benchmarks infrastructure

* fix compilation

* Fix compilation issues with runtime-benchmark feature flag

Mainly involved pulling in correct dependencies and adding some functions
which were called but didn't yet exist.

* Fix broken compilation for tests

* Move header signing methods into trait

* Move signing related test helpers to own module

* Remove comment about feature flag

* Add constants to tests

* Add top level comment for testing utilities

Co-authored-by: Hernando Castano <castano.ha@gmail.com>
serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 9, 2024
* runtime benchmarks: start

* merge tests + benchmarks infrastructure

* fix compilation

* Fix compilation issues with runtime-benchmark feature flag

Mainly involved pulling in correct dependencies and adding some functions
which were called but didn't yet exist.

* Fix broken compilation for tests

* Move header signing methods into trait

* Move signing related test helpers to own module

* Remove comment about feature flag

* Add constants to tests

* Add top level comment for testing utilities

Co-authored-by: Hernando Castano <castano.ha@gmail.com>
serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 9, 2024
* runtime benchmarks: start

* merge tests + benchmarks infrastructure

* fix compilation

* Fix compilation issues with runtime-benchmark feature flag

Mainly involved pulling in correct dependencies and adding some functions
which were called but didn't yet exist.

* Fix broken compilation for tests

* Move header signing methods into trait

* Move signing related test helpers to own module

* Remove comment about feature flag

* Add constants to tests

* Add top level comment for testing utilities

Co-authored-by: Hernando Castano <castano.ha@gmail.com>
serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 9, 2024
* runtime benchmarks: start

* merge tests + benchmarks infrastructure

* fix compilation

* Fix compilation issues with runtime-benchmark feature flag

Mainly involved pulling in correct dependencies and adding some functions
which were called but didn't yet exist.

* Fix broken compilation for tests

* Move header signing methods into trait

* Move signing related test helpers to own module

* Remove comment about feature flag

* Add constants to tests

* Add top level comment for testing utilities

Co-authored-by: Hernando Castano <castano.ha@gmail.com>
serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 9, 2024
* runtime benchmarks: start

* merge tests + benchmarks infrastructure

* fix compilation

* Fix compilation issues with runtime-benchmark feature flag

Mainly involved pulling in correct dependencies and adding some functions
which were called but didn't yet exist.

* Fix broken compilation for tests

* Move header signing methods into trait

* Move signing related test helpers to own module

* Remove comment about feature flag

* Add constants to tests

* Add top level comment for testing utilities

Co-authored-by: Hernando Castano <castano.ha@gmail.com>
serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 10, 2024
* runtime benchmarks: start

* merge tests + benchmarks infrastructure

* fix compilation

* Fix compilation issues with runtime-benchmark feature flag

Mainly involved pulling in correct dependencies and adding some functions
which were called but didn't yet exist.

* Fix broken compilation for tests

* Move header signing methods into trait

* Move signing related test helpers to own module

* Remove comment about feature flag

* Add constants to tests

* Add top level comment for testing utilities

Co-authored-by: Hernando Castano <castano.ha@gmail.com>
serban300 pushed a commit to serban300/polkadot-sdk that referenced this issue Apr 10, 2024
* runtime benchmarks: start

* merge tests + benchmarks infrastructure

* fix compilation

* Fix compilation issues with runtime-benchmark feature flag

Mainly involved pulling in correct dependencies and adding some functions
which were called but didn't yet exist.

* Fix broken compilation for tests

* Move header signing methods into trait

* Move signing related test helpers to own module

* Remove comment about feature flag

* Add constants to tests

* Add top level comment for testing utilities

Co-authored-by: Hernando Castano <castano.ha@gmail.com>
jonathanudd pushed a commit to jonathanudd/polkadot-sdk that referenced this issue Apr 10, 2024
* runtime benchmarks: start

* merge tests + benchmarks infrastructure

* fix compilation

* Fix compilation issues with runtime-benchmark feature flag

Mainly involved pulling in correct dependencies and adding some functions
which were called but didn't yet exist.

* Fix broken compilation for tests

* Move header signing methods into trait

* Move signing related test helpers to own module

* Remove comment about feature flag

* Add constants to tests

* Add top level comment for testing utilities

Co-authored-by: Hernando Castano <castano.ha@gmail.com>
claravanstaden added a commit to claravanstaden/polkadot-sdk that referenced this issue Sep 19, 2024
Co-authored-by: claravanstaden <Cats 4 life!>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I5-enhancement An additional feature request. T1-FRAME This PR/Issue is related to core FRAME, the framework.
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

3 participants