Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

panic: Scraped backing votes had invalid signature #3878

Closed
lexnv opened this issue Mar 28, 2024 · 7 comments
Closed

panic: Scraped backing votes had invalid signature #3878

lexnv opened this issue Mar 28, 2024 · 7 comments

Comments

@lexnv
Copy link
Contributor

lexnv commented Mar 28, 2024

The following was discovered during versi-network triage versi-net logs:

2024-03-27 09:29:00.322 TRACE tokio-runtime-worker sub-libp2p: /b418d6efa0feb93b5c4b5113f1f3c6de473d95c0a925a442e43a3b7da12c181f/validation/3: notification received from PeerId("12D3KooWSRuZX8MxcGMS561mJomD3GPGdzAB5WyDhKA4HYKUAV7Z")    

====================

Version: 1.9.0-d88ba6d3c6c

   0: sp_panic_handler::panic_hook
             at builds/parity/mirrors/polkadot-sdk/substrate/primitives/panic-handler/src/lib.rs:166:18
      sp_panic_handler::set::{{closure}}
             at builds/parity/mirrors/polkadot-sdk/substrate/primitives/panic-handler/src/lib.rs:62:12
   1: <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/alloc/src/boxed.rs:2021:9
      std::panicking::rust_panic_with_hook
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:783:13
   2: std::panicking::begin_panic_handler::{{closure}}
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:657:13
   3: std::sys_common::backtrace::__rust_end_short_backtrace
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:170:18
   4: rust_begin_unwind
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:645:5
   5: core::panicking::panic_fmt
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panicking.rs:72:14
   6: polkadot_node_core_dispute_coordinator::initialized::Initialized::process_on_chain_votes::{{closure}}::{{closure}}
             at builds/parity/mirrors/polkadot-sdk/polkadot/node/core/dispute-coordinator/src/initialized.rs:650:6
      core::ops::function::impls::<impl core::ops::function::FnMut<A> for &mut F>::call_mut
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ops/function.rs:294:13
      core::iter::traits::iterator::Iterator::find_map::check::{{closure}}
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/iter/traits/iterator.rs:2958:32
      core::iter::traits::iterator::Iterator::try_fold
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/iter/traits/iterator.rs:2462:21
      core::iter::traits::iterator::Iterator::find_map
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/iter/traits/iterator.rs:2964:9
      <core::iter::adapters::filter_map::FilterMap<I,F> as core::iter::traits::iterator::Iterator>::next
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/iter/adapters/filter_map.rs:62:9
   7: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/alloc/src/vec/spec_from_iter_nested.rs:26:32
      alloc::vec::in_place_collect::<impl alloc::vec::spec_from_iter::SpecFromIter<T,I> for alloc::vec::Vec<T>>::from_iter
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/alloc/src/vec/in_place_collect.rs:167:20
   8: <alloc::vec::Vec<T> as core::iter::traits::collect::FromIterator<T>>::from_iter
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/alloc/src/vec/mod.rs:2753:9
      core::iter::traits::iterator::Iterator::collect
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/iter/traits/iterator.rs:2054:9
      polkadot_node_core_dispute_coordinator::initialized::Initialized::process_on_chain_votes::{{closure}}
             at builds/parity/mirrors/polkadot-sdk/polkadot/node/core/dispute-coordinator/src/initialized.rs:626:21
   9: polkadot_node_core_dispute_coordinator::initialized::Initialized::process_chain_import_backlog::{{closure}}
             at builds/parity/mirrors/polkadot-sdk/polkadot/node/core/dispute-coordinator/src/initialized.rs:570:83
  10: polkadot_node_core_dispute_coordinator::initialized::Initialized::process_active_leaves_update::{{closure}}
             at builds/parity/mirrors/polkadot-sdk/polkadot/node/core/dispute-coordinator/src/initialized.rs:382:6
      polkadot_node_core_dispute_coordinator::initialized::Initialized::run_until_error::{{closure}}
             at builds/parity/mirrors/polkadot-sdk/polkadot/node/core/dispute-coordinator/src/initialized.rs:269:9
  11: polkadot_node_core_dispute_coordinator::initialized::Initialized::run::{{closure}}
             at builds/parity/mirrors/polkadot-sdk/polkadot/node/core/dispute-coordinator/src/initialized.rs:169:78
      polkadot_node_core_dispute_coordinator::DisputeCoordinatorSubsystem::run::{{closure}}
             at builds/parity/mirrors/polkadot-sdk/polkadot/node/core/dispute-coordinator/src/lib.rs:191:5
      <polkadot_node_core_dispute_coordinator::DisputeCoordinatorSubsystem as orchestra::Subsystem<Context,polkadot_node_subsystem_types::errors::SubsystemError>>::start::{{closure}}
             at builds/parity/mirrors/polkadot-sdk/polkadot/node/core/dispute-coordinator/src/lib.rs:150:6
  12: <core::pin::Pin<P> as core::future::future::Future>::poll
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/future/future.rs:125:9
      polkadot_overseer::spawn::{{closure}}
             at builds/parity/mirrors/polkadot-sdk/polkadot/node/overseer/src/lib.rs:460:1
  13: <core::pin::Pin<P> as core::future::future::Future>::poll
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/future/future.rs:125:9
      <sc_service::task_manager::prometheus_future::PrometheusFuture<T> as core::future::future::Future>::poll
             at builds/parity/mirrors/polkadot-sdk/substrate/client/service/src/task_manager/prometheus_future.rs:61:3
  14: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::future::future::Future>::poll
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panic/unwind_safe.rs:297:9
      <futures_util::future::future::catch_unwind::CatchUnwind<Fut> as core::future::future::Future>::poll::{{closure}}
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/future/future/catch_unwind.rs:36:42
      <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panic/unwind_safe.rs:272:9
      std::panicking::try::do_call
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:552:40
      std::panicking::try
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:516:19
      std::panic::catch_unwind
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panic.rs:142:14
      <futures_util::future::future::catch_unwind::CatchUnwind<Fut> as core::future::future::Future>::poll
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/future/future/catch_unwind.rs:36:9
      <core::pin::Pin<P> as core::future::future::Future>::poll
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/future/future.rs:125:9
      futures_util::future::future::FutureExt::poll_unpin
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/future/future/mod.rs:558:9
      <futures_util::future::select::Select<A,B> as core::future::future::Future>::poll
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/future/select.rs:118:35
  15: sc_service::task_manager::SpawnTaskHandle::spawn_inner::{{closure}}
             at builds/parity/mirrors/polkadot-sdk/substrate/client/service/src/task_manager/mod.rs:168:33
      <tracing_futures::Instrumented<T> as core::future::future::Future>::poll
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-futures-0.2.5/src/lib.rs:283:9
  16: tokio::runtime::park::CachedParkThread::block_on::{{closure}}
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/park.rs:282:63
      tokio::runtime::coop::with_budget
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/coop.rs:107:5
      tokio::runtime::coop::budget
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/coop.rs:73:5
      tokio::runtime::park::CachedParkThread::block_on
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/park.rs:282:31
  17: tokio::runtime::context::blocking::BlockingRegionGuard::block_on
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/context/blocking.rs:66:9
      tokio::runtime::handle::Handle::block_on::{{closure}}
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/handle.rs:310:13
      tokio::runtime::context::runtime::enter_runtime
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/context/runtime.rs:65:16
  18: tokio::runtime::handle::Handle::block_on
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/handle.rs:309:9
      sc_service::task_manager::SpawnTaskHandle::spawn_inner::{{closure}}
             at builds/parity/mirrors/polkadot-sdk/substrate/client/service/src/task_manager/mod.rs:204:13
      <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/blocking/task.rs:42:21
  19: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/task/core.rs:328:17
      tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/loom/std/unsafe_cell.rs:16:9
      tokio::runtime::task::core::Core<T,S>::poll
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/task/core.rs:317:13
  20: tokio::runtime::task::harness::poll_future::{{closure}}
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/task/harness.rs:485:19
      <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panic/unwind_safe.rs:272:9
      std::panicking::try::do_call
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:552:40
      std::panicking::try
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:516:19
      std::panic::catch_unwind
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panic.rs:142:14
      tokio::runtime::task::harness::poll_future
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/task/harness.rs:473:18
      tokio::runtime::task::harness::Harness<T,S>::poll_inner
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/task/harness.rs:208:27
             
             
             
             11:29:00.370	
      tokio::runtime::task::harness::Harness<T,S>::poll
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/task/harness.rs:153:15
  21: tokio::runtime::task::raw::RawTask::poll
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/task/raw.rs:200:18
      tokio::runtime::task::UnownedTask<S>::run
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/task/mod.rs:445:9
      tokio::runtime::blocking::pool::Task::run
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/blocking/pool.rs:159:9
      tokio::runtime::blocking::pool::Inner::run
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/blocking/pool.rs:513:17
  22: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}
             at usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.33.0/src/runtime/blocking/pool.rs:471:13
      std::sys_common::backtrace::__rust_begin_short_backtrace
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:154:18
  23: std::thread::Builder::spawn_unchecked_::{{closure}}::{{closure}}
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/thread/mod.rs:529:17
      <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panic/unwind_safe.rs:272:9
      std::panicking::try::do_call
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:552:40
      std::panicking::try
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:516:19
      std::panic::catch_unwind
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panic.rs:142:14
      std::thread::Builder::spawn_unchecked_::{{closure}}
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/thread/mod.rs:528:30
      core::ops::function::FnOnce::call_once{{vtable.shim}}
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ops/function.rs:250:5
  24: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/alloc/src/boxed.rs:2007:9
      <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/alloc/src/boxed.rs:2007:9
      std::sys::unix::thread::Thread::new::thread_start
             at rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys/unix/thread.rs:108:17
  25: start_thread
  26: clone


Thread 'tokio-runtime-worker' panicked at 'Scraped backing votes had invalid signature! candidate: 0xf38e680a8e4c17be39fb28e8c8cb4dc06d347c2295898d45873beba127402e1b, session: 9562, validator_public: Public(4a139a3d919f44ba4709d65c9bd5e1c3c9392b4572e7e1cb28c73bd7591ad51c (12g8NGLS...)), validator_index: 8', /builds/parity/mirrors/polkadot-sdk/polkadot/node/core/dispute-coordinator/src/initialized.rs:650

This is a bug. Please report it at:

	https://github.com/paritytech/polkadot-sdk/issues/new

Version: #2944

  • using default libp2p backend
  • using master of: 5ac32ee

Extra logs during the incident

2024-03-27 09:29:02.097 ERROR tokio-runtime-worker parachain::chain-selection: Call to `DetermineUndisputedChain` failed error=DetermineUndisputedChainCanceled(Canceled)

2024-03-27 09:29:02.096  WARN tokio-runtime-worker parachain::dispute-coordinator: Received msg before first active leaves update. This is not expected - message will be dropped. msg=DetermineUndisputedChain { base: (94832, 0x5b8a3bca37364e715050b64279bb9d386b6fa34bd96cb2ef0412a273e8701513), block_descriptions: [BlockDescription { block_hash: 0xb5a0f6f0010e340e67dd7f5f02776f91e449baf8f54497258178eaf62116579f, session: 9561, candidates: [] }, BlockDescription { block_hash: 0xd1e93dfaace4a50d85562b86b542a200768bc711bdc152ed07602676a220fade, session: 9561, candidates: [0xd9b906383db54b9fafcc2c7b8d7a6045796d02e995f90e743dc807f5aa3cc59c] }, BlockDescription { block_hash: 0x02515f1e0fbebf29bc689c86c264d11ffd85cc42ccccd1ebeb6e32855b41d653, session: 9561, candidates: [0x8db497b2c12f37c62cd33d25eb2b3de1941ec2d4a2d8d34467ad2a34e06f00e9] }], tx: Sender { complete: false } }

cc @paritytech/networking @alexggh

@tdimitrov
Copy link
Contributor

tdimitrov commented Mar 28, 2024

The backtrace shows that a "Valid" backing vote has got an invalid signature:

I'm not sure how this can happen. What were you testing? Anything off that can cause such behaviour?

@lexnv
Copy link
Contributor Author

lexnv commented Mar 28, 2024

I was testing #2944 with the default LibP2P network backend (not the LiteP2P).

There might be something off with the network format, but I doubt it is coming from that.

We may have missed these logs over time.

To discover it, I created a query that filters lines that contain panic. However, that is a tedious manual process for many logs, being constrained by a query window of 30 minutes otherwise we would time out.

@ordian
Copy link
Member

ordian commented Mar 28, 2024

Scraped backing votes had invalid signature!

Last time I remember this happening (and the only explanation I got for it) is when there was a dispute and a reversion to a block at least 2 sessions ago. The reason for this is that we cache SessionInfo which includes validator public keys per session index and if we do a past session reversion, technically, this caching is not sound: #633.

I wonder if this was the case here or something else? If yes, this is a known issue that is very unlikely to happen in practice (as it requires commiting to an invalid parachain block and risking getting 100% * 2 of validator stake slashed AND stalling finality for a couple of days/many hours on Polkadot/Kusama) and no need to worry about it.

@alexggh
Copy link
Contributor

alexggh commented Mar 28, 2024

I think @lexnv mentioned a lot of:

tokio-runtime-worker grandpa: Re-finalized block #0xbb4ccc7305db7673fbbe0fc2f0216fc4a3cab102438d2c005dde7ea6965bc060 (92148) in the canonical chain, current best finalized is #92148

So I guess it might be what you mentioned @ordian .

@burdges
Copy link

burdges commented Mar 29, 2024

We've discussed that problem lots, not sure if we ever had really great answers, but not participating in parachains when our session info differs maybe doable, assuming we can tell what session info others use.

We then risk attacks where attackers try breaking validator's sesision info, so that must be treated as a soundness risk, but we must do similar anyways. It's probably also broken if attackers can make them flood the system with disputes, and then do attacks while everything is limping along with no validators.

@bkchr
Copy link
Member

bkchr commented Apr 2, 2024

So we can close this issue?

@lexnv
Copy link
Contributor Author

lexnv commented Apr 5, 2024

This issue was most probably related to our on-spot test instances.
After switching to on-demand instances, the issue did not reproduce for a few days.

Closing this for now, thanks!

@lexnv lexnv closed this as completed Apr 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants