testing-preview and testing-sanchonet aggregators panic with FOREIGN KEY constraint failed error #2120

Closed
5 tasks done
jpraynaud opened this issue Nov 18, 2024 · 1 comment

jpraynaud commented Nov 18, 2024

Why

The testing-preview and testing-sanchonet networks are down with the following panic:

{"msg":">> open_signer_registration_round","v":0,"name":"mithril-aggregator","level":20,"time":"2024-11-17T15:02:28.980533893Z","hostname":"7e7d34d1d6cc","pid":1,"src":"AggregatorRunner","time_point":"TimePoint {\n    epoch: Epoch(\n        521,\n    ),\n    immutable_file_number: 10429,\n    chain_point: ChainPoint {\n        slot_number: SlotNumber(\n            45066688,\n        ),\n        block_number: BlockNumber(\n            2242439,\n        ),\n        block_hash: \"9dd84673cb69436be7db81ec55499f012d3a0d339f727ee8b558dccc4777cfea\",\n    },\n}"}
thread 'tokio-runtime-worker' panicked at /home/runner/work/mithril/mithril/internal/mithril-persistence/src/sqlite/cursor.rs:35:51:
FOREIGN KEY constraint failed (code 19)
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: <mithril_persistence::sqlite::cursor::EntityCursor<T> as core::iter::traits::iterator::Iterator>::next
   3: mithril_persistence::sqlite::connection_extensions::ConnectionExtensions::fetch_first
   4: <mithril_aggregator::database::repository::signer_registration_store::SignerRegistrationStore as mithril_aggregator::store::verification_key_store::VerificationKeyStorer>::prune_verification_keys::{{closure}}
   5: <mithril_aggregator::signer_registerer::MithrilSignerRegisterer as mithril_aggregator::signer_registerer::SignerRegistrationRoundOpener>::open_registration_round::{{closure}}
   6: <mithril_aggregator::runtime::runner::AggregatorRunner as mithril_aggregator::runtime::runner::AggregatorRunnerTrait>::open_signer_registration_round::{{closure}}
   7: mithril_aggregator::runtime::state_machine::AggregatorRuntime::cycle::{{closure}}
   8: mithril_aggregator::commands::serve_command::ServeCommand::execute::{{closure}}::{{closure}}
   9: tokio::runtime::task::core::Core<T,S>::poll
  10: tokio::runtime::task::harness::Harness<T,S>::poll
  11: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
  12: tokio::runtime::scheduler::multi_thread::worker::Context::run
  13: tokio::runtime::context::runtime::enter_runtime
  14: tokio::runtime::scheduler::multi_thread::worker::run
  15: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
  16: tokio::runtime::task::core::Core<T,S>::poll
  17: tokio::runtime::task::harness::Harness<T,S>::poll
  18: tokio::runtime::blocking::pool::Inner::run
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
{"msg":"shutting down HTTP server after receiving signal","v":0,"name":"mithril-aggregator","level":40,"time":"2024-11-17T15:02:28.983355811Z","hostname":"7e7d34d1d6cc","pid":1,"src":"MetricsServer"}
Error: task 14 panicked with message "FOREIGN KEY constraint failed (code 19)"

Stack backtrace:
   0: anyhow::error::<impl core::convert::From<E> for anyhow::Error>::from
   1: mithril_aggregator::commands::serve_command::ServeCommand::execute::{{closure}}
   2: tokio::runtime::park::CachedParkThread::block_on
   3: mithril_aggregator::main
   4: std::sys::backtrace::__rust_begin_short_backtrace
   5: std::rt::lang_start::{{closure}}
   6: std::rt::lang_start_internal
   7: main
   8: __libc_start_main
   9: _start

What

Investigate and fix the problem that prevents the aggregator from starting in testing-preview and testing-sanchonet.

How

  • Investigate the issue
  • Fix the issue
  • Re-genesis the test networks with the genesis bootstrap command (1 epoch after the fix is applied):
    • testing-preview
    • testing-sanchonet
@jpraynaud jpraynaud added the bug ⚠️ Something isn't working label Nov 18, 2024
jpraynaud (Member, Author) commented

Following a problem with #1957, we performed a manual operation on the database with sqlite3.
We deleted some open messages, but the delete cascade was not executed because foreign key enforcement, which cascades depend on, is not enabled by default in this tool.

Pruning the verification keys then failed: some single signatures associated with these open messages had not been deleted, and they caused the constraint error that triggered the panic of the node.

We have manually deleted the remaining single signatures and the aggregators of testing-preview and testing-sanchonet have resumed successfully.
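
For readers who want to reproduce the mechanism, here is a minimal sketch in Python using the standard library sqlite3 module. The schema is a hypothetical simplification: the table and column names, and the assumption that single signatures reference both an open message (with ON DELETE CASCADE) and a signer registration (without cascade), are illustrative and not taken from the actual mithril-aggregator schema. Enforcement is switched off explicitly to mimic a session where foreign keys were never enabled.

```python
import sqlite3

# Hypothetical, simplified schema: names are illustrative only and do not
# claim to match the real mithril-aggregator tables.
SCHEMA = """
CREATE TABLE signer_registration (signer_id TEXT PRIMARY KEY);
CREATE TABLE open_message (open_message_id TEXT PRIMARY KEY);
CREATE TABLE single_signature (
    open_message_id TEXT NOT NULL
        REFERENCES open_message(open_message_id) ON DELETE CASCADE,
    signer_id TEXT NOT NULL
        REFERENCES signer_registration(signer_id)
);
INSERT INTO signer_registration VALUES ('signer-1');
INSERT INTO open_message VALUES ('msg-1');
INSERT INTO single_signature VALUES ('msg-1', 'signer-1');
"""


def count(conn, table):
    """Return the number of rows in one of the fixed demo tables."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def reproduce():
    # Autocommit mode: PRAGMA foreign_keys is ignored inside a transaction.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript(SCHEMA)

    # 1. Manual deletion with enforcement off: the cascade does not run,
    #    so the single signature is left behind as an orphan.
    conn.execute("PRAGMA foreign_keys = OFF")
    conn.execute("DELETE FROM open_message WHERE open_message_id = 'msg-1'")
    orphans = count(conn, "single_signature")

    # 2. The application enforces foreign keys: pruning the registrations
    #    is rejected because the orphan still references one of them.
    conn.execute("PRAGMA foreign_keys = ON")
    prune_error = None
    try:
        conn.execute("DELETE FROM signer_registration")
    except sqlite3.IntegrityError as err:
        prune_error = str(err)

    # 3. Manual cleanup of the orphans, after which pruning succeeds.
    conn.execute(
        "DELETE FROM single_signature WHERE open_message_id NOT IN "
        "(SELECT open_message_id FROM open_message)"
    )
    conn.execute("DELETE FROM signer_registration")
    remaining = count(conn, "signer_registration")
    conn.close()
    return orphans, prune_error, remaining


if __name__ == "__main__":
    print(reproduce())
```

In an interactive sqlite3 session, running `PRAGMA foreign_keys = ON;` before the manual delete should let the cascade remove the dependent rows in the same statement.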

@jpraynaud jpraynaud self-assigned this Nov 18, 2024