Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Config reload results in panic #1885

Closed
5 tasks
adizere opened this issue Feb 17, 2022 · 1 comment · Fixed by #1894
Closed
5 tasks

Config reload results in panic #1885

adizere opened this issue Feb 17, 2022 · 1 comment · Fixed by #1894
Assignees
Labels
I: CLI Internal: related to the relayer's CLI I: configuration Internal: related to Hermes configuration
Milestone

Comments

@adizere
Copy link
Member

adizere commented Feb 17, 2022

Summary of Bug

It seems that the configuration reload functionality in Hermes does not work. If triggered, it results into a panic and the process quits.

Version

hermes 0.11.1+5ddec6d2

Steps to Reproduce

  1. do hermes start, make sure you have at least 2 chains
    • in my setup, a few tasks related to client refresh & misbehavior detection are starting up (see logs below)
    • let hermes start up completely, until " Hermes has started" log appears
  2. in a separate terminal, do 'kill -SIGHUP pgrep hermes'
    • no need to actually change anything in the configuration file to provoke the panic
2022-02-17T10:01:36.162478Z  INFO ThreadId(01) Hermes has started
2022-02-17T10:01:50.999784Z  INFO ThreadId(1123) reloading configuration (triggered by SIGHUP)
2022-02-17T10:01:51.002623Z  INFO ThreadId(1123) configuration successfully reloaded
2022-02-17T10:01:51.275033Z  INFO ThreadId(1122) cmd: updating existing chain chain=ibc-0
2022-02-17T10:01:51.275410Z  INFO ThreadId(1122) cmd: removing existing chain chain=ibc-0
2022-02-17T10:01:51.275808Z DEBUG ThreadId(1122) cmd: shutting down workers chain=ibc-0
2022-02-17T10:01:51.674491Z DEBUG ThreadId(51) DetectMisbehaviorWorker{client=07-tendermint-0 src_chain=ibc-1 dst_chain=ibc-0}: task terminated
2022-02-17T10:01:52.289031Z DEBUG ThreadId(20) refresh{client=07-tendermint-0 src_chain=ibc-1 dst_chain=ibc-0}: task terminated
2022-02-17T10:01:52.480667Z DEBUG ThreadId(269) DetectMisbehaviorWorker{client=07-tendermint-1 src_chain=ibc-1 dst_chain=ibc-0}: task terminated
2022-02-17T10:01:52.759909Z DEBUG ThreadId(52) refresh{client=07-tendermint-1 src_chain=ibc-1 dst_chain=ibc-0}: task terminated
2022-02-17T10:01:52.822388Z DEBUG ThreadId(577) DetectMisbehaviorWorker{client=07-tendermint-2 src_chain=ibc-1 dst_chain=ibc-0}: task terminated
2022-02-17T10:01:53.010916Z DEBUG ThreadId(270) refresh{client=07-tendermint-2 src_chain=ibc-1 dst_chain=ibc-0}: task terminated
2022-02-17T10:01:53.088293Z DEBUG ThreadId(591) DetectMisbehaviorWorker{client=07-tendermint-0 src_chain=ibc-0 dst_chain=ibc-1}: task terminated
2022-02-17T10:01:53.472498Z DEBUG ThreadId(578) refresh{client=07-tendermint-0 src_chain=ibc-0 dst_chain=ibc-1}: task terminated
2022-02-17T10:01:53.604438Z DEBUG ThreadId(827) DetectMisbehaviorWorker{client=07-tendermint-1 src_chain=ibc-0 dst_chain=ibc-1}: task terminated
2022-02-17T10:01:53.807629Z DEBUG ThreadId(592) refresh{client=07-tendermint-1 src_chain=ibc-0 dst_chain=ibc-1}: task terminated
2022-02-17T10:01:54.219261Z DEBUG ThreadId(1117) DetectMisbehaviorWorker{client=07-tendermint-2 src_chain=ibc-0 dst_chain=ibc-1}: task terminated
2022-02-17T10:01:54.242528Z DEBUG ThreadId(828) refresh{client=07-tendermint-2 src_chain=ibc-0 dst_chain=ibc-1}: task terminated
2022-02-17T10:01:54.242981Z DEBUG ThreadId(1122) cmd: shutting down chain runtime chain=ibc-0
2022-02-17T10:01:54.243718Z  INFO ThreadId(1122) cmd: adding new chain chain=ibc-0
2022-02-17T10:01:54.243973Z DEBUG ThreadId(1122) cmd: spawning chain runtime chain=ibc-0
thread '<unnamed>' panicked at 'rwlock read lock would result in deadlock', /rustc/02072b482a8b5357f7fb5e5637444ae30e423c40/library/std/src/sys/unix/rwlock.rs:49:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread '<unnamed>' panicked at 'poisoned lock: PoisonError { .. }', /Users/adi/Hammers/ibc-rs/relayer/src/util/lock.rs:30:21

Acceptance Criteria

Unclear. I am leaning towards completely disabling/deprecating the config reload functionality. The reason is that it's not used at the moment, and maintaining it has a certain cost & not convinced it's worth it. It seems that users have insufficient confidence in either the full node software of Hermes to be relying on this "soft" reload functionality, and instead most of them kill the Hermes process (often for the full node also) via their operational tools (eg kubernetes) to restart it.

I think we should re-asses if the feature is used anywhere and potentially re-enable it down the line. At the moment, it may be best to:

  • disable the config reload feature,
  • throw a warning upon catching the SIGHUP signal perhaps
  • update the guide to mention that this feature is deprecated (as of v0.12.1) and pending reconsideration

For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate milestone (priority) applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@adizere adizere added I: CLI Internal: related to the relayer's CLI I: configuration Internal: related to Hermes configuration P-high labels Feb 17, 2022
@adizere adizere added this to the v0.12.1 milestone Feb 17, 2022
@romac
Copy link
Member

romac commented Feb 17, 2022

Agreed that we can now get rid of that feature since we've improved startup time substantially over the past few releases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I: CLI Internal: related to the relayer's CLI I: configuration Internal: related to Hermes configuration
Projects
No open projects
Status: Closed
Development

Successfully merging a pull request may close this issue.

3 participants