-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow Tokio Console support #517
Allow Tokio Console support #517
Conversation
74c01ca
to
16cc8e1
Compare
add `tracing` feature of tokio, and the console-subscriber
add necessary logs; add timeout in `read_message_from_stellar` because it gets stuck
db588a5
to
87ad4d5
Compare
@@ -7,7 +7,7 @@ edition = "2021" | |||
[dependencies] | |||
clap = { version = "4.0.17", features = ["derive"]} | |||
hex = "0.4.3" | |||
tokio = { version = "1.8", features = ["rt-multi-thread", "macros", "time"] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
latest tokio vresion is 1.37.
@@ -31,7 +31,7 @@ log = "0.4.0" | |||
serde = { version = "1.0.136", features = ["derive"] } | |||
serde_json = "1.0.71" | |||
thiserror = "1.0" | |||
tokio = { version = "1.0", features = ["full"] } | |||
tokio = { version = "1.37", features = ["full"] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for this is the "lost its waker" issue in tokio, fixed in the newer versions.
@@ -16,7 +16,7 @@ wallet = { path = "../wallet", features = ["testing-utils"] } | |||
|
|||
[dependencies] | |||
hex = "0.4.3" | |||
log = {version = "0.4.14"} | |||
tracing = { version = "0.1", features = ["log"] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
helps with tokio-console and instrumentation:
the console-subscriber crate in this repository contains an implementation of the instrumentation-side API as a tracing-subscriber Layer, for projects using Tokio and tracing.
.cargo/config.toml
Outdated
@@ -0,0 +1,2 @@ | |||
[build] | |||
rustflags = ["--cfg", "tokio_unstable"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to make tokio-console
work::
in order to collect task data from Tokio, the tokio_unstable cfg must be enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the Rust stable version, there hasn't been any problems. The tokio-console
command displays just fine, although I was not able to see the histogram.
The worst it could do would be returning a blank view.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't work for me, unfortunately I have to explicitly pass the RUSTFLAGS="--cfg tokio_unstable"
when trying to run or build the client.
I assume it's because I do have a ~/.cargo/config.toml
file where I already specify some rust flags
[target.aarch64-apple-darwin]
rustflags = ["-C", "link-arg=-fuse-ld=/opt/homebrew/opt/llvm/bin/ld64.lld"]
and this might override the workspace configuration.
I confirmed this by removing my 'global' config.toml and then recompiling. This worked.
Generally I would prefer if we could enable this tokio-console
thing with an extra feature flag and have it disabled by default. Do you think this would be possible @b-yap?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See e77d7ac
I tried to set it through [env]
.cargo/config.toml, and it didn't work. All RUSTFLAGS
, CARGO_ENCODED_RUSTFLAGS
and CARGO_BUILD_FLAGS
won't work.
I also tried to do it through a build script and didn't work:
fn main(){
cfg_if::cfg_if! {
if #[cfg(feature = "allow-debugger")] {
println!("cargo:rustc-env=RUSTFLAGS=\"--cfg tokio_unstable\"");
}
}
}
and
println!("cargo:rustc-flags=\"--cfg tokio_unstable\"");
as explained in rust-lang/cargo#10141.
In the end I updated the doc to tell the user to set the rustflags themselves.
"verify_auth(): remote sequence: {}, auth message sequence: {}", | ||
remote_info.sequence(), | ||
auth_msg.sequence | ||
); | ||
|
||
let auth_msg_xdr = auth_msg.to_base64_xdr(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided to remove this because the xdr was huge and may not be helpful at all.
other => { | ||
log::trace!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are only interested on messages that will be processed by the vault.
Logging should be done over there than here.
@@ -147,6 +147,7 @@ async fn start() -> Result<(), ServiceError<Error>> { | |||
|
|||
#[tokio::main] | |||
async fn main() { | |||
console_subscriber::init(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add documentation of the `tokio-console`.
@@ -679,15 +682,6 @@ impl VaultService { | |||
|
|||
tasks.append(&mut replace_tasks); | |||
|
|||
tasks.push(( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😱 removing this duplicate, as it is also found in
spacewalk/clients/vault/src/system.rs
Lines 623 to 630 in e7a672b
( | |
"Parachain Block Listener", | |
run(active_block_listener( | |
self.spacewalk_parachain.clone(), | |
issue_event_tx, | |
replace_event_tx, | |
)), | |
), |
inside
fn create_initial_tasks(...
Originally it runs after Replace Cancellation Scheduler
and before Redeem Request Listener
:
spacewalk/clients/vault/src/system.rs
Lines 619 to 639 in 31df183
), | |
( | |
"Parachain Block Listener", | |
run(active_block_listener( | |
self.spacewalk_parachain.clone(), | |
issue_event_tx.clone(), | |
replace_event_tx.clone(), | |
)), | |
), | |
( | |
"Redeem Request Listener", | |
run(listen_for_redeem_requests( | |
self.shutdown.clone(), | |
self.spacewalk_parachain.clone(), | |
self.vault_id_manager.clone(), | |
self.config.payment_margin_minutes, | |
oracle_agent.clone(), | |
)), | |
), | |
( | |
"Bridge Metrics Listener", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks good to me overall, nice job and great feature @b-yap 🚀
I left some comments.
clients/stellar-relay-lib/src/connection/connector/message_sender.rs
Outdated
Show resolved
Hide resolved
clients/stellar-relay-lib/src/connection/connector/message_sender.rs
Outdated
Show resolved
Hide resolved
.cargo/config.toml
Outdated
@@ -0,0 +1,2 @@ | |||
[build] | |||
rustflags = ["--cfg", "tokio_unstable"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't work for me, unfortunately I have to explicitly pass the RUSTFLAGS="--cfg tokio_unstable"
when trying to run or build the client.
I assume it's because I do have a ~/.cargo/config.toml
file where I already specify some rust flags
[target.aarch64-apple-darwin]
rustflags = ["-C", "link-arg=-fuse-ld=/opt/homebrew/opt/llvm/bin/ld64.lld"]
and this might override the workspace configuration.
I confirmed this by removing my 'global' config.toml and then recompiling. This worked.
Generally I would prefer if we could enable this tokio-console
thing with an extra feature flag and have it disabled by default. Do you think this would be possible @b-yap?
…ob/24994504193?pr=517
…9314/job/24998905803?pr=517
…ob/25003758078?pr=517
…ob/25007476694?pr=517
…ob/25040418069?pr=517
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested it locally again, can confirm it works with the new feature flag.
Great changes, thanks a lot @b-yap 👌
@ebma Would be great if we have the vaults running with this feature. I bet it'll look very different (and interesting). |
True, but I'm not sure if we can easily connect to the clients with @zoveress in this PR we added support for a profiling tool that lets us investigate resource consumption of tasks within our process. As you can see here, it's fairly simple to run. As far as I understand, the changes we made to the client will make it expose a new endpoint (by default) on port 6669 that is used by the This is not super important but would be nice to have. |
Is the endpoint password secured? |
No it's not and I think it's only supposed to be used locally. Can we expose it to the GitLab runner only and access it from there somehow? |
This does not completely close #512, but it helps with tracking.
But if we can see which threads have been running the longest; which ones are idle for a long time, it could help determine where/when a vault gets stuck.
The
tokio-console
is described as:^ From using tokio-console, a zombie task was found.
When the vault restarts, it shuts down all running tasks except for 1:
spacewalk/clients/wallet/src/resubmissions.rs
Lines 40 to 47 in e7a672b
Aside from this, I found out that the stream also gets stuck:
spacewalk/clients/stellar-relay-lib/src/connection/connector/message_reader.rs
Line 82 in e7a672b
Requires a timeout to trigger reconnection.
How to begin the review:
I have added comments to help with the review.
tracing
feature oftokio
is necessary fortokio-console
console-subscriber
dependency is necessary fortokio-console
StellarWallet
:fn stop_periodic_resubmission_of_transactions(&mut self)
fn start_periodic_resubmission_of_transactions_from_cache(...)
:StellarWallet
'sresubmission_end_signal
field.fn stop_periodic_resubmission_of_transactions(...)
from thewallet
field ofVaultService
:fn shutdown_wallet(...)
when the service stops.READ_TIMEOUT_IN_SECS
; how long to wait on reading from the stream.tokio-console