-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add fully-reproducible online tracer for banking #29196
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a short look this morning, leaving some initial comments. Still need to make it through the remaining 90%.
Side note: from the looks of it, I'm guessing you and I will have a lot of conflicts in the near future as I tried splitting up BankingStage into different sub-components (DecisionMaking, Consuming, Forwarding, Committing). As I post PRs for those I'll be sure to include you so we can make coordinating easier.
appreciate quick feedback! however, please remind that I'm not forcing you a weekend code reviews.. it's just that I'm working 24/7 whenever i find some spare time. ;)
got it. although there will be some conflicts, i don't think this tracer thing is duplicating your works. if wanted, i can bite the intense git merge conflict session. :) |
92e59ea
to
23aadb1
Compare
ca94c60
to
db4732c
Compare
@apfitzge i haven't fully replied to your all comments. however, could you look again at your convenient time as i pushed numerous commits to address them? also, i put bunch of my 2 cents. ;) |
1482637
to
b76ed3d
Compare
536: Mention "Bytes" on README.md and lib.rs r=jonasbb a=ryoqun I think #277 can be more promoted. :) As I wrote the current status of serialization of const generics like bytes (solana-labs/solana#29196 (comment)), i think `"Bytes"` is best solution so far. `@jonasbb` hey, this functionary saved us. however it took a while to spot this. I think `Bytes` can be promoted. :) Co-authored-by: Ryo Onodera <ryoqun@gmail.com>
Really cool idea, I actually though we might get a similar trace by just capturing the quic network stream on a switch and stream copy it to another machine for recording. Having this as a software feature is of course way more accessible |
Also has an added benefit of somewhat isolating it from sigverify time, as we get the time packets enter banking stage after being verified 😄 |
bb256bc
to
ff05cb8
Compare
ff05cb8
to
0d2e227
Compare
4a5c9a5
to
3b25734
Compare
3b25734
to
a77b6e3
Compare
14ec1d1
to
54042b2
Compare
#[error("Integer Cast Error: {0}")] | ||
IntegerCastError(#[from] std::num::TryFromIntError), | ||
|
||
#[error("Trace directory's byte limit is too small (must be larger than {1}): {0}")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm - happy with how this ended up and thanks for all the hard work @ryoqun. Love the commit links you leave for resolving each conversation
Pull request has been modified.
947701c
to
84e9520
Compare
Problem
It's hard to reproduce/examine past banking-related performance problems which are observed only at mainnet-beta.
The underlying problem is that it's not easy to create a similar load on test/dev environments. Worse, we also don't have good visibility of actual load due to limited set of metrics, which are indirect by nature and aggregated in ad-hoc definitions.
This is especially so for banking, say unlike replay performance. That's because any discarded packets are unavailable afterwards like later investigating/benchmarking.
Lastly, the performance profile of banking isn't linear by any means as well. There's various factors affecting its throughput (account access pattern, qos, resource contentions across banking threads...)
Summary of Changes
So, let's get to bottom of this.
Namely, start to collect all of REAL trace data fed into BankingStage on production environment.
Once provided from validator-node operators, we will be able to consume these trace data later with
solana-ledger-tool simulate-leader-blocks
(not included in this pr)The obvious concerns is of any overhead. So, background trace writer thread is introduced and its work is rather simple: just writing packet blobs. Casual testing shows 1GB/s write throughput is possible.
On the other hand, any critical-code path's overhead will be creating/cloning Arcs and sending
PacketBatch
es off over crossbeam_channel. Specifically, this results in additional 100ns perVec<PacketBatch>
, which should be negligible.Also, the trace file will be discarded 14days old or more than 14G is written. So, ram/disk usage is capped.
Hard to imagine 10G link (if employed) is fully saturated under quic, even if all of three banking channels (non-vote, tpu-vote, gossip-vote) are combined.
So, i think attack are correctly mitigated as well.
Currently, banking tracing is opt-in. but planned to be promoted to opt-out.
along side the obvious packets tracing, banking start and end events are recorded as well.
Also, backport to mb is desired. the pr is prepared in that mind (I'll spin off a preparatory cleanup pr...).
Unlike all of this other draft my prs, this pr is ready for review (except the diff size, warranting the cleanup pr....) and I've polished it up to the extreme.
spin-off pr statuses/tasks
serde_with
as a gratitude: Mention "Bytes" on README.md and lib.rs jonasbb/serde_with#536) => released https://crates.io/crates/serde_with/2.2.0chrono
: Bump chrono to v0.4.23 #29709ReplayVoteSender
: Rename remaining ones to replay_vote_{sender,receiver} #29716