-
Notifications
You must be signed in to change notification settings - Fork 117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent PeerSet service to be buffered #1718
Conversation
zebra-network/src/peer_set/set.rs
Outdated
|
||
// Get the type of a variable as a string | ||
use std::any::type_name; | ||
fn type_of<T>(_: T) -> &'static str { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will probably want to put this somewhere else if we end up keeping it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fn type_of<T>(_: T) -> &'static str { | |
// replace with type_name_of_val when it stabilises | |
// https://github.com/rust-lang/rust/issues/66359 | |
fn type_of<T>(_: T) -> &'static str { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yaahc might have an idea where we want to move this, but it's ok for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I think we can delete it and just check Self
, see: https://github.com/ZcashFoundation/zebra/pull/1718/files#r573338707
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good just a few minor tweaks
zebra-network/src/peer_set/set.rs
Outdated
|
||
// `PeerSet` calls `poll_ready` multiple times on its `Client` services. | ||
// `Buffer`ed polls can fill up their Buffer reservations, causing hangs. | ||
// The next `assert` will make sure the service type is never `Buffer`. #1593 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can move part of the explanation into an assert message:
// The next `assert` will make sure the service type is never `Buffer`. #1593 | |
// See #1593 for details. |
zebra-network/src/peer_set/set.rs
Outdated
// `PeerSet` calls `poll_ready` multiple times on its `Client` services. | ||
// `Buffer`ed polls can fill up their Buffer reservations, causing hangs. | ||
// The next `assert` will make sure the service type is never `Buffer`. #1593 | ||
assert!(!format!("{:?}", type_of(&service)).contains("Buffer")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert!(!format!("{:?}", type_of(&service)).contains("Buffer")); | |
assert!(!format!("{:?}", type_of(&service)).contains("Buffer"), | |
"Clients must not use tower::Buffer, because PeerSet uses poll_ready multiple times before each call, which can cause hangs with buffers" ); |
zebra-network/src/peer_set/set.rs
Outdated
@@ -423,6 +423,12 @@ where | |||
.ready_services | |||
.get_index_mut(index) | |||
.expect("preselected index must be valid"); | |||
|
|||
// `PeerSet` calls `poll_ready` multiple times on its `Client` services. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now in the assert message:
// `PeerSet` calls `poll_ready` multiple times on its `Client` services. |
zebra-network/src/peer_set/set.rs
Outdated
|
||
// Get the type of a variable as a string | ||
use std::any::type_name; | ||
fn type_of<T>(_: T) -> &'static str { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yaahc might have an idea where we want to move this, but it's ok for now.
Co-authored-by: teor <teor@riseup.net>
zebra-network/src/peer_set/set.rs
Outdated
|
||
// `Buffer`ed polls can fill up their Buffer reservations, causing hangs. | ||
// See #1593 for details. | ||
assert!(!format!("{:?}", type_of(&service)).contains("Buffer"), | ||
"Clients must not use tower::Buffer, because PeerSet uses | ||
poll_ready multiple times before each call, which can cause hangs with buffers"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The peer set type is static, so we only need to check it once.
Let's move it to the start of PeerSet::new
, and do the checks on Self
, so we don't have to use type_of
:
// `Buffer`ed polls can fill up their Buffer reservations, causing hangs. | |
// See #1593 for details. | |
assert!(!format!("{:?}", type_of(&service)).contains("Buffer"), | |
"Clients must not use tower::Buffer, because PeerSet uses | |
poll_ready multiple times before each call, which can cause hangs with buffers"); | |
// Using `poll_ready` without `call` fills up `Buffer` reservations, causing hangs. | |
// See #1593 for details. | |
assert!(!type_name::<Self>().contains("Buffer"), | |
"Clients must not use tower::Buffer, because PeerSet uses | |
`poll_ready` multiple times before each `call`, which causes buffer hangs"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Self
(title case) is the current type:
https://doc.rust-lang.org/std/keyword.Self.html
self
(lower case) is the current object:
https://doc.rust-lang.org/std/keyword.self.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont think this will work. I added dbg!(type_name::<Self>());
as the first line in new()
and here is the type i am getting (it contains the "Buffer" string):
[zebra-network/src/peer_set/set.rs:126] type_name::<Self>() = "zebra_network::peer_set::set::PeerSet<tower::load::peak_ewma::PeakEwmaDiscover<futures_util::stream::stream::filter::Filter<futures_channel::mpsc::Receiver<core::result::Result<tower::discover::Change<std::net::addr::SocketAddr, zebra_network::peer::client::Client>, alloc::boxed::Box<dyn std::error::Error+core::marker::Sync+core::marker::Send>>>, futures_util::future::ready::Ready<bool>, zebra_network::peer_set::initialize::init<tower::load_shed::LoadShed<tower::buffer::service::Buffer<zebrad::components::inbound::Inbound, zebra_network::protocol::internal::request::Request>>>::{{closure}}::{{closure}}>>>"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, we don't need to check the outer PeerSet
type, we need to check the inner D::Service
type.
The previous code did by check using the service
returned by:
let (key, service) = self
.ready_services
.get_index_mut(index)
.expect("preselected index must be valid");
So we should be able to do:
// `Buffer`ed polls can fill up their Buffer reservations, causing hangs. | |
// See #1593 for details. | |
assert!(!format!("{:?}", type_of(&service)).contains("Buffer"), | |
"Clients must not use tower::Buffer, because PeerSet uses | |
poll_ready multiple times before each call, which can cause hangs with buffers"); | |
// Using `poll_ready` without `call` fills up `Buffer` reservations, causing hangs. | |
// See #1593 for details. | |
assert!(!type_name::<D::Service>().contains("Buffer"), | |
"Clients must not use tower::Buffer, because PeerSet uses `poll_ready` multiple times before each `call`, which causes buffer hangs"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type of ready_services
is...
zebra/zebra-network/src/peer_set/set.rs
Line 89 in 1569824
ready_services: IndexMap<D::Key, D::Service>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really love the idea of using type_name
comparisons for this assertion, it seems like an indirect way to enforce the invariant we care about, and I'm worried about it breaking because the language makes no guarantees about the stability of type names.
The returned string must not be considered to be a unique identifier of a type as multiple types may map to the same type name. Similarly, there is no guarantee that all parts of a type will appear in the returned string: for example, lifetime specifiers are currently not included. In addition, the output may change between versions of the compiler.
I am still thinking of alternatives but to start I think one option that may be better is to use edit: this won't work due to parametricityClone
as a proxy for Buffer
, since Buffer
/Batch
are the only types that should introduce "clone-ness" to Services. We'd then use static_assertions
to assert that D::Service
does not implement Clone
. Now, I'm not super happy with this solution either, since it is similarly indirect, but I'm more confident in its reliability, and it doesn't introduce any runtime checks.
Lets hold off on merging this, I'm still figuring out the details but I'm talking to @hawkw right now and I think we're going to fix the issue directly so its no longer possible to misuse Buffer in a way that exhausts all the reservations, then this assertion will be unnecessary. |
Sounds good |
I've marked this PR as draft pending the |
Is the fix you are referring to the one for issue #1671 ? |
No. #1671 fixes The fixes I am referring to were made in the Since |
This PR belongs in sprint 3 because we actually did the work. |
Motivation
Multiple calls to
poll_ready()
to thePeerSet
service can cause fill ups and hangs if they are buffered.#1697
Solution
The solution of this PR is to get the name of the service type as a string and check if that contains the word "Buffer" at runtime.
This uses https://doc.rust-lang.org/std/any/fn.type_name.html and by the notes it can fail at some edge cases.
Current and accepted type for the service is:
&mut tower::load::peak_ewma::PeakEwma<zebra_network::peer::client::Client
I manually tested this by changing the string to search from
Buffer
toPeakEwma
, then started zebra and got the panic:The code in this pull request has:
Review
@teor2345 when they get the time.
Related Issues
If the solution is good enough to be merged this will close #1697
Follow Up Work
Maybe do a test case in a separated issue.