-
Notifications
You must be signed in to change notification settings - Fork 964
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(swarm): gracefully disable oneshot handler on dial upgrade errors #3577
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
swarm/src/handler/one_shot.rs
Outdated
@@ -213,7 +213,9 @@ where | |||
} | |||
ConnectionEvent::DialUpgradeError(DialUpgradeError { error, .. }) => { | |||
if self.pending_error.is_none() { | |||
log::debug!("DialUpgradeError: {error:?}"); | |||
self.pending_error = Some(error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should remove this, otherwise we will still close the connection later!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pardon me, remove self.pending_error = Some(error)
?
I am not sure what implications are, so I kept the changes aligned with the approach outlined in the respective issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anything within pending_error
will be returned via ProtocolsHandlerEvent::Close
which effectively closes the connection. In this case, we don't want that but set keep_alive
to No
which, if not other handler needs the connection will result in closing it.
swarm/src/handler/one_shot.rs
Outdated
@@ -213,7 +213,9 @@ where | |||
} | |||
ConnectionEvent::DialUpgradeError(DialUpgradeError { error, .. }) => { | |||
if self.pending_error.is_none() { | |||
log::debug!("DialUpgradeError: {error:?}"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log::debug!("DialUpgradeError: {error:?}"); | |
log::debug!("DialUpgradeError: {error}"); |
Errors can be display formatted but at the moment, that does not necessarily print the entire chain of errors.
@mxinden Should we depend on anyhow
internally here to print all errors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a function print_error_chain
in swarm/src/lib.rs
. I could use that one. Is this an option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to use that one yeah as we are within the same crate!
1ba4e1a
to
1060619
Compare
1060619
to
ad03390
Compare
swarm/src/handler.rs
Outdated
@@ -467,7 +467,7 @@ where | |||
ConnectionHandlerUpgrErr::Timer => { | |||
write!(f, "Timer error while opening a substream") | |||
} | |||
ConnectionHandlerUpgrErr::Upgrade(err) => write!(f, "{err}"), | |||
err @ ConnectionHandlerUpgrErr::Upgrade(_) => crate::print_error_chain(f, err), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is what we want, instead, this should probably just print "upgrade error" without printing the inner error. That is what print_error_chain
then does because the inner error is returned from source
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you use print_error_chain
here, we might end up with duplicate printing if somebody uses print_error_chain
on ConnectionHandlerUpgrErr
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right.
Please see my other comment.
It'd be possible to mimic impl fmt::Display for DialError
.
6ec12e4
to
8b4c290
Compare
One test is failing but this is due some docker issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Yes that is unfortunately a flaky test. Can you add a small changelog entry to |
Is it really necessary? |
I'd argue that this one is definitely a change in behaviour that is observable.
Feel free to copy the PR title (modulo formatting adjustments)! |
done |
This pull request has merge conflicts. Could you please resolve them @vnermolaev? 🙏 |
7f6fc65
to
2e468f2
Compare
Thank you! I've opened #3591 as a spiritual follow-up because we currently have this "problem" in many crates actually. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the work here. @thomaseizinger or @vnermolaev can either of you address the comments below?
I hope that this is not a breaking change. In other words, I hope that no downstream user depends on the fact that the OneShotHandler
closes a connection if a remote does not support its protocol.
@@ -90,6 +90,8 @@ | |||
- Deprecate methods `Swarm::with_executor`, `Swarm::with_*_executor`, `Swarm::without_executor`. | |||
Introduce similar methods in `SwarmBuilder`. See [PR 3588]. | |||
|
|||
- Gracefully disable oneshot handler on dial upgrade errors. See [PR 3577]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Under the assumption that this is not a breaking change, this pull request should have bumped libp2p-swarm
to v0.42.1
and this changelog entry should have been under the v0.42.1
section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how to address this because it's been merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can just fix it in a new PR.
@@ -213,7 +213,8 @@ where | |||
} | |||
ConnectionEvent::DialUpgradeError(DialUpgradeError { error, .. }) => { | |||
if self.pending_error.is_none() { | |||
self.pending_error = Some(error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell, this was the only occurrence of setting self.pending_error
. In other words, I don't think pending_error
is used anywhere after this change. Is that correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost; there is one more place
if let Some(err) = self.pending_error.take() {
return Poll::Ready(ConnectionHandlerEvent::Close(err));
}
The resolution of this usage is related to #3591.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But if it is never set, then this is dead code so we can delete it straight-away!
Sorry for missing that @mxinden.
I'd be very surprised if it breaks something. I consider this a bugfix to be honest. |
Description
Resolves #3269.
Notes
This PR implement changes discussed in #3269
Links to any relevant issues
#3269
Change checklist
The changes are trivial. Not sure whether the usual checklist applies.