Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ping: Make ping fault tolerant wrt outbound substreams races #133

Merged
merged 1 commit into from
May 30, 2024

Conversation

lexnv
Copy link
Collaborator

@lexnv lexnv commented May 29, 2024

We have discovered a race between the TransportService and Protocol interface.

This issue was first seen on the identify protocol:

After looking a bit closely into it, indeed there might be a chance for the following to happen:

  • T0: Transport service sends TransportEvent::ConnectionEstablished and TransportEvent::SubstreamOpened
  • T1: The peer disconnects
  • T2: Ping protocol receives an TransportEvent::ConnectionEstablished event for a disconnected peer
  • T3: Ping::on_connection_established fails to open a substream and pending_open is never populated
  • T4: Ping protocol receives an TransportEvent::SubstreamOpened event which would have panicked litep2p

This PR replaces the todo! in case of the race with a warning, and we'll no longer panic.

It is still good to capture those as warnings for visibility, even if there's no implied side-effect on our code.

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
@lexnv lexnv changed the title ping: Make ping fault tolerant wrt outbound substreams ping: Make ping fault tolerant wrt outbound substreams races May 29, 2024
Copy link
Collaborator

@dmitry-markin dmitry-markin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! Definitely logging is better then panicking here.

@lexnv lexnv merged commit ec31635 into master May 30, 2024
8 checks passed
@lexnv lexnv deleted the lexnv/ping-fault-tolerant branch May 30, 2024 10:19
dmitry-markin added a commit that referenced this pull request Jun 14, 2024
## [0.6.0] - 2024-06-14    
    
This release introduces breaking changes into `kad` module. The API has
been extended as following:
    
- An event `KademliaEvent::IncomingRecord` has been added.    
- New methods `KademliaHandle::store_record()` /
`KademliaHandle::try_store_record()` have been introduced.
    
This allows implementing manual incoming DHT record validation by
configuring `Kademlia` with `IncomingRecordValidationMode::Manual`.
    
Also, it is now possible to enable `TCP_NODELAY` on sockets.    
    
Multiple refactorings to remove the code duplications and improve the
implementation robustness have been done.
    
### Added    
    
- Support manual DHT record insertion
([#135](#135))
- transport: Make `TCP_NODELAY` configurable
([#146](#146))
    
### Changed    
    
- transport: Introduce common listener for tcp and websocket
([#147](#147))
- transport/common: Share DNS lookups between TCP and WebSocket
([#151](#151))
    
### Fixed    
    
- ping: Make ping fault tolerant wrt outbound substreams races
([#133](#133))
- crypto/noise: Make noise fault tolerant
([#142](#142))
- protocol/notif: Fix panic on missing peer state
([#143](#143))
- transport: Fix erroneous handling of secondary connections
([#149](#149))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants