dial attempt failed: bind: address already in use #262

Closed
paralin opened this issue Jan 5, 2018 · 4 comments
Labels
kind/bug A bug in existing code (including security flaws)

Comments

@paralin
Contributor

paralin commented Jan 5, 2018

I'm seeing this error when multiple streams are rapidly opened to a peer from different Goroutines:

dial attempt failed: <peer.ID fR8QjW> --> <peer.ID V6hTxD> dial attempt failed: dial tcp4 0.0.0.0:8180->100.96.234.187:8180: bind: address already in use

I think there may be a concurrency issue somewhere. Using go-libp2p @ 4bba0bb (latest).

@Stebalien
Member

If this is always happening, it's a bug that has been fixed in a dependency but I haven't bubbled it up here yet (dependency conflicts that I'm currently resolving). If you're not using gx (just using go get) you shouldn't be noticing this bug.


If this only happens sometimes, make sure that the peers aren't trying to dial each other at the same time (that will result in this error, unfortunately). It happens because we enable SO_REUSEPORT to reuse the source port. That means that there can exist at most one connection between two peers. That is:

  1. Peer A initiates a dial to peer B.
  2. Peer B initiates a dial to peer A.
  3. The second dial fails because there is already a connection between those two ports.

This is actually something we can and should fix. To do so, we'd need to detect this error and keep retrying until either we see that the other side has succeeded in connecting to us or we have succeeded in connecting to them. However, this will probably be a bit tricky...


So, what's your precise setup? Any chance the other side is trying to dial you back?

@paralin
Contributor Author

paralin commented Jan 5, 2018

After introducing yamux, copying the setup in IPFS (without the msmux experiment) I no longer see this, so I believe the problem exists somewhere in the default stack used by libp2p.

For some context, the libp2p stack is in use in the FACEIT matchmaking system in production.

They are trying to dial each other at approximately the same time (+/- 2 seconds). The issue only occurs between peers that dial each other this way (A contacts B at the same time as B contacting A), not between peers that dial one way (A contacts B but never B contacts A).

The dials happen so close together because Kubernetes endpoints are used as the discovery mechanism (one node per pod, mapped via internal Kubernetes networking). Kubernetes informs the peers about each other's addresses this way at exactly the same time over the watch channel.

So, the issue is as you said: when two peers dial each other simultaneously, the SO_REUSEPORT approach breaks the connections. An easy fix is to retry with a staggered backoff (which we do, and it works quite nicely). It would be nice if we could find a way to do this where simultaneous dials are possible.

@qywang2012

I also have the problem of 'bind: address already in use'. Currently we use an old version @edb6434ddf456f58fbe2538d5336435a23915bd9.
I want to update to version 6.0.30. Has this problem been fixed there? If not, can you suggest a good way to deal with it? @Stebalien

@Stebalien
Member

@qywang2012 what's the exact error you're seeing?

We haven't fixed the second issue I described (simultaneous dials), but you may be experiencing a different issue.

@Stebalien Stebalien added the kind/bug A bug in existing code (including security flaws) label Jan 7, 2019
marten-seemann added a commit that referenced this issue Apr 21, 2022
@MarcoPolo MarcoPolo mentioned this issue Jul 7, 2022