Skip to content
This repository has been archived by the owner on May 11, 2022. It is now read-only.

parallelize dialbacks #20

Merged
merged 6 commits into from
Apr 4, 2019
Merged

parallelize dialbacks #20

merged 6 commits into from
Apr 4, 2019

Conversation

vyzo
Copy link
Contributor

@vyzo vyzo commented Apr 1, 2019

Dialing multiple autonat peers sequentially unnecessarily slows down NAT autodetection.
This changes the detection logic to perform dialbacks in parallel.

cc @ZenGround0

@vyzo vyzo requested review from Stebalien and raulk April 1, 2019 09:35
@ghost ghost assigned vyzo Apr 1, 2019
@ghost ghost added the status/in-progress In progress label Apr 1, 2019
@raulk
Copy link
Member

raulk commented Apr 2, 2019

Looking 👀

private := 0
public := 0

probe := 3 - as.confidence
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could do:

var probe int
if probe = 3 - as.confidence; probe > len(peers) {
    probe = len(peers)
} else if probe == 0 {
    probe = 1
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, can do if you think it's cleaner this way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a problem with doing it this way: if peers = 0 and probe is also 0 (confidence = 3), then we'll have an out of bounds slice.
then again we can't possibly have no peers and confidence = 3, so it should be ok.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but I'd rather not have to think about this when reading the code (apparent bug that is not)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'll keep the original code here, as it needs no comments.

for _, pi := range peers[:probe] {
wg.Add(1)
go func(pi pstore.PeerInfo) {
as.host.Peerstore().AddAddrs(pi.ID, pi.Addrs, pstore.TempAddrTTL)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: using AddAddrs is ok here because if the peerstore has a longer TTL, it won't shorten in.

autonat.go Outdated

var mx sync.Mutex
var pubaddr ma.Multiaddr
private := 0
Copy link
Member

@raulk raulk Apr 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we enclose these vars in an anonymous struct? It'll be easier to track the scope of these things, and the zero values are our friend here:

var result struct {
    sync.Mutex
    private int
    public int
    pubaddr ma.Multiaddr
}

Then below:

result.Lock()
result.private++
result.Unlock()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, that's nicer indeed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

autonat.go Outdated
log.Debugf("Dialback through %s successful; public address is %s", pi.ID.Pretty(), a.String())
mx.Lock()
public++
pubaddr = a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that different peers will report different public endpoints, and we'll overwrite them here? This could happen if we're using different network interfaces for each peer, or if we're behind a corporate NAT that uses different exit points?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we only report one address, so the override is inevitable. I don't think it matters much though, as we don't currently use the reported address for anything.

as.confidence++
if public > 0 {
log.Debugf("NAT status is public")
if as.status == NATStatusPrivate {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment here saying that we're flipping our asserted NAT status?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A log comment I take it. Sure, will do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

log.Debugf("NAT status is private")
if as.status == NATStatusPublic {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

} else {
log.Debugf("NAT status is unknown")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it ok for a single round with unresponsive peers to reset our NAT status entirely? I think we need some error tolerance here.

Copy link
Member

@raulk raulk Apr 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd decrement the confidence and retain our previous status, until confidence drops to zero and then we can switch to Unknown?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, that's reasonable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done - won't flip to unknown if we have any confidence, but rather reduce the confidence and wait for the next round.
It will only become unknown again if the confidence drops to 0.

autonat.go Outdated
default:
log.Debugf("Error dialing through %s: %s", pi.ID.Pretty(), err.Error())
}
wg.Done()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: defer this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure.

}

wg.Wait()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to wait for all of them? That is, could we like, try N and wait for M?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would this work?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ask N nodes for dial backs but walk away after M responses. It may not be worth it but it would allow us to tolerate a few bad nodes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I'm find merging this as-is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it adds much to wait for fewer nodes, other than adding complexity. At worst a bad node will timeout, which is 30s.

log.Debugf("NAT status is public")
if as.status == NATStatusPrivate {
// we are flipping our NATStatus, so confidence drops to 0
as.confidence = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Shouldn't this be a confidence of 1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer this to have a confidence of 0 if there is a flip, so that we get more observations in the next round.

@Stebalien Stebalien mentioned this pull request Apr 4, 2019
9 tasks
@vyzo vyzo merged commit c6b40f9 into master Apr 4, 2019
@ghost ghost removed the status/in-progress In progress label Apr 4, 2019
@vyzo vyzo deleted the feat/parallel-dialback branch April 4, 2019 07:15
willscott pushed a commit that referenced this pull request Mar 13, 2020
full close the autonat stream
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants