Skip to content
This repository has been archived by the owner on Apr 21, 2022. It is now read-only.

Buffer early tags in temporary entries #39

Merged
merged 5 commits into from
May 23, 2019
Merged

Buffer early tags in temporary entries #39

merged 5 commits into from
May 23, 2019

Conversation

raulk
Copy link
Member

@raulk raulk commented May 6, 2019

A patch I had cooked up a while ago and forgot to submit.

Occasionally protocols try to tag peers in the connection manager before the latter has learnt about the connection via the Connected callback.

This patch buffers such tags in temporary entries. Such entries are safeguarded during the grace period, and once this period has elapsed, they take precedence for pruning. Hence they don't build up over time and drift away automatically.

@ghost ghost assigned raulk May 6, 2019
@ghost ghost added the status/in-progress In progress label May 6, 2019
@raulk raulk requested review from Stebalien, whyrusleeping and vyzo and removed request for whyrusleeping May 6, 2019 14:51
Copy link
Contributor

@vyzo vyzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we do this in upsert too?

@raulk
Copy link
Member Author

raulk commented May 6, 2019

Ah yeah, too recent.

@raulk
Copy link
Member Author

raulk commented May 6, 2019

Fixed.

Copy link
Contributor

@vyzo vyzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it make sense to treat them as regular tags and not prune them.

connmgr.go Outdated Show resolved Hide resolved
connmgr.go Show resolved Hide resolved
@Stebalien
Copy link
Member

Alternatively, we could change the semantics and say that all tags persist even after closing the connection for some grace period. garbage collecting tags for disconnected peers periodically. This will likely have little overhead and may make some things more predictable.

For example, I believe it's possible for the connection manager to see a peer disconnect while a different service doesn't (if we get a simultaneous disconnect+connect).

@raulk
Copy link
Member Author

raulk commented May 6, 2019

Hm, that would be valuable in an intermittent disconnection + reconnection. However, I think it's confusing because most protocols are designed to entirely drop internal state when a peer disconnects, and reconstruct it starting from scratch when it reconnects. Or do any protocols keep a "ghost" state? Honestly I think it could cause more problems than advantages.

@raulk
Copy link
Member Author

raulk commented May 6, 2019

In fact, we have nothing in the system that will try to re-establish a lost connection automatically. The reconnection is always triggered by a protocol.

@Stebalien
Copy link
Member

My main concern is that not all services will necessarily see the disconnect. They'll see the connection drop but services usually only drop state when all connections drop. At the moment, this will leave untagged peers.

However, I agree that a grace period like that would be kind of weird. What if we never GC tags? That is, we independently track tags/connections and expect services to remove tags when they no longer need a peer. As far as I can tell, almost all of our services correctly do this anyways.

IMO, that's the least surprising way to handle this given that we assign weights to peers and not individual connections.

Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nits you can ignore

connmgr.go Show resolved Hide resolved
connmgr.go Show resolved Hide resolved
@vyzo
Copy link
Contributor

vyzo commented May 7, 2019

Do not merge yet, there is a race with decremented tags on upsert that can be triggered by relay code.
Specifically, we have goroutines waiting for stream closure to decrement tags for peers when the hop stream is closed in both directions.
If we happen to prune/disconnect a peer with a hop stream, then the decrement upsert will happen in a disconnected peer and result in a temporary tag with negative value.
The problem is compounded if the peer happens to reconnect before the grace period, as it will be permanently tagged negative.

Suggested change: Either don't create temporary tags on upsert or ignore tags with negative values.

vyzo
vyzo previously requested changes May 7, 2019
Copy link
Contributor

@vyzo vyzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@raulk
Copy link
Member Author

raulk commented May 14, 2019

@vyzo – based on the last comment, it seems the semantics of Upsert are unfit that particular case. We should perhaps rename this operation to Update with an upsert flag on the method signature that would be false in this case -- so as to only affect existing entries.

@Stebalien
Copy link
Member

@vyzo bump.

@vyzo
Copy link
Contributor

vyzo commented May 22, 2019

I don't like the idea of renaming the Upsert method, but I guess I can modify the call sites to return zero instead of decrementing to negative when the condition is triggered.
On the other hand we can add Update with the generalized signature alongside Upsert, and that could be used instead in the decrementing.

@raulk
Copy link
Member Author

raulk commented May 22, 2019

I propose to merge this as it respects the semantics of Upsert. We might want an Update method that strictly updates the peer if and only if a previous tag exists.

@raulk raulk dismissed vyzo’s stale review May 22, 2019 18:28

Discussed out of band and decided then problem is in the Upsert semantics.

@raulk raulk requested a review from vyzo May 22, 2019 18:28
Copy link
Contributor

@vyzo vyzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have update the callsites to Upsert to not decrement a 0 tag, which resolves the pathology.
An Update method, companion to Upsert, sounds fine.

@vyzo
Copy link
Contributor

vyzo commented May 22, 2019

this will need rebase; let me know if you think a review is needed post rebase.

@raulk raulk merged commit 97d408f into master May 23, 2019
@raulk raulk deleted the feat/early-tags branch May 23, 2019 14:36
@raulk raulk mentioned this pull request May 27, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants