Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change sequence number in gossipsub to be linearly increasing #3453

Closed
MarcoPolo opened this issue Feb 10, 2023 · 8 comments · Fixed by #3551
Closed

Change sequence number in gossipsub to be linearly increasing #3453

MarcoPolo opened this issue Feb 10, 2023 · 8 comments · Fixed by #3551

Comments

@MarcoPolo
Copy link
Contributor

MarcoPolo commented Feb 10, 2023

Summary

Apologies, if this is hashed out somewhere else, but I couldn't find any discussion on this.

The pubsub spec says

The seqno field (optional) is a 64-bit big-endian uint that is a linearly increasing number that is unique among messages originating from each given peer

Right now the implementation in this repo uses a random number per message: https://github.com/libp2p/rust-libp2p/blob/master/protocols/gossipsub/src/lib.rs#L50

The Go implementation starts with the current nano timestamp, then increments from there: https://github.com/libp2p/go-libp2p-pubsub/blob/master/pubsub.go#L296

I don't believe the current go implementation relies on an increasing sequence number, but I could imagine an optimization where it does (e.g. instead of keeping a map of all message IDs seen, we only keep the highest ID seen and discard anything lower).

I'm guessing the reason to pick a random number is to not rely on the current time.

The risk of relying on the current time is if you reboot your node and start with a an earlier time, you may be reusing IDs. However, I'd counter that argument by saying many other things would break in this scenario (including any TLS encryption). In practice, it hasn't seemed to be a problem in the Go implementation.

@mxinden
Copy link
Member

mxinden commented Feb 13, 2023

Apologies, if this is hashed out somewhere else

I remember discussing this with @AgeManning in the past, searching through some history, in 2019:

#898 (comment)

Relevant quote by @AgeManning:

From memory, the go-implementation uses sequential number and rust has chosen to use randomize them. I am indifferent about which way to go, but maintained the randomness and the note for consistency in the rust implementation.

@AgeManning are you still of the same opinion? Maybe @divagant-martian do you have thoughts on this? I am indifferent as well.

However, I'd counter that argument by saying many other things would break in this scenario (including any TLS encryption).

@MarcoPolo can you expand on how TLS's security model breaks with clock drifts <1s?

@MarcoPolo
Copy link
Contributor Author

@MarcoPolo can you expand on how TLS's security model breaks with clock drifts <1s?

I didn't say anything about <1s. It breaks with significant clock drifts > 1hr / days. At the point where the Not Before / Not After are incorrectly invalid.

In the situation with a libp2p node, a node will be up for a bit of time (>1s) and then maybe restart. If it restarts, it will have been up for a bit so even if there's a bit of clock drift, we won't reuse IDs.

I'll also point out that WireGuard uses incrementing numbers in a similar way and hasn't had a problem in practice:

... we include a TAI64N timestamp in the first message. The server keeps track of the greatest timestamp received per client and discards packets containing timestamps less than or equal to it.

From https://www.wireguard.com/protocol/#dos-mitigation

@AgeManning
Copy link
Contributor

AgeManning commented Feb 13, 2023

There was a discussion where this was argued, but I can't find the repo or the comment.

Here are some relevant issues/PRs tho:

Here is how I remember the arguments going.

Before we had message signing, there was an attack vector, where if sequence numbers were linearly increasing, then an attacker could send a message with a target peer id and many future sequence numbers such that our node would then filter out all of the future messages from that peer as they would have been "already seen" based on the default message-id (which was/is a concatenation of the source peer-id with the sequence number).

I believe in rust-libp2p we opted for random sequence numbers to mitigate this kind of thing.

With the introduction of message signing, an attacker cannot perform this kind of attack. However, in rust-libp2p we leave message signing optional as some use cases (I'm thinking of me in particular in Ethereum consensus) we do not use message signing (although we don't use sequence numbers either :p).

So the risk here, is that currently message signing is optional. If a user were to not use message signing and keep the default message-id function, then they would be vulnerable to this kind of attack again.

We are aware that the gossipsub implementation does use increasing sequence numbers, I haven't looked if they use optional message signing tho (I assume they must because we have go ethereum clients). Perhaps they leave it up to the user to make sure they don't use the default message-id.

Have you experienced any issue with this or is this just to be more spec compliant?

I believe someone was going to make a spec PR to make the specs more accommodating in this regard.

@AgeManning
Copy link
Contributor

I just realized in one of the PRs I linked, I actually kept the wording "linearly increasing" in the specs. Woops.

@MarcoPolo
Copy link
Contributor Author

This came up again since we're adding a (non-default) validator that keeps track of the max seqno per peer: https://github.com/libp2p/go-libp2p-pubsub/pull/525/files. I think this validator will break against rust peers.

Shouldn't break anyone by default, but just another thing to keep in mind.

@vyzo
Copy link

vyzo commented Feb 28, 2023

Yeah, by not monotonically increasing, it breaks all possibility of having such a validator.

We want to use this to eliminate perpetual messages when the diameter exceeds the span of the seen message timecache.
This has been observed in practice, so we are adding it as a pluggable default validator.

@AgeManning
Copy link
Contributor

Ok, so I'm not particularly concerned about which way we go here.

We can adjust the rust implementation to use linearly increasing sequence numbers and we switch to random sequence numbers if message signing is disabled (and sequence numbers are enabled).

Does this sound like a reasonable way forward?

@vyzo
Copy link

vyzo commented Mar 1, 2023

Yes, that's perfectly fine.

@mergify mergify bot closed this as completed in #3551 Mar 14, 2023
mergify bot pushed a commit that referenced this issue Mar 14, 2023
This modifies the gossipsub implementation to use monotonically increasing sequence numbers for signed messages (as dictated by the specification). There is a discussion about this in #3453. This change will make rust-libp2p gossipsub align with the go-implementation when messages are signed.

Messages will however still use randomized sequence numbers when messages are unsigned for security reasons (as discussed in the issue linked).

This shouldn't change any user-level API, only the seqno behavior. It is fully backwards compatible.

Resolves #3453.

Pull-Request: #3551.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants