Skip to content
This repository has been archived by the owner on Jun 14, 2024. It is now read-only.

feat(66/WAKU2-METADATA): add WakuMetadata Protocol #619

Merged
merged 7 commits into from
Oct 3, 2023

Conversation

alrevuelta
Copy link
Contributor

@alrevuelta alrevuelta commented Sep 25, 2023

@alrevuelta alrevuelta changed the title feat: add WakuMetadata RFC feat: add WakuMetadata Protocol Sep 25, 2023
@alrevuelta alrevuelta changed the title feat: add WakuMetadata Protocol feat(66/WAKU2-METADATA): add WakuMetadata Protocol Sep 25, 2023
Copy link
Contributor

@jm-clius jm-clius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a couple of comments. I think it's worth considering this closer to a negotiation protocol (similar to https://github.com/libp2p/specs/blob/master/connections/README.md#multistream-select) where the requester includes a set of desired capabilities and the responder responds with the subset of these that it supports. This will make the protocol useful for any further metadata negotiations (such as store capabilities, etc.), but also makes it easier for both sides to know that the connection is/is not interesting to them.

## Response

```proto
message WakuMessageResponse {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
message WakuMessageResponse {
message WakuMetadataResponse {

Not sure if typo or if there is some other intention that I'm missing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed d381ee1


```proto
message WakuMessageResponse {
uint64 network_id = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make this:

Suggested change
uint64 network_id = 1;
optional uint64 network_id = 1;

Note that optional in the protobuf doesn't necessarily mean that this field won't be mandatory on protocol level under certain circumstances. E.g. if this is a general metadata req-resp protocol it may not be necessary to include all information in every exchange.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say the networkId will be always present.


```proto
message WakuMessageResponse {
uint64 network_id = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we not foresee possibility to support more than one network_id here? If for some reason I want my node to support several test networks I may want a repeated here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we want the node to support multiple networks? Wouldn't it be more sensible to have a node per network (it would definitely make implementation less complex considering things like RLN)

Copy link
Contributor Author

@alrevuelta alrevuelta Sep 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say one node, one network. It simplifies many things, reduces complexity, edges cases, etc. Running two networks is as easy as spining up another identical waku node changing the network-id flag. + maybe its a super niche thing?

## Response

```proto
message WakuMessageResponse {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to include repeated supported shard numbers here? As a peer, I would still need this information to know if a node is interesting (i.e. can relay/store messages in the shards I'm subscribed to) or is the idea to reduce the set of possible connections sufficiently here and get this information elsewhere (e.g. discovery, hello protocol), etc.?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep good point

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed d381ee1

content/docs/rfcs/66/README.md Show resolved Hide resolved
```proto
message WakuMetadataRequest {
uint32 network_id = 1;
repeated unt32 shards = 2;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jm-clius I'm not fully aware on the implications of static sharding in here. Any thought on wether it should be considered here? Or these shards should be only for auto sharding?

In other words. If a node is running with static sharding, should the metadata protocol be able to "advertise" those shards?

Another way could be to use string instead of uint32 and just put the whole topic there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it stands, this would work for static sharding as well (as the cluster ID and shard numbers are encoded exactly the same). The one outstanding thing to decide is between network_id vs. cluster_id as terminology. I think there are OK arguments either side, but we should pick one and stick with it. Perhaps let's just put it up for a vote somewhere?

Interesting idea re string, but I see this exchange as more valuable (at least for now) for the numbered sharding as in the Waku Network. Exchanging the whole topic could as well just reuse the hello messages that already exist in some libp2p implementations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The one outstanding thing to decide is between network_id vs. cluster_id as terminology

sure. Will ask.

Interesting idea re string, but I see this exchange as more valuable (at least for now) for the numbered sharding as in the Waku Network

Not sure i follow. If shard is just an int, how do we know if that refers to a shard or to a static "shard"?

Exchanging the whole topic could as well just reuse the hello messages that already exist in some libp2p implementations.

mm why then using this metadata protocol for the shards? Perhaps it makes more sense to use the hello message and the whole topics?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure i follow. If shard is just an int, how do we know if that refers to a shard or to a static "shard"?

The network ID would be different. Static shards (despite the different name) follows the same pattern - the shard numbers are fully qualified only in conjunction with the network ID. In other words, the static sharded networks are just Waku Networks using their own network IDs (!= 1). In fact, network IDs 0-15 are (potentially) reserved so we have some scope to define our own main/test networks, but any other network ID can be used by applications interested in creating their own Waku Networks.

mm why then using this metadata protocol for the shards?

Well, why use it for network ID if that is also encoded in the topic? My understanding was that the intention is to conveniently exchange the minimum information that is necessary to determine if a connection is interesting or not - that would be network ID in conjunction with shard numbers. Neither of these pieces of metadata is currently enough on its own to make any final conclusions about a connection. The reason not to simply have this be implicit in a list of subscribed topics is to be explicit about our structured approach to network ID and shard numbers in topic construction. Or am I missing something behind the intention of the metadata protocol?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, why use it for network ID if that is also encoded in the topic?

mm because there are some protocols that don't know about "topics". I mean, if i dial with store protocol, how do i know that the store protocol is running in the same network?. afaik right now there is no way?

but yeah, agree with "exchange the minimum information to determine if a connection is interesting or not". But networkid is one level up than shards. Different network id is a deal breaker. Different shard(s), well guess we are not interested, but not as "strong".

Copy link
Contributor

@jm-clius jm-clius Sep 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, if i dial with store protocol, how do i know that the store protocol is running in the same network?

True, but again I would need to know what shards within the network the store protocol supports. In all (current) cases, the network ID is just a logical grouping of shards. Since the shards (in fact the pubsub topics constructed from these) define how messages are routed, this is the minimum information I would need to know if a node is interesting for relay, store, filter and lightpush. (A future exception would be that network-ID defines the RLN contract regardless of shard, although this wouldn't affect my decision about which nodes I connect to or not. But RLN contract is the only protocol element that I can think of that belongs to a network ID and not to specific shards within the network.)

Copy link
Contributor

@jm-clius jm-clius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me. It's not quite yet stated as a negotation protocol, but this is fine as it can grow into one. I have added some more comments on the proto definition (using optional allow us to have nullable fields, which allows for much more flexible interpretation). And then this spec uses network_id where all other RFCs are currently on cluster_id - perhaps just voting on this somewhere (since the offsite there's been some arguments against "network ID" and for "cluster ID"). Ultimately the terminology is either or, as long as we pick one and stick to it.

```proto
message WakuMetadataRequest {
uint32 network_id = 1;
repeated unt32 shards = 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
repeated unt32 shards = 2;
repeated uint32 shards = 2;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed 2856707

```proto
message WakuMetadataResponse {
uint32 network_id = 1;
repeated unt32 shards = 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
repeated unt32 shards = 2;
repeated uint32 shards = 2;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed 2856707


```proto
message WakuMetadataResponse {
uint32 network_id = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still make this optional in the proto3 definition (both in request and response).
This is not because populating the field is necessarily optional in the protocol, but because proto3 limitations currently imply that there will be no way for us to know if the remote party populated the field or not without the optional keyword. As it stands, if this field is not explicitly populated it will always deserialise as network ID 0, whether that was the intention or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be clear already, but just thought of a better way to state the implication. If we have a statement in an RFC such as:

The network_id MUST be populated with the configured network ID...

there is no way for us to check this stipulation without the optional keyword. A bit silly in proto3 IMO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed 2856707

@alrevuelta alrevuelta requested review from SionoiS and jm-clius October 3, 2023 15:18
@alrevuelta alrevuelta marked this pull request as ready for review October 3, 2023 15:26
Copy link
Contributor

@jm-clius jm-clius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants