Cipher negotiation #4248

TerryGeng · 2020-06-05T12:33:13Z

I looked into the process of establishing a connection in both mumble and murmur:
Immediately after TCP connection is established:

Client sends Version message.
Server receives Version message, writes version info into ServerUser, which is a structure that holds users' info.
Client sends Authenticate message.
Server verifies the Authenticate message, then sends:
1. CryptSetup message to sync key/nonce,
2. CodecVersion message to tell the client which codec is used, and the client's side will display a warning message if it doesn't support the codec version received from the server,
3. ChannelState message,
4. UserState message,
5. ServerSync message, to tell the client its session number, the maximum bandwidth, and welcome message, etc.
6. ServerConfig message, to tell the client if HTML is allowed, the maximum message length, etc. Though I cannot figure out why it is separated from ServerSync.
7. SuggestConfig message, like if the server suggests clients to enable positional audio, etc.

So what I need to do is to find a place for the client to tell the server the supported cipher types.

As for why it is the client, not the server, that announces its capabilities: the reason is backward compatibility. If the client sends nothing, we know it is an old client.

Therefore, I suggest we go this way:

Add support_ciphers to the Version message, since I consider the Version message the place to announce one's capabilities. The server then decides which cipher to use for the upcoming UDP connection and stores this choice in a field in ServerUser.
Add cipher_type to the CryptSetup message, to tell the client our decision.

I created a new enum type in our proto file

enum CipherType {
	OCB2 = 0;
}

to hold all cipher types.
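
A rough sketch of how the server side of this could look, just to make the backward-compatibility argument concrete. Everything here is an assumption for illustration: `selectCipher` is a made-up helper (not existing Mumble code), `OCB3` is a hypothetical second enum entry, and the server-side preference list is invented.

```cpp
#include <algorithm>
#include <vector>

// Mirrors the proposed proto enum; OCB3 is a hypothetical future entry.
enum class CipherType { OCB2 = 0, OCB3 = 1 };

// Hypothetical server-side helper: pick the cipher for the UDP channel.
// `clientCiphers` stands for the proposed support_ciphers list from the
// Version message; an old client never sends it, so the list is empty.
CipherType selectCipher(const std::vector<CipherType> &clientCiphers,
                        const std::vector<CipherType> &serverPreference) {
    for (CipherType candidate : serverPreference) {
        if (std::find(clientCiphers.begin(), clientCiphers.end(), candidate)
            != clientCiphers.end()) {
            return candidate;
        }
    }
    // Empty list (old client) or no overlap: fall back to the legacy cipher.
    return CipherType::OCB2;
}
```

The returned value is what would be stored in ServerUser and echoed back to the client.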

Actually, the decision on where to insert the support_ciphers field is somewhat arbitrary. I don't rule out inserting it into the Authenticate message (it even has an opus field), or into the CryptSetup message as well. I'd like to know your opinions on it :).

streaps · 2020-06-05T13:47:58Z

OCB2 should never be indicated or negotiated explicitly. It should only be used when the cipher negotiation is not supported by the server or the client. (Edit: we don't want to suggest that clients still should use OCB2 instead of OCB3, ChaCha, AEGIS,...)

As I wrote before, I don't think ciphers should be negotiated independently from the transport. Like TLS 1.3 doesn't allow the same set of ciphers that TLS 1.2 allows.

The client tells the server the transports/ciphers it can speak, the server decides which one to use. The server is free to choose whatever is best (like the computational resources needed based on the CPU it's running on).

The client should be able to set priorities for every transport/cipher.

What happens if a negotiated transport/cipher fails? For example, how do we switch from UDP to WebRTC instead of falling back directly to TCP?

And last but not least: what about e2e encryption? Should we focus on that instead of adding all kinds of ciphers to the server?

TerryGeng · 2020-06-05T14:20:08Z
Author

@streaps Well... I think OCB2 should be at least indicated in the cipher enum. Otherwise, it would be hard for me to manage OCB2 in the rest of the code. A cipher "who must not be named"? :) Of course, it would not be included in the list of supported ciphers in a 1.4.0 client.

As for WebRTC, I don't know much about it. I'm not sure whether mumble would support it in near future.

About E2E encryption, I think that if the user is not using a certificate issued by a certificate authority, we cannot prevent man-in-the-middle attacks.

streaps · 2020-06-05T14:37:15Z

WebRTC is already used by mumble-web. It still needs a proxy in front of the mumble server though.

If e2ee works for XMPP and Matrix, it should work for Mumble too (see also https://jitsi.org/blog/e2ee/).

Krzmbrzl · 2020-06-05T14:54:36Z
Maintainer

E2E of the complete UDP message would break positional audio (as the server must decide in which messages to include the positional data and in which not to). Furthermore it'd break future plans of adding some meta-data to the UDP packet (e.g. containing which clients received that audio packet as well). The latter could probably be solved by encrypting the message twice but that's just inefficient (imo).

streaps · 2020-06-05T15:04:52Z

I see, there is no way around forking Mumble for serious non-gaming use ;).

Johni0702 · 2020-06-05T16:14:31Z

And last but not least: what about e2e encryption? Should we focus on that instead of adding all kinds of ciphers to the server?

I don't think E2E is on topic here. That is not to say that you should not have asked.
It's just that to me it looks like this issue is looking for a solution to the we-should-get-rid-of-OCB2 problem and, while we should preferably do that in a future-proof way, I don't think we've got a good enough idea of how E2E would in general work for us to be able to consider it here.

As I wrote before, I don't think ciphers should be negotiated independently from the transport.

I very much agree with that.
That would eventually allow for e.g.

enum VoiceTransport {
	UDP_OCB2 = 0;
	UDP_OCB3 = 1;
	UDP_ChaCha = 2;
	QUIC = 3;
	WebRTC = 4;
}

instead of having to add yet another bool to the version/authenticate packet (like the current WebRTC transport proposal does).
It then however stands to reason that replying in the CryptSetup message isn't ideal (since e.g. WebRTC doesn't use it). I think the server also replies with a Version message, so sending its decision via that probably makes more sense.

I'd even go a step further and not use an enum at all (i.e. just use "udp_ocb3"). Overhead should be negligible since it's only sent once, but it has the huge advantage that third-party clients/servers can add support for ciphers/transports without having to go and reserve an enum constant in the main Mumble repo (or risking collisions).
Shorter names (such as "udp_ocb3", "quic" or "webrtc") would nevertheless be reserved as otherwise there might be two incompatible implementations but e.g. the current mumble-web client could advertise the "https://github.com/Johni0702/rust-mumble-protocol/blob/797ea4a4471f9e3d3221c46483ff414c0a70e236/protos/MumbleWithWebRTC.proto" transport (or maybe even just "proposal-3561-v1" for those that already have an open ticket on the main Mumble repo) and the server (or in the WebRTC case, for now, a proxy before the server) could immediately tell the client if it supports that.
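
To illustrate how little machinery the string-based variant needs, here is a minimal sketch. It assumes the client lists its identifiers in order of preference and that the server simply honours that order; `pickTransport` is a hypothetical helper, not part of any existing implementation.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: walk the client's offers in order and return the
// first identifier that the server also implements.
std::optional<std::string> pickTransport(const std::vector<std::string> &clientOffers,
                                         const std::vector<std::string> &serverSupported) {
    for (const std::string &offer : clientOffers) {
        for (const std::string &known : serverSupported) {
            if (offer == known) {
                return offer;
            }
        }
    }
    // No common transport; the caller decides how to fall back.
    return std::nullopt;
}
```

An unknown identifier is just a string that never matches, so a server does not need to know about third-party transports in order to skip them.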

Krzmbrzl · 2020-06-06T09:57:58Z
Maintainer

Okay so here are my thoughts on this:
To my understanding, the key and nonces in CryptSetup are (kinda) dependent on which transport protocol and cipher are being used. Therefore the negotiation of which one to use should happen before that.

This leaves us with 3 options:

1. Include it in Version
2. Include it in Authenticate
3. Create a new message for it

I actually don't really like options 1 and 2 since these messages have a clear purpose and shouldn't contain any information in addition to that (the fact that the Authenticate message includes Codec information is already kinda weird (imo)).

Thus I'd prefer to use a new message for this purpose. I think the issue at hand and #4224 could actually share a new message, Capabilities or something like that. This would then contain information about the cipher / transport protocol as well as additional fields that indicate whether certain features are available or not (this can be added separately from the cipher thing though).

This could then look something like

message Capabilities {
    enum VoiceTransport {
        UDP_OCB2 = 0;
        UDP_OCB3 = 1;
        UDP_ChaCha = 2;
        QUIC = 3;
        WebRTC = 4;
        CUSTOM = 5;
    }

    repeated VoiceTransport supported_protocols = 1; // field numbers are illustrative
    repeated string custom_protocol = 2;
}

The design here tries to incorporate the suggestion of @Johni0702 to use strings in order to define the protocol. However by also supporting the enum, we can always make sure that the official / standard protocols are not defined multiple times (aka reserved) while still allowing for experimental / unofficial protocols via the generic string field.

You'll notice that I used repeated for both fields. This is in order to allow the client to announce all protocols it supports. Additionally it could order them in the preferred order (the one it wants to use most comes first). By default I'd only consider non-standard protocols (the strings) if the server doesn't want to (or can't) deal with the client's standard protocols. However the client could ask for the custom protocols to be preferred, by including VoiceTransport::CUSTOM in supported_protocols. In that case the server would first check if it can serve any of the supplied custom protocols.

So the procedure would be:

1. Client sends the server its Capabilities message
2. Server receives that and figures out which protocol & cipher it wants to use
3. Server answers with a Capabilities message of its own where either supported_protocols or custom_protocols contains exactly one value, which is the chosen one.
4. Client proceeds with the connect process accordingly

I think we can insert this process right after the Version message has been exchanged. That way we can check on the client and the server whether the other party supports this new message in the first place (based on the sent Version).

If the client doesn't support this new message, the server will either default to OCB2 or drop the connection. We can add a setting for that which initially defaults to falling back to OCB2; once the message has been around for a while and we can assume that most clients support it, we can change the default to dropping the connection.

If however the client's version indicates that it does support this message, but we receive the Authenticate message without having received a Capabilities message, the server will drop the connection.
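
The three cases above boil down to a small decision function. This is only a sketch under assumptions: `ClientState`, `Decision`, `onAuthenticate` and the `allowLegacyClients` setting are hypothetical names made up for illustration.

```cpp
// Hypothetical outcome of the server's check when Authenticate arrives.
enum class Decision { UseNegotiated, FallBackToOCB2, DropConnection };

struct ClientState {
    bool versionSupportsCapabilities; // derived from the client's Version message
    bool capabilitiesReceived;        // true once a Capabilities message arrived
};

// `allowLegacyClients` stands in for the proposed server setting.
Decision onAuthenticate(const ClientState &client, bool allowLegacyClients) {
    if (!client.versionSupportsCapabilities) {
        // Old client: either keep using OCB2 or refuse, depending on the setting.
        return allowLegacyClients ? Decision::FallBackToOCB2 : Decision::DropConnection;
    }
    if (!client.capabilitiesReceived) {
        // A new client that skipped the negotiation is a protocol violation.
        return Decision::DropConnection;
    }
    return Decision::UseNegotiated;
}
```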

As a side note: I just copy&pasted the VoiceTransport protocol but in principle I think WebRTC and QUIC (whatever that is) should also include the cipher, just as with UDP. Or do these protocols always use the same cipher for all eternity? In that case, maybe we want to add a version tag to them in case they'll change version sometime in the future (e.g. WEBRTC_V1).

Johni0702 · 2020-06-06T11:22:47Z

Introducing a separate Capabilities message sounds good to me especially considering that other issue.

I don't particularly like splitting supported_protocols from custom_protocols though. That kinda makes the semantics between the two non-intuitive. Like, e.g., what if you wanted to prefer one (but not all) of your custom protocols over all other protocols?
Instead, how about this:

message Capabilities {
    message VoiceTransport {
        enum Official {
            UDP_OCB2 = 0;
            UDP_OCB3 = 1;
            UDP_ChaCha = 2;
            QUIC = 3;
            WebRTC = 4;
        }
        oneof id {
            Official official = 1; // field numbers are illustrative
            string custom = 2;
        }
    }

    repeated VoiceTransport voice_transports = 1;
}

(the extra message type is necessary because protobuf doesn't allow repeated oneof)

As a side note: I just copy&pasted the VoiceTransport protocol but in principle I think WebRTC and QUIC (whatever that is) should also include the cipher, just as with UDP. Or do these protocols always use the same cipher for all eternity? In that case, maybe we want to add a version tag to them in case they'll change version sometime in the future (e.g. WEBRTC_V1).

QUIC is the transport protocol which HTTP/3 will be using (see also #3637). It's using TLS1.3 and that's how it chooses its cipher, though iirc QUIC's unreliable mode (which one would want to use for voice) is still in the design phase.

WebRTC uses SRTP for encryption and DTLS for the handshake to establish a shared key. So we don't get to choose here either.

I'm not sure that adding a version to them would be of any help since no version basically implies v1 anyway. It might actually be more confusing since it's not clear what the version refers to: It could either be that there's a new WebRTC standard or that the way we used the old standard has changed.
We can probably think about how to express that once we actually do need a v2 since we can always rename it (cause the name isn't encoded on the wire).

Krzmbrzl · 2020-06-06T11:43:28Z
Maintainer

That kinda makes the semantics between the two non-intuitive. Like, e.g., what if you wanted to prefer one (but not all) of your custom protocols over all other protocols?

That's true 🤔

Instead, how about this:

How does this solve the problem though? If I want to use custom protocols, I'll have to wrap the official ones into String-representations as well (which is doable of course). This however will then re-introduce the problem of potentially having a custom protocol that shadows the name of a (future) official protocol. If we have a good resolution for this problem, we could actually just go ahead and use only String-encoded values (although these always bear the potential for typos which you wouldn't have for enums).

EDIT: Okay it took me a bit, but now I understand the concept. I just thought about how I'd go and solve this problem and came up with basically the same scheme. Only then did I realize that your suggestion actually already addresses the problem.
Thus: I agree. This looks like a good solution to me 👍

I'm not sure that adding a version to them would be of any help since no version basically implies v1 anyway.

Fair enough.

streaps · 2020-06-08T10:31:55Z

I think OCB2 should be at least indicated in the cipher enum. Otherwise, it would be hard for me to manage OCB2 in the rest of the code. A cipher "who must not be named"? Of course, it would not be included in the list of supported ciphers in a 1.4.0 client.

What you are saying is that we put OCB2 in the protocol just because that way it makes it easier to handle for the Mumble code base, but it will never be sent over the wire? Wouldn't this be a design problem in the code?

streaps · 2020-06-08T10:57:20Z

So the procedure would be:

1. Client sends the server its Capabilities message
2. Server receives that and figures out which protocol & cipher it wants to use
3. Server answers with a Capabilities message of its own where either supported_protocols or custom_protocols contains exactly one value, which is the chosen one.
4. Client proceeds with the connect process accordingly

I think we can insert this process right after the Version message has been exchanged. That way we can check on the client and the server whether the other party supports this new message in the first place (based on the sent Version).

This would introduce a couple of additional round-trips. It's also not clear to me how exactly the connect process changes and what you mean by 4.

Unfortunately the visual and textual description of the connect process don't match in the documentation (is CryptSetup sent directly after Version or does it wait for Authenticate?):
https://mumble-protocol.readthedocs.io/en/latest/establishing_connection.html

It's also undefined how the client (or server) must react to unknown messages.

Ideally Capabilities would be sent immediately after Version without any delay. If that's not possible (because clients or servers would close the connection or get confused by unknown messages), we know why the Version and Authenticate messages are used for other stuff.

Krzmbrzl · 2020-06-08T12:54:42Z
Maintainer

What you are saying is that we put OCB2 in the protocol just because that way it makes it easier to handle for the Mumble code base, but it will never be sent over the wire? Wouldn't this be a design problem in the code?

Uhm that's not true. Older clients will still use OCB2 so this is a legitimate option to use. Besides: We found that the OCB2 vulnerabilities don't apply to Mumble, so there's no reason why we should ban it.
And even if we did: An enum should enumerate all possible values and if the legacy fallback value is OCB2, then that should be contained 🤷

It's also not clear to me how exactly the connect process changes and what you mean by 4.

With that I was referring to the fact that e.g. WebRTC apparently doesn't use the CryptSetup message so if that's the chosen protocol, you'll have to do different stuff than when the protocol is UDP_OCB2.

Unfortunately the visual and textual description of the connect process don't match in the documentation (is CryptSetup sent directly after Version or does it wait for Authenticate?):
https://mumble-protocol.readthedocs.io/en/latest/establishing_connection.html

You're right, there is a discrepancy in the docs. I'd have to check the code but as we're re-engineering the process anyways, I'm not sure this is really important for this issue 🤔

It's also undefined how the client (or server) must react to unknown messages.

"Unknown" as in "unknown protobuf message type" or as in "unknown protocol / cipher"? I'm actually not even sure whether we should state a behavior in these cases. I guess it's up to the implementer to decide whether they want to disconnect on unknown messages or just ignore them (or do something else entirely).

Ideally Capabilities would be sent immediately after Version without any delay. If that's not possible (because clients or servers would close the connection or get confused by unknown messages), we know why the Version and Authenticate messages are used for other stuff.

I don't think there is a reason that should prevent this. From what I have seen in the code so far, nothing suggested that this shouldn't be possible 🤷

streaps · 2020-06-08T14:17:03Z

Uhm that's not true. Older clients will still use OCB2 so this is a legitimate option to use.

For older clients the cipher negotiation doesn't apply anyway.

Besides: We found that the OCB2 vulnerabilities don't apply to Mumble, so there's no reason why we should ban it.

There is no OCB2 in any standard crypto lib and no documentation of the OCB variant Mumble uses. This can lead to the situation that people try to implement OCB2 who don't have much experience with crypto stuff. I don't know, maybe it's impossible to do an insecure implementation of OCB2.

There is also no reason for the server to prefer OCB2 over other crypto with implementations that are better reviewed and hardware accelerated.

I do think there should be no way to negotiate OCB2 with the Capabilities message. The sooner OCB2 is deprecated, the better.

"Unknown" as in "unknown protobuf message type" or as in "unknown protocol / cipher"?

Unknown protobuf message type.

I'm actually not even sure whether we should state a behavior in these cases.

If some clients (or servers) do freak out when receiving a Capabilities flag, it couldn't be sent immediately. It's important to define if and when message types that the client might not know are allowed, or whether this should never happen.

My understanding of the protocol is that the version number will not change the basic message flow or define allowed or disallowed message types, but this is not specified in the protocol documentation. It is also not specified that the version number is relevant at all (for 3rd party clients).

TerryGeng · 2020-06-08T14:34:54Z
Author

I agree with @Johni0702. We need a way to support custom protocols. But I'm not really into the complicated way @Johni0702 has provided. Instead, I would go with

message Capabilities {
    enum VoiceProtocol {
        Custom = 0;
        UDP_OCB2 = 1;
        UDP_OCB3 = 2;
        UDP_ChaCha = 3;
        QUIC = 4;
        WebRTC = 5;
    }
    repeated VoiceProtocol voice_protocols = 1; // field numbers are illustrative
    repeated string custom_protocols = 2;
}

If a client's preference is listed as UDP_ChaCha, unofficial_xxx, UDP_OCB3, unofficial_yyy, its Capabilities Packet should contain:

voice_protocols = [UDP_ChaCha, Custom, UDP_OCB3, Custom]
custom_protocols = ['unofficial_xxx', 'unofficial_yyy']
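
For reference, a receiver could rebuild the single preference list from the two fields roughly like this. It is a sketch only: `mergePreferences` and `officialName` are hypothetical helpers, and treating a mismatch between the number of Custom placeholders and the number of strings as an error is an assumption, not something specified above.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

enum class VoiceProtocol { Custom = 0, UDP_OCB2 = 1, UDP_OCB3 = 2, UDP_ChaCha = 3, QUIC = 4, WebRTC = 5 };

// Readable name for an official protocol (illustrative only).
std::string officialName(VoiceProtocol p) {
    switch (p) {
        case VoiceProtocol::UDP_OCB2: return "UDP_OCB2";
        case VoiceProtocol::UDP_OCB3: return "UDP_OCB3";
        case VoiceProtocol::UDP_ChaCha: return "UDP_ChaCha";
        case VoiceProtocol::QUIC: return "QUIC";
        case VoiceProtocol::WebRTC: return "WebRTC";
        case VoiceProtocol::Custom: break;
    }
    return "Custom";
}

// Each Custom placeholder consumes the next entry of customProtocols.
// Returns nullopt when placeholders and strings do not line up.
std::optional<std::vector<std::string>> mergePreferences(
    const std::vector<VoiceProtocol> &voiceProtocols,
    const std::vector<std::string> &customProtocols) {
    std::vector<std::string> merged;
    std::size_t nextCustom = 0;
    for (VoiceProtocol p : voiceProtocols) {
        if (p == VoiceProtocol::Custom) {
            if (nextCustom >= customProtocols.size()) {
                return std::nullopt; // placeholder without a matching string
            }
            merged.push_back(customProtocols[nextCustom++]);
        } else {
            merged.push_back(officialName(p));
        }
    }
    if (nextCustom != customProtocols.size()) {
        return std::nullopt; // leftover strings without a placeholder
    }
    return merged;
}
```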

I think this is better than designing a complicated message type. What do you think about this?

Krzmbrzl · 2020-06-08T16:54:53Z
Maintainer

I do think there should be no way to negotiate OCB2 with the Capabilities message. The sooner OCB2 is deprecated, the better.

The argument about the completeness of an enum still applies though. I think not including OCB2 would just make maintaining and reading the code harder.
We can add a comment about OCB2 being deprecated in the proto-file though.

If some clients (or servers) do freak out when receiving a Capabilities flag, it couldn't be sent immediately.

That shouldn't be the case anyways. We send it after Version has been received. That'll tell us whether the client supports the new message or not.

My understanding of the protocol is that the version number will not change the basic message flow or define allowed or disallowed message types, but this is not specified in the protocol documentation. It is also not specified that the version number is relevant at all (for 3rd party clients).

But it is. Everything that needs to know whether a feature is available currently uses that version number.

I think this is better than designing a complicated message type. What do you think about this?

Your suggestion does solve the problem with my initial draft, but in fact it is more complicated than Johni's solution as this requires knowledge on how to extract the ordering from these 2 fields. In Johni's suggestion the ordering is clear without any doubt or room for speculation / misunderstanding (imo).

TerryGeng · 2020-06-09T00:55:29Z
Author

@Krzmbrzl I think the order is

Protobuf's doc:

repeated: this field can be repeated any number of times (including zero) in a well-formed message. The order of the repeated values will be preserved.

:D

Krzmbrzl · 2020-06-09T05:57:08Z
Maintainer

I think the order is

Yeah. I did understand what you mean and it'd definitely work.

What I was referring to was that the ordering is not as clear as if you only have one list. If you are to document in which order you have to process the elements, in Johni's suggestion you just need

Iterate through the list front to end and use the first protocol that is supported

whereas in your case you'd have to add another info about what to do when you encounter a Custom value in the list. Thus you need to know more about how the message works to get the ordering right than when you use only a single list. That's what I meant by "more complex" :)
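
As a sketch of what "iterate front to end" means with a single list, here is the nested variant modelled with std::variant standing in for the oneof. `ServerSupport` and `pickFirstSupported` are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <variant>
#include <vector>

enum class Official { UDP_OCB2 = 0, UDP_OCB3 = 1, UDP_ChaCha = 2, QUIC = 3, WebRTC = 4 };

// One list entry: either an official transport or a custom identifier.
using VoiceTransport = std::variant<Official, std::string>;

// Hypothetical description of what the server implements.
struct ServerSupport {
    std::vector<Official> official;
    std::vector<std::string> custom;

    bool supports(const VoiceTransport &t) const {
        if (const Official *o = std::get_if<Official>(&t)) {
            return std::find(official.begin(), official.end(), *o) != official.end();
        }
        const std::string &name = std::get<std::string>(t);
        return std::find(custom.begin(), custom.end(), name) != custom.end();
    }
};

// Use the first offered transport that the server supports.
std::optional<VoiceTransport> pickFirstSupported(const std::vector<VoiceTransport> &offered,
                                                 const ServerSupport &server) {
    for (const VoiceTransport &t : offered) {
        if (server.supports(t)) {
            return t;
        }
    }
    return std::nullopt;
}
```

There is no second list to keep in sync here; the position in the one list is the whole ordering rule.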

TerryGeng · 2020-06-18T00:24:50Z
Author

As mentioned in #4299, we are now divided over how to design this Capabilities message.

My way is the one mentioned above (and is also the one that was proposed by @Krzmbrzl), which has been implemented in #4299:

message Capabilities {
    enum VoiceProtocol {
        Custom = 0;
        UDP_OCB2 = 1;
        UDP_OCB3 = 2;
        UDP_ChaCha = 3;
        QUIC = 4;
        WebRTC = 5;
    }
    repeated VoiceProtocol voice_protocols = 1; // field numbers are illustrative
    repeated string custom_protocols = 2;
}

The apparent problem with this scheme is the order of custom protocols, but I documented that the protocols stored in the custom protocols field are also sorted by preference. The example in the documentation is

Example. Client's preference: UDP_ChaCha, custom_x, UDP_OCB3, custom_y
Message it sends: supported_protocols = [UDP_ChaCha, CUSTOM, UDP_OCB3, CUSTOM]
                                custom_protocols = ['custom_x', 'custom_y']

While @Johni0702 's suggestion is

message Capabilities {
    message VoiceTransport {
        enum Official {
            UDP_OCB2 = 0;
            UDP_OCB3 = 1;
            UDP_ChaCha = 2;
            QUIC = 3;
            WebRTC = 4;
        }
        oneof id {
            Official official = 1; // field numbers are illustrative
            string custom = 2;
        }
    }

    repeated VoiceTransport voice_transports = 1;
}

which uses a nested message and abuses protobuf's oneof feature to solve the ordering problem.

IMO, I'm not really into @Johni0702's solution, for four reasons:

1. I don't really like the nested messages. They may make the design seem logically structured, but they are actually unnecessarily confusing and tedious for other people to read. And the nested message design is the price to pay for using oneof, which I don't think is a bargain.
2. The nested message design and the use of oneof field will increase the length of the packet for every "VoiceTransport" by 1 or 2 bytes. The overhead may be neglected considering that this is a message only sent a few times during one session. But that is no reason to add unnecessary overhead. Otherwise, there would be more and more unnecessary overhead in the future, which may affect the overall performance at some point.
3. I don't think there would be a lot of third-party clients or servers that would support custom protocols in the future. Custom protocols tilt more towards experimentation. As a third-party app developer, I'm not planning to support custom protocols in the future. I can't rule out the possibility that other third-party developers would consider supporting custom protocols. But do we really have to build such a monster for it? The gain is maybe some people's aesthetic preference is satisfied, but is it really worth it, given the reasons mentioned above?
4. I would just go with @Johni0702's solution if the rest of our protocol were also twisted and threefold nested. But the rest of our proto file seems pretty straightforward and doesn't involve such complex constructions. I'm just uncertain when I see @Johni0702's solution breaking this tradition.

I think maybe @davidebeatrici @Avatat @Kissaki @streaps and @felix91gr could provide some insights about this.

Krzmbrzl · 2020-06-18T07:30:25Z
Maintainer

In regards to the nested messages: We already have that in the protocol:

mumble/src/Mumble.proto

Lines 231 to 253 in cd60ea5

    
message BanList {
    message BanEntry {
        // Banned IP address.
        required bytes address = 1;
        // The length of the subnet mask for the ban.
        required uint32 mask = 2;
        // User name for identification purposes (does not affect the ban).
        optional string name = 3;
        // The certificate hash of the banned user.
        optional string hash = 4;
        // Reason for the ban (does not affect the ban).
        optional string reason = 5;
        // Ban start time.
        optional string start = 6;
        // Ban duration in seconds.
        optional uint32 duration = 7;
    }
    // List of ban entries currently in place.
    repeated BanEntry bans = 1;
    // True if the server should return the list, false if it should replace old
    // ban list with the one provided.
    optional bool query = 2 [default = false];
}

and it just makes a lot of sense

The nested message design and the use of oneof field will increase the length of the packet for every "VoiceTransport" by 1 or 2 bytes.

Where did you find this information? In the docs I can only find

If you have a message with many optional fields and where at most one field will be set at the same time, you can enforce this behavior and save memory by using the oneof feature.

which more or less claims the opposite behavior...

And if it's the oneof that feels wrong to you, then we could also use

message Capabilities {
    message VoiceTransport {
        enum Official {
            UDP_OCB2 = 0;
            UDP_OCB3 = 1;
            UDP_ChaCha = 2;
            QUIC = 3;
            WebRTC = 4;
        }
        optional Official official = 1;
        optional string custom = 2;
    }

    repeated VoiceTransport voice_transports = 1;
}

However the semantics of oneof are very clear and are exactly what we want: Only one of the 2 fields may be set for any given VoiceTransport message. Thus I think the use of oneof actually makes it perfectly clear what it is that we want.
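
To show what the two-optionals variant leaves to the receiver, here is a small sketch. The struct and the check are hypothetical and only model the decoded message; they are not generated protobuf code.

```cpp
#include <optional>
#include <string>

enum class Official { UDP_OCB2 = 0, UDP_OCB3 = 1, UDP_ChaCha = 2, QUIC = 3, WebRTC = 4 };

// Plain-optional variant of VoiceTransport: nothing in the type prevents
// both fields (or neither) from being set, so the receiver must validate.
struct VoiceTransportEntry {
    std::optional<Official> official;
    std::optional<std::string> custom;
};

// An entry is well-formed only if exactly one of the two fields is set.
bool isWellFormed(const VoiceTransportEntry &entry) {
    return entry.official.has_value() != entry.custom.has_value();
}
```

With oneof this check is implied by the message definition itself instead of living in every implementation.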

The gain is maybe some people's aesthetic preference is satisfied, but is it really worth it, given the reasons mentioned above?

It's not about aesthetics, it's about semantics. I think this is somewhat similar to

std::string name = "Jessy";
int age = 42;
int motivation = -100;

and

Worker worker("Jessy", 42, -100);

in both cases you can work with the respective data but the semantics of these fields actually belonging together is much better expressed by encapsulating them in a Worker object.
I think the same applies to our protocol list. What we want is a list of available protocols in the order the client prefers them. So logically what we want is a single object.
In my original suggestion however we get 2 physical objects that represent this single logical object and in order to do so, we have to pull out some "magic" interconnections between the two.
This can lead to people not implementing this "magic" correctly or to people simply forgetting about the second object, since logically we only want a single one...
With the oneof solution you are forced to check which one is set and thus you can't really forget about the existence of the other one.

I don't think there would be a lot of third-party clients or servers that would support custom protocols in the future.

Probably true, but how does this influence the design of the protocol specification (if the argument is not "don't support custom protocols at all")? 🤔

Johni0702 · 2020-06-18T09:22:28Z

The nested message design and the use of oneof field will increase the length of the packet for every "VoiceTransport" by 1 or 2 bytes.

Where did you find this information? In the docs I can only find

If you have a message with many optional fields and where at most one field will be set at the same time, you can enforce this behavior and save memory by using the oneof feature.

which more or less claims the opposite behavior...

I think the mentioned saving of memory only refers to how the decoded packet is stored in memory.

The overhead would be from the nested message. With protobuf any receiver always has to be able to skip over unknown parts, so we need to encode the length of the embedded thing, even if we don't usually need it:
https://developers.google.com/protocol-buffers/docs/encoding#embedded_messages
That's one byte (or for large messages which here isn't the case, more than one byte).
The other byte of overhead is from the fact that we need to specify two field ids per entry (i.e. voice_transports and official), whereas the flat approach only has to specify one id per official entry (i.e. voice_protocols). With custom protocols, both implementations use the same amount of bytes because the flat approach needs two bytes to add a CUSTOM entry to its voice_protocols.
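
The byte counts can be checked with a little arithmetic on the wire format. This sketch assumes field numbers 1 and 2 (the drafts above don't fix them), non-packed repeated fields, and small enum values; the function names are made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Number of bytes a value occupies as a protobuf base-128 varint.
std::size_t varintSize(std::uint64_t value) {
    std::size_t bytes = 1;
    while (value >= 0x80) {
        value >>= 7;
        ++bytes;
    }
    return bytes;
}

// A field key is (field_number << 3) | wire_type, encoded as a varint.
std::size_t tagSize(std::uint32_t fieldNumber) {
    return varintSize(static_cast<std::uint64_t>(fieldNumber) << 3);
}

// Flat scheme, official entry: one enum element in voice_protocols.
std::size_t flatOfficial(std::uint32_t enumValue) {
    return tagSize(1) + varintSize(enumValue);
}

// Flat scheme, custom entry: a Custom placeholder plus one string.
std::size_t flatCustom(const std::string &name) {
    return flatOfficial(0) + tagSize(2) + varintSize(name.size()) + name.size();
}

// Nested scheme: outer key + length prefix + embedded VoiceTransport.
std::size_t nestedOfficial(std::uint32_t enumValue) {
    std::size_t inner = tagSize(1) + varintSize(enumValue);
    return tagSize(1) + varintSize(inner) + inner;
}

std::size_t nestedCustom(const std::string &name) {
    std::size_t inner = tagSize(2) + varintSize(name.size()) + name.size();
    return tagSize(1) + varintSize(inner) + inner;
}
```

Under these assumptions an official entry costs 2 bytes flat and 4 bytes nested, while a custom entry costs the same in both schemes.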

Anyhow, I think this 100% applies:

The overhead may be neglected considering that this is a message only sent a few times during one session.

But I disagree with the conclusion:

But that is no reason to add unnecessary overhead. Otherwise, there would be more and more unnecessary overhead in the future, which may affect the overall performance at some point.

There already is lots of unnecessary overhead in protobuf. We could save lots of bytes by using a custom binary protocol. But I think it's worth "wasting" the bytes in exchange for a protocol with less ambiguous semantics and I think the same applies here.

0 replies

TerryGeng · 2020-06-18T15:03:50Z

TerryGeng
Jun 18, 2020
Author

@Johni0702 @Krzmbrzl Thanks for your explanation, and thanks for the time and effort you have spent on reviewing my code and ideas.

However, I just can't see why my simple implementation is ambiguous.
And in the end, our code is run by the machine. If we make everything extremely ‘fool-proof’, even when that means using things that are tedious and nested and create more overhead, then I think we have mistaken the whole point of developing an efficient program. I think my document is clear enough, and it even provides examples. Everyone who has read the document will immediately understand what is going on inside.

So far, nothing that has been said amounts to a definite rebuttal of my idea above. I also want to say this: since I'm the one who is actually writing this part, I hope my voice will be respected by other people, especially given that the two options have no significant differences and are merely aesthetically different.

That being said, I'm always willing to listen to the advice of others, given it is indeed persuasive, or a significant number of developers are welcoming the other option, so I would know it is a tradition/custom to obey. Thus I'm looking forward to other people's opinions, and thanks for your time.

0 replies

Johni0702 · 2020-06-18T16:34:12Z

Johni0702
Jun 18, 2020

Just to make sure I understand your position:
Your primary reason for preferring the flat over the nested version is that the nested one is structurally more complicated and not as easy to grasp if you're not already familiar with how protobuf does nested messages and oneof, while at the same time not having any real advantage.

So if protobuf would allow something like the following, you would totally be down for that, right?

repeated VoiceTransport|string voice_transports;

(where | means either … or … as in e.g. typescript)

If so, then we do agree on the downside of the nested part but disagree on it not having any semantic (i.e. not purely aesthetic) advantage.
In that case, I'll try to rephrase my point in the hope of convincing you that there is a real advantage.

However, I just can't see why my simple implementation is ambiguous.

Maybe ambiguous wasn't the right word to use there. I do not at all think the flat version is ambiguous with the comments. I'm not even sure if it would be without the comments: The relation between the two fields is really the only sensible one (I can think of at least).
My issue with it is more the fact that it allows for more mistakes to happen, i.e. there's more ways to construct invalid messages (mainly by not having the same number of CUSTOM as you have entries for the custom_protocols field) than with the nested version.

As a result, there's more ways to accidentally send incorrect messages and more things you need to validate when parsing such messages. And given enough opportunities, mistakes will happen.
Which is why I have a strong preference for encoding invariants in the types, so the compiler will yell at me, rather than in comments or implied, where one can forget about them.
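
As a rough illustration of what "encoding invariants in the types" can look like on the C++ side, here is a sketch using std::variant (C++17) as the closest equivalent of the hypothetical `VoiceTransport|string` above. The names OfficialProtocol, VoiceTransport and describe are assumptions made for this example, not identifiers from the Mumble sources.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for the enum of official protocols.
enum class OfficialProtocol { UDP_OCB2 };

// Each entry is either an official protocol or a custom name, never both and
// never neither, so there are no two lists whose lengths have to match.
using VoiceTransport = std::variant<OfficialProtocol, std::string>;

// Returns a printable description of a single entry.
std::string describe(const VoiceTransport &transport) {
    if (const auto *custom = std::get_if<std::string>(&transport)) {
        return "custom:" + *custom;
    }
    switch (std::get<OfficialProtocol>(transport)) {
        case OfficialProtocol::UDP_OCB2:
            return "official:udp_ocb2";
    }
    return "official:unknown";
}
```

A list such as `std::vector<VoiceTransport>{ OfficialProtocol::UDP_OCB2, std::string("custom_x") }` cannot express "a CUSTOM marker without a name", which is the kind of mistake the flat layout has to catch at runtime.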

Case in point:
How would you parse a message such as the following?

supported_protocols = [CUSTOM, CUSTOM]
custom_protocols = ['custom_x']

By the given rules, this is obviously an invalid message and the server should probably just ignore it.
But did you remember to check for that?

mumble/src/murmur/Messages.cpp

Lines 165 to 179 in b78481f

    
    int custom_index = 0;
    for (int i = 0; i < msg.supported_protocols_size(); ++i){
    	auto protocol_type = msg.supported_protocols(i);
    	switch (protocol_type) {
    		case MumbleProto::Capabilities_VoiceProtocol_UDP_OCB2:
    			uSource->qlSupportedVoiceProtocols.append(VoiceProtocol(VoiceProtocolType::UDP_OCB2));
    			break;
    		case MumbleProto::Capabilities_VoiceProtocol_CUSTOM:
    			uSource->qlSupportedVoiceProtocols.append(VoiceProtocol(msg.custom_protocols(custom_index)));
    			++custom_index;
    			break;
    		default:
    			continue;
    	}
    }
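
For comparison, a minimal sketch of what the missing check could look like for the flat layout. It deliberately uses plain standard-library types instead of the generated MumbleProto classes and Qt containers, so WireProtocol, ParsedProtocol and parseFlat are hypothetical names, and rejecting the whole message is just one possible policy.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the generated protobuf enum.
enum class WireProtocol { UDP_OCB2, CUSTOM };

struct ParsedProtocol {
    WireProtocol type;
    std::string customName; // only meaningful when type == CUSTOM
};

// Returns std::nullopt when the number of CUSTOM markers does not match the
// number of custom names, i.e. when the message violates the documented rule.
std::optional<std::vector<ParsedProtocol>> parseFlat(const std::vector<WireProtocol> &supported,
                                                     const std::vector<std::string> &customNames) {
    std::vector<ParsedProtocol> result;
    std::size_t customIndex = 0;
    for (WireProtocol type : supported) {
        if (type == WireProtocol::CUSTOM) {
            if (customIndex >= customNames.size()) {
                return std::nullopt; // more CUSTOM markers than names
            }
            result.push_back({ type, customNames[customIndex++] });
        } else {
            result.push_back({ type, std::string() });
        }
    }
    if (customIndex != customNames.size()) {
        return std::nullopt; // names left over without a CUSTOM marker
    }
    return result;
}
```

Fed the example message above (two CUSTOM markers, one name), this sketch returns std::nullopt instead of reading past the end of the name list.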

0 replies

Krzmbrzl · 2020-06-18T17:16:39Z

Krzmbrzl
Jun 18, 2020
Maintainer

My points were also more directed at 3rd-party implementations that want to implement it. They can get it wrong when doing so. And the potential to do so is higher with your suggestion than with Johni's (imo).

And of course I also think the biggest difference is not aesthetic but semantic (as Johni already explained)

0 replies

felix91gr · 2020-06-19T18:58:01Z

felix91gr
Jun 19, 2020

I'm sorry I can't provide useful insights into the crypto discussion. I think I could, but I don't have much energy available for me this week and I'm already trying to finish my thesis application... >.<

So I'll just leave a small feedback instead:

(from @TerryGeng)

If we make everything extremely ‘fool-proof’, even when that means using things that are tedious and nested and create more overhead, then I think we have mistaken the whole point of developing an efficient program.

Regarding that overhead: Remember that this is C++ and that modern compilers can be extremely smart. Not all of C++'s abstractions can be considered "zero-cost" in the sense of runtime cost, but it may well be that this particular set of abstractions can be optimized away by the compiler.

If performance overhead is the main issue, I would try to benchmark both solutions and see if there's a difference.

We could also look at the compiled assembly for that matter. In fact, that might be easier now that I think about it 🤔

Anyway, I digress. I think it's important to consider the compiler's role here. Nowadays we can get away with pretty good abstractions without any runtime cost. If overhead's the issue, I'd test that it is indeed before picking one or the other. :3

0 replies

Krzmbrzl · 2020-06-19T19:04:24Z

Krzmbrzl
Jun 19, 2020
Maintainer

I don't think Terry's concerns were about runtime performance. I think his main point was that it sucked to write the necessary code...

0 replies

felix91gr · 2020-06-19T19:06:53Z

felix91gr
Jun 19, 2020

Really? To me, it felt that at least part of them were: #4248 (comment)

0 replies

TerryGeng · 2021-01-23T12:43:42Z

TerryGeng
Jan 23, 2021
Author

@Krzmbrzl @Johni0702
I remember that the last time, this discussion and #4299 stalled because people couldn't agree on how to design the VoiceTransport protocol.
Now I would like to return to the table and restart the work of #4299 since it looks like OCB2 is making our life unnecessarily harder.

What about using an array of strings instead of a combination of enumerations and strings? The main concern would be format and overhead.

I think as long as we clearly document what strings to use, like udp_chacha20, udp_ocb2, etc., we are fine. The main concern with using strings is that unofficial protocols could end up with multiple aliases, but the string-and-enum-combined solution doesn't solve this either.
As for the overhead, this negotiation would only happen at the beginning of each session, so the overhead of strings doesn't sound like a big problem.

0 replies

Johni0702 · 2021-01-23T13:58:59Z

Johni0702
Jan 23, 2021

What about using an array of strings instead of a combination of enumerations and strings? The main concern would be format and overhead.

I'd be fine with that. While it has the same drawbacks as using raw strings instead of enums in any other language, I don't think any of those really apply here cause you'll generally convert it into a real enum shortly after parsing anyway (and are almost guaranteed to notice any typos during testing).
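
A minimal sketch of that "convert it into a real enum shortly after parsing" step, assuming the two example names mentioned above (udp_ocb2 and udp_chacha20). VoiceProtocolKind and parseProtocolName are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical internal enum; only the string form would travel over the wire.
enum class VoiceProtocolKind { UdpOcb2, UdpChaCha20 };

// Maps a protocol name received from the peer to the internal enum.
// Unknown names (including typos) yield std::nullopt and can simply be ignored.
std::optional<VoiceProtocolKind> parseProtocolName(const std::string &name) {
    if (name == "udp_ocb2") {
        return VoiceProtocolKind::UdpOcb2;
    }
    if (name == "udp_chacha20") {
        return VoiceProtocolKind::UdpChaCha20;
    }
    return std::nullopt;
}
```

After this single conversion point, the rest of the code only ever deals with the enum, so a misspelled string can only be introduced in one place.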
This is what I wrote initially:

I'd even go a step further and not use an enum at all (i.e. just use "udp_ocb3"). Overhead should be negligible since it's only sent once but it has the huge advantage that third-party clients/servers can add support for ciphers/transports without having to go and reserve an enum constant in the main Mumble repo (or risking collisions).
Shorter names (such as "udp_ocb3", "quic" or "webrtc") would nevertheless be reserved as otherwise there might be two incompatible implementations but e.g. the current mumble-web client could advertise the "https://github.com/Johni0702/rust-mumble-protocol/blob/797ea4a4471f9e3d3221c46483ff414c0a70e236/protos/MumbleWithWebRTC.proto" transport (or maybe even just "proposal-3561-v1" for those that already have an open ticket on the main Mumble repo) and the server (or in the WebRTC case, for now, a proxy before the server) could immediately tell the client if it supports that.

The issues brought up in response:

However by also supporting the enum, we can always make sure that the official / standard protocols are not defined multiple times (aka reserved) while still allowing for experimental / unofficial protocols via the generic string field.

can imo be solved by just having a comment for the field which explicitly declares which protocol names are officially supported, that URLs may be used for custom protocols and that all other values are reserved for future use.
Really the only tangible thing we gain by using enums is typo resistance but since everyone will ignore unknown protocols anyway, you'll very quickly notice if you did typo.

5 replies

Krzmbrzl Jan 23, 2021
Maintainer

I have to say I still very much like Johni's suggestion for the nested enum/string approach using oneof. This is semantically the clearest one, and I am really of the opinion that good semantics are key to having easily maintainable software.

That being said, I think using raw strings can be tailored to our needs by a simple rule: all officially supported protocols have to start with the prefix official_ (or alternatively mumble_). And these prefixes must not be used by 3rdParty applications for providing a custom protocol flavor.
This could also be reversed by saying that custom protocols have to start with e.g. external_ but I think the chance of a 3rdParty dev missing this rule and thereby accidentally screwing up the system is higher than in the case that we take up the burden of having to use the prefix.
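
To make the prefix rule concrete, here is a small sketch of how a receiver could classify incoming names. The prefix official_ is the one proposed above; the function names, the NameClass enum and the example name official_udp_ocb2 are assumptions for illustration only.

```cpp
#include <cassert>
#include <set>
#include <string>

// Reserved prefix as proposed in this comment.
const std::string kOfficialPrefix = "official_";

enum class NameClass {
    Official,        // reserved prefix and known to this implementation
    ReservedUnknown, // reserved prefix but not known (newer protocol or a typo)
    Custom           // no reserved prefix, i.e. a 3rd-party protocol
};

// True if `name` starts with the reserved prefix.
bool usesOfficialPrefix(const std::string &name) {
    return name.compare(0, kOfficialPrefix.size(), kOfficialPrefix) == 0;
}

// Classifies a received protocol name against the set of official names
// that this implementation knows about.
NameClass classify(const std::string &name, const std::set<std::string> &knownOfficial) {
    if (!usesOfficialPrefix(name)) {
        return NameClass::Custom;
    }
    return knownOfficial.count(name) > 0 ? NameClass::Official : NameClass::ReservedUnknown;
}
```

The ReservedUnknown case is where an implementation could log a warning, since a name in the reserved namespace that nobody recognises is either from a newer version or misspelled.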

I guess with such an approach we could solve the issue at hand. As Johni already mentioned however this will not guard against typos, which is what enums are really great at. In the code base we can of course transform the value into an enum, but imo it'd be nicer if the protocol itself was to enforce enum usage here. In the end that's the problem of a 3rdParty dev so I guess I could live without that.

Really the only tangible thing we gain by using enums is typo resistance but since everyone will ignore unknown protocols anyway, you'll very quickly notice if you did typo.

Well not necessarily. Imagine a client sending multiple possible ciphers and in this list the first one (the one the client prefers to use) has a typo but there is another entry in the list that is typed correctly and that the server supports. Given that the server will simply ignore ciphers it does not understand, it will not complain about the typo and will silently use the other entry.
This could then lead to the preferred cipher never being used as there is not necessarily an error that is emitted if there's a typo.
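
The scenario can be sketched as follows. This is not the negotiation code from #4299, just an assumed "first client entry the server supports wins" policy with hypothetical names (NegotiationResult, negotiate). Recording the skipped entries is one way an implementation could at least log the silent fallback.

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <string>
#include <vector>

struct NegotiationResult {
    std::optional<std::string> chosen; // first client entry the server supports, if any
    std::vector<std::string> skipped;  // higher-priority entries the server did not recognise
};

// Walks the client's list in order of preference and picks the first entry
// the server supports. Unrecognised entries are ignored but remembered.
NegotiationResult negotiate(const std::vector<std::string> &clientPreference,
                            const std::set<std::string> &serverSupported) {
    NegotiationResult result;
    for (const std::string &name : clientPreference) {
        if (serverSupported.count(name) > 0) {
            result.chosen = name;
            break;
        }
        result.skipped.push_back(name);
    }
    return result;
}
```

If the client sends a misspelled first entry followed by udp_ocb2, the negotiation still succeeds with udp_ocb2, and only the skipped list reveals that the preferred entry was never considered.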

Johni0702 Jan 23, 2021

Really the only tangible thing we gain by using enums is typo resistance but since everyone will ignore unknown protocols anyway, you'll very quickly notice if you did typo.

Well not necessarily. Imagine a client sending multiple possible ciphers and in this list the first one (the one the client prefers to use) has a typo but there is another entry in the list that is typed correctly and that the server supports. Given that the server will simply ignore ciphers it does not understand, it will not complain about the typo and will silently use the other entry.
This could then lead to the preferred cipher never being used as there is not necessarily an error that is emitted if there's a typo.

I'm assuming that they actually test that preferred cipher after implementing it and that should be the point where they'll notice that it's not being used.

Krzmbrzl Jan 23, 2021
Maintainer

Probably likely, yes 😅

Johni0702 Jan 23, 2021

That being said, I think using raw strings can be tailored to our needs by a simple rule: all officially supported protocols have to start with the prefix official_ (or alternatively mumble_). And these prefixes must not be used by 3rdParty applications for providing a custom protocol flavor.
This could also be reversed by saying that custom protocols have to start with e.g. external_ but I think the chance of a 3rdParty dev missing this rule and thereby accidentally screwing up the system is higher than in the case that we take up the burden of having to use the prefix.

I was going to advocate for 3rd parties to use URLs only but I can see how they might not read enough of the docs to know that, so I think having a prefix for official protocols isn't too bad of a price to pay.
Another alternative for the prefix (inspired by Matrix) would be m. (short for m~~atrix~~umble). Using a dot to make it clear that it's a namespace and not just another layer like in udp_ocb3.

Krzmbrzl Jan 23, 2021
Maintainer

Using a dot is a good idea. I think I would prefer to not use non-trivial abbreviations for the sake of clarity. Using m. makes sense if you know what this is supposed to mean, but if you don't then you probably won't find that out easily 🤔
The additional overhead of a few characters is probably negligible compared to using strings instead of enums in the first place anyway (and as was also said here before: we only send this message once, so it shouldn't matter all that much) 🤷
