TomTP is a UDP-based transport protocol that takes an "opinionated" approach, similar to QUIC, but focuses on providing reasonable defaults rather than many options. The goals are low complexity, simplicity, and security, while still being reasonably performant.
TomTP is peer-to-peer (P2P) friendly, meaning it allows easy integration of NAT traversal, such as UDP hole punching, and supports multi-homing, where data packets can come from different source addresses. It does not have a TIME_WAIT state that could exhaust ports, and it does not open a socket for each connection, thus allowing many short-lived connections.
- https://github.com/Tribler/utp4j
- https://github.com/quic-go/quic-go
- https://github.com/skywind3000/kcp (no encryption)
- https://github.com/johnsonjh/gfcp (golang version)
- https://eprints.ost.ch/id/eprint/846/
- https://eprints.ost.ch/id/eprint/879/
- https://eprints.ost.ch/id/eprint/979/
- Always encrypted (curve25519/chacha20-poly1305) - renegotiation of the shared key on sequence number overflow (TBD)
- Support for streams
- 0-RTT (first request always needs to be equal or larger than its reply -> fill up to MTU)
- No perfect forward secrecy for 1st message if payload is sent in first message (request and reply)
- P2P friendly (id peers by ed25519 public key, for both sides)
- Only FIN/FINACK teardown
- Less than 2k LoC, currently at 1.8k LoC
echo "Source Code LoC"; ls -I "*_test.go" | xargs tokei; echo "Test Code LoC"; ls *_test.go | xargs tokei
Source Code LoC
===============================================================================
Language Files Lines Code Comments Blanks
===============================================================================
Go 12 2203 1773 89 341
Markdown 1 177 0 133 44
===============================================================================
Total 13 2380 1773 222 385
===============================================================================
Test Code LoC
===============================================================================
Language Files Lines Code Comments Blanks
===============================================================================
Go 5 1366 959 195 212
===============================================================================
Total 5 1366 959 195 212
===============================================================================
- Every node in the world is reachable via the network within 1 sec. Max RTT is 2 sec
- The sequence number is 64 bit. Assume packets of 1400 bytes are in flight for 1 sec and the worst-case reordering swaps the first and the last packet. The in-flight bandwidth that a 64-bit sequence number can handle in this worst case is 2^64 * 1400 * 8 -> ~206 Zbit/sec
- Current fastest speed: 22.9 Pbit/sec - multimode (https://newatlas.com/telecommunications/datat-transmission-record-20x-global-internet-traffic/)
- Commercial: 402 Tbit/sec - singlemode (https://www.techspot.com/news/103584-blistering-402-tbs-fiber-optic-speeds-achieved-unlocking.html)
However, the receive window buffer is the bottleneck here, as we would need to store the out-of-order packets, and the receive window would need to be at least 1400 x 2^63 bytes. Thus, the sequence number length is not the bottleneck.
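The bandwidth estimate above can be checked with a few lines of Go. This is only a sketch of the arithmetic; the function name `maxInFlightBits` is made up for illustration, and floating point is used because 2^64 does not fit into a `uint64`.

```go
package main

import (
	"fmt"
	"math"
)

// maxInFlightBits returns how many bits can be in flight per second if each
// of the 2^seqBits sequence numbers labels one packet of packetBytes bytes
// and all of them are sent within one second.
// (Hypothetical helper, not part of TomTP.)
func maxInFlightBits(seqBits int, packetBytes float64) float64 {
	return math.Pow(2, float64(seqBits)) * packetBytes * 8
}

func main() {
	bits := maxInFlightBits(64, 1400)
	fmt.Printf("%.1f Zbit/sec\n", bits/1e21) // prints "206.6 Zbit/sec"

	// Compare with the 22.9 Pbit/sec record mentioned above.
	fmt.Printf("factor over record: %.1e\n", bits/22.9e15)
}
```

Even against the fastest reported transmission speed, the sequence number space leaves a margin of several orders of magnitude.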
Current version: 0
Available types:
- INIT_SND
- INIT_RCV
- MSG
INIT_SND:
- Header (9 bytes):
[2bit type + 6bit crypto version | pubKeyIdShortRcv 64bit XOR pubKeyIdShortSnd 64bit]
- Crypto (64 bytes):
[pubKeyIdSnd 256bit | pubKeyEpSnd 256bit]
- Filler: (min 2 bytes)
[encrypted: fill len 16bit | fill]
- Payload: (min 8 bytes)
[encrypted: payload]
- MAC(16 bytes):
[HMAC-SHA256 of the entire message]
INIT_RCV:
- Header (9 bytes):
[2bit type + 6bit crypto version | pubKeyIdShortRcv 64bit XOR pubKeyIdShortSnd 64bit]
- Crypto (32 bytes):
[pubKeyEpRcv 256bit]
- Payload: (min 8 bytes)
[encrypted: payload]
- MAC(16 bytes):
[HMAC-SHA256 of the entire message]
MSG:
- Header (9 bytes):
[2bit type + 6bit crypto version | pubKeyIdShortRcv 64bit XOR pubKeyIdShortSnd 64bit]
- Encrypted Header (8 bytes):
[encrypted sequence number 64bit]
- Payload: (min 8 bytes)
[encrypted: payload]
- MAC(16 bytes):
[HMAC-SHA256 of the entire message]
The complete reply (INIT_REPLY) must be the same length as or shorter than the initial message (INIT), so the INIT message needs to be filled up. The value pubKeyIdShortRcv (64 bit) XOR pubKeyIdShortSnd (64 bit) is the connection ID (connId), which identifies the connection for multihoming.
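The 9-byte header and the connId derivation could be sketched as follows. This is an illustration, not the TomTP implementation: the numeric values of the message types, the placement of the type in the two high bits, the big-endian byte order, and all function names are assumptions.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const headerSize = 9 // 1 byte type/version + 8 bytes connId

// Message types. The numeric values are assumptions for this sketch.
const (
	typeInitSnd = 0
	typeInitRcv = 1
	typeMsg     = 2
)

// connID XORs the two short public key IDs. XOR is symmetric, so both peers
// compute the same value regardless of who is sender or receiver.
func connID(pubKeyIdShortRcv, pubKeyIdShortSnd uint64) uint64 {
	return pubKeyIdShortRcv ^ pubKeyIdShortSnd
}

// encodeHeader packs 2 bits of type and 6 bits of crypto version into the
// first byte (type in the high bits: an assumption), followed by the connId
// in big-endian order (also an assumption).
func encodeHeader(msgType, cryptoVersion uint8, id uint64) [headerSize]byte {
	var h [headerSize]byte
	h[0] = (msgType&0x03)<<6 | cryptoVersion&0x3f
	binary.BigEndian.PutUint64(h[1:], id)
	return h
}

// decodeHeader reverses encodeHeader.
func decodeHeader(h [headerSize]byte) (msgType, cryptoVersion uint8, id uint64) {
	return h[0] >> 6, h[0] & 0x3f, binary.BigEndian.Uint64(h[1:])
}

func main() {
	id := connID(0x1111, 0x2222)
	h := encodeHeader(typeMsg, 0, id)
	msgType, version, decoded := decodeHeader(h)
	fmt.Printf("type=%d version=%d connId=%#x\n", msgType, version, decoded)
}
```

Because the connId depends only on the two key IDs and not on the source address, a packet arriving from a different address can still be matched to its connection.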
Similar to QUIC, TomTP uses a deterministic way to encrypt the sequence number and payload. However, TomTP applies chacha20poly1305 twice.
To simplify the implementation, the header always maintains a fixed size. While protocols like QUIC optimize by squeezing the header size, this increases implementation complexity. If all similar optimizations were applied to TomTP, it could save 35 bytes per header.
- Payload version (8 bit):
  - Current version: 0
- Payload length (16 bit)
- STREAM_ID (32 bit): represents the stream ID. Size: 4 bytes.
- STREAM_FLAGS (8 bit):
  - bit 0: Set Close flag
  - bit 1: Set RCV Window
  - bit 2: Set ACK Sn
  - bit 3: Set Data
- op. RCV_WND_SIZE (48 bit): size of the receive window.
- op. ACK sn (64 bit): SN to ACK.
- op. DATA (var): DATA
- Total Overhead for Data Packets:
41+2 bytes (for a 1400-byte packet, this results in an overhead of ~3%).
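The payload layout above could be serialized roughly like this. It is a sketch under several assumptions that the text does not confirm: big-endian byte order, the payload length field holding the length of DATA, optional fields appearing in the order listed, and the type and function names. The fixed part (version, length, stream ID, flags) is 8 bytes; together with the 9-byte header, the 8-byte encrypted sequence number, and the 16-byte MAC this gives 41 bytes, which is one reading of the overhead figure.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Bit positions of STREAM_FLAGS, as listed above.
const (
	flagClose  uint8 = 1 << 0
	flagRcvWnd uint8 = 1 << 1
	flagAck    uint8 = 1 << 2
	flagData   uint8 = 1 << 3
)

// version (1) + length (2) + stream ID (4) + flags (1)
const fixedPayloadHeader = 1 + 2 + 4 + 1

// payload is a hypothetical in-memory form; nil pointers and a nil Data
// slice mean that the optional field is absent.
type payload struct {
	StreamID   uint32
	Close      bool
	RcvWndSize *uint64 // only the low 48 bits are written
	AckSn      *uint64
	Data       []byte
}

func encodePayload(p payload) []byte {
	buf := make([]byte, fixedPayloadHeader)
	buf[0] = 0 // payload version
	// Assumption: the length field carries the length of DATA.
	binary.BigEndian.PutUint16(buf[1:3], uint16(len(p.Data)))
	binary.BigEndian.PutUint32(buf[3:7], p.StreamID)

	var flags uint8
	if p.Close {
		flags |= flagClose
	}
	if p.RcvWndSize != nil {
		flags |= flagRcvWnd
		var w [8]byte
		binary.BigEndian.PutUint64(w[:], *p.RcvWndSize)
		buf = append(buf, w[2:]...) // drop the two high bytes: 48 bits
	}
	if p.AckSn != nil {
		flags |= flagAck
		var a [8]byte
		binary.BigEndian.PutUint64(a[:], *p.AckSn)
		buf = append(buf, a[:]...)
	}
	if p.Data != nil {
		flags |= flagData
		buf = append(buf, p.Data...)
	}
	buf[7] = flags
	return buf
}

func main() {
	ack := uint64(42)
	b := encodePayload(payload{StreamID: 1, AckSn: &ack, Data: []byte("hello")})
	fmt.Printf("%d bytes, flags=%04b\n", len(b), b[7])
}
```

Keeping the optional fields behind flag bits means a pure ACK or a window update does not pay for fields it does not use, while the fixed part stays trivially parseable.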
TODO:
To send only a keep-alive, set ACK / payload length to 0 if no packet is scheduled to be sent after 200ms.
No delayed ACKs; ACKs are sent immediately.
Connection context: keeps track of MIN_RTT, the last 5 RTTs, and SND_WND_SIZE (cwnd).
Stream context: keeps track of SEQ_NR per stream and RCV_WND_SIZE (rwnd).
Connection termination: FIN is not acknowledged and is sent best effort; otherwise a timeout closes the connection.
There is a heartbeat every 200ms: a packet with the data flag set but with empty data, sent if no data is present.
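A possible shape for the two contexts and the heartbeat check is sketched below. All type, field, and function names are hypothetical, and the ring buffer for the last 5 RTTs is just one way to keep that history.

```go
package main

import (
	"fmt"
	"time"
)

const (
	rttHistory        = 5
	heartbeatInterval = 200 * time.Millisecond
)

// connContext holds per-connection state (hypothetical layout).
type connContext struct {
	minRTT     time.Duration
	lastRTTs   [rttHistory]time.Duration // ring buffer of recent samples
	samples    int                       // total number of samples seen
	sndWndSize uint64                    // cwnd
}

// streamContext holds per-stream state (hypothetical layout).
type streamContext struct {
	seqNr      uint64
	rcvWndSize uint64 // rwnd
}

// addRTT records a new sample, updating the minimum and overwriting the
// oldest entry once more than rttHistory samples have been seen.
func (c *connContext) addRTT(rtt time.Duration) {
	if c.samples == 0 || rtt < c.minRTT {
		c.minRTT = rtt
	}
	c.lastRTTs[c.samples%rttHistory] = rtt
	c.samples++
}

// recentRTTs returns the stored samples in ring order, which is not
// necessarily chronological once the buffer has wrapped.
func (c *connContext) recentRTTs() []time.Duration {
	n := c.samples
	if n > rttHistory {
		n = rttHistory
	}
	return c.lastRTTs[:n]
}

// heartbeatDue reports whether a heartbeat should be sent, given the time
// elapsed since the last packet was sent.
func heartbeatDue(sinceLastSend time.Duration) bool {
	return sinceLastSend >= heartbeatInterval
}

func main() {
	var c connContext
	for _, ms := range []int{30, 10, 20} {
		c.addRTT(time.Duration(ms) * time.Millisecond)
	}
	fmt.Println(c.minRTT, c.recentRTTs(), heartbeatDue(250*time.Millisecond))
}
```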
This is the good path of creating a stream with or without data:
SND ---> MSG_INIT_DATA
(starting)
MSG_INIT_DATA -----> RCV
MSG_INIT_ACK_DATA <- RCV
(open)
SND <--- MSG_INIT_ACK_DATA
(open)
SND (starting) has a timeout of 3s. If no reply arrives, the stream is closed: (starting) -> (ended).
When RCV receives a MSG_INIT, the stream is in the starting state, and RCV sends a MSG_REP_INIT in any case. After that, the stream is open: (starting) -> (open).
When SND receives MSG_REP_INIT, the stream is set to open: (starting) -> (open).
SND can mark the stream as closed right after MSG_INIT, but the flag is sent out with the 2nd packet, after MSG_REP_INIT was received, not before. If a timeout happened, no packet is sent.
If only one message should be sent, the first message contains the closed flag. RCV sends the reply and goes into the state (ended). When SND receives MSG_REP_INIT, its stream also goes into the state (ended).
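The sender-side transitions described above can be written as a small state machine. This is a sketch of the described behavior, not the project's code; the state, event, and type names are invented for illustration.

```go
package main

import "fmt"

type streamState int

const (
	stateStarting streamState = iota
	stateOpen
	stateEnded
)

type streamEvent int

const (
	evReplyReceived streamEvent = iota // SND received MSG_REP_INIT
	evTimeout                          // no reply within the 3s timeout
)

// sndStream models the SND side of a stream (hypothetical type).
type sndStream struct {
	state           streamState
	closeInFirstMsg bool // the first message carried the close flag
}

// handle applies an event. Only the starting state reacts to these events:
// a reply opens the stream (or ends it if the first message already carried
// the close flag), and a timeout ends it.
func (s *sndStream) handle(e streamEvent) {
	if s.state != stateStarting {
		return
	}
	switch e {
	case evReplyReceived:
		if s.closeInFirstMsg {
			s.state = stateEnded
		} else {
			s.state = stateOpen
		}
	case evTimeout:
		s.state = stateEnded
	}
}

func main() {
	normal := &sndStream{}
	normal.handle(evReplyReceived)
	fmt.Println(normal.state == stateOpen)

	oneShot := &sndStream{closeInFirstMsg: true}
	oneShot.handle(evReplyReceived)
	fmt.Println(oneShot.state == stateEnded)
}
```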
(open)
SND ---> MSG
(open)
MSG -----> RCV
MSG_ACK <- RCV
SND <--- MSG_ACK
Every message needs to be ACKed unless it is a MSG packet with no data, only an ACK.