NIP-104

Double Ratchet E2EE Direct Messages

This NIP defines an encrypted direct messaging scheme that provides double-ratchet E2EE (end-to-end encryption) with forward secrecy & post-compromise (backward) secrecy and allows users to access messages from multiple synced devices.

Context

Currently, one-to-one direct messages (DMs) in Nostr happen via the scheme defined in NIP-04. This NIP is not recommended because, while it encrypts the content of the message, it leaks significant amounts of metadata about the parties involved in the conversation.

NIP-44 adds an updated encryption scheme that reduces some (but not all) of the metadata leakage and better obfuscates message content. However, it stops short of defining a new kind number or scheme for direct messages that use this encryption.

There have been a few separate proposals for new ways to do DMs that do not leak metadata. The most accepted (and recently merged) one is NIP-17, which combines NIP-44 encryption with NIP-59 gift-wrapping to hide the actual direct message inside another set of events, so that it's impossible to see who is talking to whom and when messages pass between the users. This solves the metadata leakage problem and does allow some degree of deniability/repudiation but doesn't solve forward/backward secrecy. That is to say, if a user's private key (or the calculated conversation key used to encrypt messages) is compromised, the attacker will have full access to all past and future DMs sent between those users.

Why is this important?

Without proper E2EE, Nostr cannot be used as the protocol for secure direct messaging clients. While clients like Signal do a fantastic job with E2EE, they still rely on centralized servers and as a result can be shut down by a powerful (i.e. state-level) actor. The goal of Nostr is not only to protect against centralized entities censoring you and your communications, but also protect against the ability of a state-level actor to stop these sorts of services from existing in the first place. By replacing centralized servers with decentralized relays, we make it nearly impossible for a centralized actor to completely stop communications between individual users.

How does it work?

This scheme relies on incrementing (ratcheting) three sets of keys (using 2 independent ratchet mechanisms) on each message such that the two client applications can decrypt messages even if they arrive out of order or get lost and can still function if either party is offline.

This scheme is an adaptation of the Signal protocol, which itself is based on the Off the Record protocol. It uses a slightly simplified version of the X3DH key agreement protocol and the Signal Double Ratchet algorithm.

Let's look at an overview of the steps at the most abstract level. This initial discussion skips over many of the low-level cryptographic details. For more detail, read the two articles linked above and check out the working demo app available on GitHub.

  1. Alice wants to have a private conversation with Bob. Her client sends an encrypted "conversation request" to Bob. To encrypt this first event, Alice's client creates a shared root key from a combination of Alice's and Bob's Nostr keys, an ephemeral Nostr key pair, and Bob's signed prekey, which is published as a replaceable event on Bob's Nostr relays. This key agreement is a version of the X3DH protocol and, because of the way the Diffie-Hellman exchange works, it also authenticates Alice's identity to Bob. The conversation request event (kind: 443) is placed inside the Seal event and NIP-59 gift-wrapped in order to hide as much metadata as possible.
  2. Alice has now given Bob enough information that, when he comes online, his client will see the conversation request and can authenticate that it is indeed from Alice (or, at least, someone who controls Alice's private key). It will then show Bob the conversation request. If he decides to accept this conversation request his client will then, using the shared root key he will also calculate via X3DH, fetch new message events (kind: 444), derive the keys needed to decrypt those messages, and decrypt the messages that Alice has sent. Importantly, Alice DOES NOT have to wait for Bob to accept this conversation request before she can start sending messages to him. Her conversation request event plus the tags included in her message events give Bob all the information needed (e.g. the ephemeral public key used to ratchet the shared root key + additional data) in order to derive the keys needed to decrypt her messages and continue the ratchet process.
  3. The ratchet process continues in line with the Signal Double Ratchet algorithm. It's important to note that the keys involved here are all symmetrical derived keys that are deleted as the conversation progresses. This includes a mechanism that allows for processing messages that arrive out of order or don't arrive at all.

Details

X3DH Key Exchange

For Alice to start a conversation with Bob, we need to have 4 sets of keys.

  1. Alice's Nostr keypair (npub/nsec)
  2. Alice's Ephemeral keypair (one created just for this initial conversation request)
  3. Bob's Nostr keypair (npub/nsec)
  4. Bob's Prekey keypair (Published in a kind:10443 event)

By combining these keys, we create a shared secret key that also authenticates that Alice is who she says she is. The prekey (which should be rotated regularly) and the ephemeral key (one-time use) provide some degree of replay attack protection.

The shared secret key that is output from this combination of Diffie-Hellman calculations becomes the initial root key (described below) from which all other keys are derived.
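The combination described above can be sketched as follows. This is a minimal illustration, not the normative derivation: it assumes the three Diffie-Hellman pairings used by Signal's X3DH without one-time prekeys (identity with prekey, ephemeral with identity, ephemeral with prekey), and it uses Node's built-in secp256k1 ECDH in place of real Nostr key handling. The function names, the zero salt, and the `nip104-x3dh-sketch` info label are hypothetical and not defined by this NIP.

```typescript
import { createECDH, hkdfSync } from "node:crypto";

type KeyPair = ReturnType<typeof createECDH>;

// Generate a fresh secp256k1 key pair (stand-in for Nostr, ephemeral, and prekey pairs).
function newKeyPair(): KeyPair {
  const keyPair = createECDH("secp256k1");
  keyPair.generateKeys();
  return keyPair;
}

// Concatenate the DH outputs and run them through HKDF-SHA256 to get a 32-byte root key.
// The salt and info values are assumptions made for this sketch.
function deriveRootKey(dhOutputs: Buffer[]): Buffer {
  const inputKeyMaterial = Buffer.concat(dhOutputs);
  return Buffer.from(
    hkdfSync("sha256", inputKeyMaterial, Buffer.alloc(32), "nip104-x3dh-sketch", 32)
  );
}

const aliceIdentity = newKeyPair();
const aliceEphemeral = newKeyPair();
const bobIdentity = newKeyPair();
const bobPrekey = newKeyPair();

// Alice's side: her private keys combined with Bob's public keys.
const aliceRootKey = deriveRootKey([
  aliceIdentity.computeSecret(bobPrekey.getPublicKey()), // DH(identity_A, prekey_B)
  aliceEphemeral.computeSecret(bobIdentity.getPublicKey()), // DH(ephemeral_A, identity_B)
  aliceEphemeral.computeSecret(bobPrekey.getPublicKey()), // DH(ephemeral_A, prekey_B)
]);

// Bob's side: his private keys combined with Alice's public keys, in the same order.
const bobRootKey = deriveRootKey([
  bobPrekey.computeSecret(aliceIdentity.getPublicKey()),
  bobIdentity.computeSecret(aliceEphemeral.getPublicKey()),
  bobPrekey.computeSecret(aliceEphemeral.getPublicKey()),
]);

console.log(aliceRootKey.equals(bobRootKey));
```

Because the first pairing involves Alice's identity key, Bob can only arrive at the same root key if the request really came from someone holding Alice's private key.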

KDF Chains

The double ratchet algorithm used here depends on KDF (key derivation function) chains. If you're not familiar with what those are, the Signal docs have a great description of KDF chains. You should read it before moving on.

NB: Both the Diffie-Hellman ratchet and the sending/receiving chain ratchet use symmetric encryption, not asymmetric encryption. This is for very good reasons but is different from how most encryption so far on Nostr has been done (via asymmetric encryption).

Three Chains

This algorithm depends on keeping 3 KDF chains in sync between the two parties. Each chain's latest output key is used to derive other values.

  1. Root Chain Key (RK)
  2. Sending Chain Key (CKs)
  3. Receiving Chain Key (CKr)

Root Chain Key (Diffie-Hellman Ratchet)

The initial key for the root chain is calculated based on the X3DH key exchange described above. From that point onwards, each time the DH ratchet is turned, a new RK and a new CKs or CKr is output. This ratchet provides post-compromise security. Without it, an attacker who steals one participant's CKs and CKr could calculate all future message keys and encrypt/decrypt future messages. Ratcheting the root chain breaks this by creating a new RK and new keys for each participant's sending and receiving chains. This ratchet is turned twice each time the active participant changes. For example, if Alice sends 1 message, Bob then sends 3, and Alice sends 1 more, each participant will have performed the DH ratchet 6 times: twice for each change in sender, so that we can calculate a new CKs and CKr.
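A single turn of the root chain might look like the sketch below. It assumes HKDF-SHA256 with the current RK as the salt and the fresh Diffie-Hellman output as the input key material, which is the construction the Signal Double Ratchet documentation recommends. This NIP does not fix these details, so `kdfRootKey`, the `nip104-root-sketch` info label, and the placeholder inputs are all hypothetical.

```typescript
import { hkdfSync } from "node:crypto";

// One root-chain step: mix a fresh DH output into the root key.
// Returns the next root key and a new chain key (a CKs or CKr, depending on direction).
function kdfRootKey(
  rootKey: Uint8Array,
  dhOutput: Uint8Array
): { rootKey: Buffer; chainKey: Buffer } {
  // Root key as salt, DH output as input key material; the info label is an assumption.
  const okm = Buffer.from(hkdfSync("sha256", dhOutput, rootKey, "nip104-root-sketch", 64));
  return { rootKey: okm.subarray(0, 32), chainKey: okm.subarray(32, 64) };
}

// Placeholder values standing in for the X3DH output and two DH results.
const initialRootKey: Buffer = Buffer.alloc(32, 0x01);
const dhWithTheirNewKey: Buffer = Buffer.alloc(32, 0x11);
const dhWithOurNewKey: Buffer = Buffer.alloc(32, 0x22);

// A change of sender turns the ratchet twice:
// first to derive the new receiving chain key, then to derive the new sending chain key.
const receivingStep = kdfRootKey(initialRootKey, dhWithTheirNewKey);
const sendingStep = kdfRootKey(receivingStep.rootKey, dhWithOurNewKey);

console.log(receivingStep.chainKey.toString("hex")); // new CKr
console.log(sendingStep.chainKey.toString("hex")); // new CKs
```

Each step consumes the previous RK, so an attacker holding only old chain keys cannot follow the chain once a DH output they do not know has been mixed in.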

Sending Chain & Receiving Chain

Each participant maintains one sending and one receiving chain. Alice's sending chain is Bob's receiving chain and vice versa. This is how the double ratchet algorithm can maintain message order in an asynchronous setting. When either of these chains is ratcheted, the output is a new CK and a message key (MK), which is used to encrypt the content of a message. In this way, an attacker who compromises an MK can only read a single message sent between the parties.
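To illustrate how the two sides stay in sync, here is a small sketch in which Alice's sending chain and Bob's receiving chain start from the same chain key and are stepped independently. It assumes the chain step is an HKDF-Expand (SHA-256, empty info) of the previous chain key into 64 bytes, split into a new CK and an MK. The helper names and the placeholder starting key are hypothetical, and HKDF-Expand is written out with HMAC so the sketch needs only Node's built-in crypto module.

```typescript
import { createHmac } from "node:crypto";

// HKDF-Expand (RFC 5869) with SHA-256, empty info, and exactly 64 bytes of output.
function hkdfExpand64(prk: Uint8Array): Buffer {
  const t1 = createHmac("sha256", prk).update(Buffer.from([0x01])).digest();
  const t2 = createHmac("sha256", prk)
    .update(Buffer.concat([t1, Buffer.from([0x02])]))
    .digest();
  return Buffer.concat([t1, t2]);
}

// One chain step: the first 32 bytes become the next CK, the last 32 bytes the MK.
function stepChain(chainKey: Uint8Array): { chainKey: Buffer; messageKey: Buffer } {
  const expanded = hkdfExpand64(chainKey);
  return { chainKey: expanded.subarray(0, 32), messageKey: expanded.subarray(32, 64) };
}

// Placeholder starting chain key; in practice this comes from the root chain.
const sharedStart: Buffer = Buffer.alloc(32, 7);
let aliceSendingChain: Uint8Array = sharedStart;
let bobReceivingChain: Uint8Array = sharedStart;

const aliceMessageKeys: string[] = [];
const bobMessageKeys: string[] = [];

for (let i = 0; i < 3; i++) {
  const aliceStep = stepChain(aliceSendingChain); // Alice encrypts message i
  const bobStep = stepChain(bobReceivingChain); // Bob decrypts message i
  aliceSendingChain = aliceStep.chainKey;
  bobReceivingChain = bobStep.chainKey;
  aliceMessageKeys.push(aliceStep.messageKey.toString("hex"));
  bobMessageKeys.push(bobStep.messageKey.toString("hex"));
}

console.log(aliceMessageKeys.every((key, i) => key === bobMessageKeys[i]));
```

Each MK is used once and can then be deleted, since neither side needs it to derive later keys.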

NIP-44 Encryption using Message Keys in kind:444 Events

The content field of a kind:444 event is the ciphertext of the message being sent. This content is encrypted using the same process as outlined in NIP-44 with one major difference. Instead of calculating a conversation key using the participants' Nostr identity keys as the inputs, we MUST use the message key (MK) that is the output of the participants' respective sending and receiving chains. This MK changes with each message sent or received, thus providing forward secrecy.

Using nostr-tools NIP-44 v2 encrypt method, you'd do the following:

import { nip44 } from "nostr-tools";
import { expand as hkdf_expand } from "@noble/hashes/hkdf";
import { sha256 } from "@noble/hashes/sha256";

// Use a KDF to output two 32-byte keys

// prevChainKey is the previous KDF chain key for the sending chain
const expanded = hkdf_expand(sha256, prevChainKey, "", 64);

// newChainKey is the new chain key for the sending chain
const newChainKey = expanded.subarray(0, 32);

// messageKey is the key that we'll use to encrypt/decrypt the message content
const messageKey = expanded.subarray(32);

const ciphertext = nip44.v2.encrypt(message_plaintext, messageKey);

Decrypting follows the same process to derive keys but then calls decrypt.

Group Conversations (Out of scope)

A variation of this scheme will enable private group conversations with a few important caveats/changes to the security model. This is beyond the scope of this NIP.

Syncing Conversations on Multiple Devices (Out of scope)

This scheme allows for users to register and concurrently use multiple devices to receive/send messages. The protocol ensures that each device uses separate keys and syncs independently. Read more about this concept in Signal's Sesame protocol docs. Some adaptation for using relays instead of central servers will be required and consistency expectations won't be the same (e.g. we'd use encrypted replaceable events to store some user/device state). This is beyond the scope of this NIP.

New Event Kinds

This NIP introduces three new event kinds. First, we'll look at the two that are related to the conversation itself. These event kinds MUST always be gift-wrapped as per NIP-59 and use NIP-44 encryption for the Seal (kind 13) and Gift Wrap (kind 1059) events.

This combination might seem like overkill given we're further encrypting the actual content of the messages between the users using a double-ratchet symmetrical encryption scheme but this combination significantly decreases the amount of metadata published and improves deniability/repudiation of the conversation.

Conversation Request Event

This event is sent when one user wants to start a conversation with another user. For example, if Alice wants to talk to Bob, her first message will be a conversation request event. This message allows both parties to establish a shared root key via a Diffie-Hellman key exchange and allows Alice to derive the first key in the chain to encrypt a message. An encrypted message MUST be included in the conversation request event to ensure that the first ratchet step is performed. The message can be a default message like, "Hi, I would like to connect." if no message is provided by the user.

{
    "kind": 443,
    "created_at": <unix timestamp in seconds>,
    "pubkey": <pubkey of the sender>,
    "content": <ciphertext of initial message symmetrically encrypted using first sending chain/message key>,
    "tags": [
        ["p", <pubkey of the recipient>],
        ["prekey", <pubkey of the recipient prekey used for DH calculation>],
        ["ephemeral", <pubkey of the ephemeral key used for DH calculation>]
    ],
    "sig": "" // No signature because these are only sent inside gift wraps.
}
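As a rough illustration, a client could assemble the unsigned conversation request like this before sealing and gift-wrapping it. The `buildConversationRequest` helper, its parameter names, and the placeholder hex values are hypothetical; only the kind number, fields, and tag names come from the template above.

```typescript
interface UnsignedEvent {
  kind: number;
  created_at: number;
  pubkey: string;
  content: string;
  tags: string[][];
  sig: string;
}

// Assemble a kind 443 event. It is left unsigned because it only travels inside a gift wrap.
function buildConversationRequest(params: {
  senderPubkey: string;
  recipientPubkey: string;
  prekeyPubkey: string;
  ephemeralPubkey: string;
  ciphertext: string; // first message, already encrypted with the first message key
  createdAt: number; // unix timestamp in seconds
}): UnsignedEvent {
  return {
    kind: 443,
    created_at: params.createdAt,
    pubkey: params.senderPubkey,
    content: params.ciphertext,
    tags: [
      ["p", params.recipientPubkey],
      ["prekey", params.prekeyPubkey],
      ["ephemeral", params.ephemeralPubkey],
    ],
    sig: "",
  };
}

// Placeholder values for the sketch; real values are 32-byte hex-encoded public keys.
const request = buildConversationRequest({
  senderPubkey: "a".repeat(64),
  recipientPubkey: "b".repeat(64),
  prekeyPubkey: "c".repeat(64),
  ephemeralPubkey: "d".repeat(64),
  ciphertext: "<ciphertext placeholder>",
  createdAt: Math.floor(Date.now() / 1000),
});

console.log(JSON.stringify(request, null, 2));
```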

Double Ratchet Direct Message (DRDM) Event

The Double Ratchet Direct Message (DRDM) event is the main event kind that carries the conversation content. This event carries enough information for receivers to properly double ratchet their chains and decrypt the message content with the right symmetrical keys.

{
    "kind": 444,
    "created_at": <unix timestamp in seconds>,
    "pubkey": <pubkey of the sender>,
    "content": <ciphertext of the message symmetrically encrypted with chain/message key>,
    "tags": [
        ["p", <pubkey of the recipient>],
        ["dh_sending", <current DH sending pubkey>],
        ["current_index", <number of current message in the current message chain>],
        ["previous_length", <length of previous message chain>]
    ],
    "sig": "" // No signature because these are only sent inside gift wraps.
}

Required Tags

  • "p": References the recipient
  • "dh_sending": The ephemeral pubkey used in the current Diffie-Hellman ratchet.
  • "current_index": The index of the current message in the current message chain. For handling out of order messages.
  • "previous_length": The length of the previous message chain. For handling out of order messages.

Clients can use any other standard tags they like, for example, "a" or "e" tags to reference previous messages.
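The sketch below shows one way a receiver might use "current_index" to cope with messages that arrive out of order within a single chain. It follows the skipped-message-key approach from the Signal Double Ratchet documentation. It assumes 0-based indexes and the HKDF-Expand chain step (SHA-256, empty info, 64 bytes). `ReceivingState`, `messageKeyFor`, and `MAX_SKIP` are hypothetical names. Handling "previous_length" across a DH ratchet step is omitted for brevity.

```typescript
import { createHmac } from "node:crypto";

// One chain step: HKDF-Expand (SHA-256, empty info) to 64 bytes, split into CK and MK.
function stepChain(chainKey: Uint8Array): { chainKey: Buffer; messageKey: Buffer } {
  const t1 = createHmac("sha256", chainKey).update(Buffer.from([0x01])).digest();
  const t2 = createHmac("sha256", chainKey)
    .update(Buffer.concat([t1, Buffer.from([0x02])]))
    .digest();
  return { chainKey: t1, messageKey: t2 };
}

interface ReceivingState {
  chainKey: Uint8Array;
  nextIndex: number; // index of the next message expected on this chain
  skipped: Map<string, Buffer>; // "<dh_sending>:<index>" -> stored message key
}

const MAX_SKIP = 1000; // hypothetical cap so a peer cannot force unbounded key storage

function messageKeyFor(state: ReceivingState, dhSending: string, currentIndex: number): Buffer {
  const label = `${dhSending}:${currentIndex}`;
  const stored = state.skipped.get(label);
  if (stored) {
    state.skipped.delete(label); // each message key is used once, then deleted
    return stored;
  }
  if (currentIndex < state.nextIndex) throw new Error("message key already used");
  if (currentIndex - state.nextIndex > MAX_SKIP) throw new Error("too many skipped messages");
  // Store the keys of messages we have not seen yet so they can be decrypted later.
  while (state.nextIndex < currentIndex) {
    const skippedStep = stepChain(state.chainKey);
    state.skipped.set(`${dhSending}:${state.nextIndex}`, skippedStep.messageKey);
    state.chainKey = skippedStep.chainKey;
    state.nextIndex += 1;
  }
  const step = stepChain(state.chainKey);
  state.chainKey = step.chainKey;
  state.nextIndex += 1;
  return step.messageKey;
}

// Sender side: derive message keys 0, 1, 2 from a placeholder starting chain key.
const startKey: Buffer = Buffer.alloc(32, 9);
const senderKeys: string[] = [];
let senderChain: Uint8Array = startKey;
for (let i = 0; i < 3; i++) {
  const sent = stepChain(senderChain);
  senderChain = sent.chainKey;
  senderKeys.push(sent.messageKey.toString("hex"));
}

// Receiver side: the messages arrive in the order 2, 0, 1.
const receiver: ReceivingState = { chainKey: startKey, nextIndex: 0, skipped: new Map() };
const dhSendingTag = "placeholder-dh-sending-pubkey";
const received: Array<[number, string]> = [2, 0, 1].map(
  (index): [number, string] => [index, messageKeyFor(receiver, dhSendingTag, index).toString("hex")]
);

console.log(received.every(([index, key]) => key === senderKeys[index]));
```

Keying the stored entries by the "dh_sending" value keeps skipped keys from an older chain separate from those of the current chain.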

Prekey Event

The Prekey event is a simple replaceable event where a user publishes their signed prekey that is required to initiate the Diffie-Hellman key exchange to create a conversation request event. Including a signature on this event proves that the holder of the user's main nostr key (which signs the kind: 10443 event) also has control of the private key associated with the prekey.

In most cases, clients that implement this NIP will manage the creation and regular rotation of this key. It's recommended that clients do so interactively, with user consent, in order to avoid overwriting prekeys created by other clients.

{
    "kind": 10443,
    "created_at": <unix timestamp in seconds>,
    "pubkey": <pubkey>,
    "content": <hex encoded pubkey of the prekey>,
    "tags": [
        ["prekey_sig", <signature generated from hex encoded pubkey of the prekey>]
    ],
    "sig": <signed normally with the user's main Nostr private key>
}

The "prekey_sig" tag value is a Schnorr signature of the sha256 hashed value of the prekey's pubkey, signed with the prekey's private key.

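import { schnorr } from "@noble/curves/secp256k1";
import { sha256 } from "@noble/hashes/sha256";
import { bytesToHex } from "@noble/hashes/utils";

const utf8Encoder = new TextEncoder();

// Generate the prekey pair
const privKey = schnorr.utils.randomPrivateKey();
const pubKey = bytesToHex(schnorr.getPublicKey(privKey));

// Sign the sha256 hash of the hex-encoded prekey pubkey with the prekey's private key
const prekeySig = bytesToHex(
  schnorr.sign(sha256(utf8Encoder.encode(pubKey)), privKey)
);

const prekeyTag = ["prekey_sig", prekeySig];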

Known trade-offs & open questions

  1. Messages are decrypted and stored on each device/client pairing. This means that there is no way to load and decrypt historic messages directly from relays on a different client or device. However, you can create backups of your decrypted conversations that could be loaded into another device (this is beyond the scope of this NIP).
  2. What are potential attack vectors and results? Do we leak a lot more metadata somehow? Can we publish some set of keys over time (incl. private) to increase plausible deniability?
  3. How do we stop large amounts of spam conversation request events? Since they arrive via ephemeral keys we have to decrypt the outer layer of the GW, at a minimum. We can then limit from that point based on WoT?
  4. Should we also have an inbox relays event kind that allows users to specify which relays they want used? That definitely makes it more obvious that these GW events are DMs though.

Resources