From 3214e90cb3ddfe28c2c96802f36d11a7c3c729b8 Mon Sep 17 00:00:00 2001 From: Matthew Hodgson Date: Tue, 22 Oct 2019 12:29:10 +0100 Subject: [PATCH 01/20] MSC2326: Label based filtering --- proposals/2326-label-based-filtering.md | 92 +++++++++++++++++++++++++ 1 file changed, 92 insertions(+) create mode 100644 proposals/2326-label-based-filtering.md diff --git a/proposals/2326-label-based-filtering.md b/proposals/2326-label-based-filtering.md new file mode 100644 index 00000000000..f71093c0f7a --- /dev/null +++ b/proposals/2326-label-based-filtering.md @@ -0,0 +1,92 @@ +# Label based filtering + +## Problem + +Rooms often contain overlapping conversations, which Matrix should help users +navigate. + +## Context + +We already have the concept of 'Replies' to define which messages are +responses to which, which MSC1849 proposes extending into a generic mechanism +for defining threads which could (in future) be paginated both depth-wise and +breadth-wise. Meanwhile, MSC1198 is an alternate proposal for threading, +which separates conversations into high-level "swim lanes" with a new `POST +/rooms/{roomId}/thread` API. + +However, fully generic threading (which could be used to implement forum or +email style semantics) runs a risk of being overly complicated to specify and +implement and could result in feature creep. This is doubly true if you try +to implement retrospective threading (e.g. to allow moderators to split off +messages into their own thread, as you might do in a forum or to help manage +conversation in a busy chatroom). + +Therefore, this is a simpler proposal to allow messages in a room to be +filtered based on a given label in order to give basic one-layer-deep +threading functionality. + +## Proposal + +We let users specify an optional `m.label` field onto events (outside of E2E +contents) which provides a list of freeform text labels for the events they +send. 
Clients can use these to filter the overlapping conversations in a room +into different topics. The labels could also be used when bridging as a +hashtag to help manage the disconnect which can happen when bridging a +threaded room to an unthreaded one. + +Example: + +```json +{ + "type": "m.room.message", + "content": { + "body": "who wants to go down the pub?", + "msgtype": "m.text", + "m.label": [ "#fun" ] + } +} +``` + +```json +{ + "type": "m.room.encrypted", + "content": { + "algorithm": "m.megolm.v1.aes-sha2", + "ciphertext": "AwgAEpABm6.......", + "device_id": "SOLZHNGTZT", + "sender_key": "FRlkQA1enABuOH4xipzJJ/oD8fxiQHj6jrAyyrvzSTY", + "session_id": "JPWczbhnAivenK3qRwqLLBQu4W13fz1lqQpXDlpZzCg", + "m.label": [ "#work" ] + }, +} +``` + +Labels which are prefixed with # are expected to be user-visible and exposed +to the user as a hashtag, letting the user filter their current room by the +various hashtags present within it. + +Clients are expected to explicitly set the label on a message if the user's +intention is to respond as part of a given labelled topic. For instance, if +the user is currently filtered to only view messages with a given label, then +new messages sent should use the same label. Similarly if the user sends a +reply to a given message, that reply should typically use the same labels as +the message being replied to. + +The convention is to use hashtag style human-visible labels prefixed with a #, +but one could also use a unique ID (e.g. thread ID bridged from another +platform) without a # prefix). + +When a user wants to filter a room to given label(s), it defines a filter for +use with /sync or /messages to limit appropriately. This is done by new +`labels` and `not_labels` fields to the `EventFilter` object, which specifies +a list of labels to include or exclude in the given filter. + +## Problems + +Do we care about internationalising hashtags? + +Too many threading APIs? 
+ +## Unstable prefix + +Unstable implementations should hook up `org.matrix.label` rather than `m.label`. \ No newline at end of file From 8c84d7b56ad88f7a8642deee2cc11e4ab2d851c2 Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Mon, 28 Oct 2019 11:28:06 +0000 Subject: [PATCH 02/20] Add links to other MSCs --- proposals/2326-label-based-filtering.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/proposals/2326-label-based-filtering.md b/proposals/2326-label-based-filtering.md index f71093c0f7a..4c6262bb3d2 100644 --- a/proposals/2326-label-based-filtering.md +++ b/proposals/2326-label-based-filtering.md @@ -8,11 +8,12 @@ navigate. ## Context We already have the concept of 'Replies' to define which messages are -responses to which, which MSC1849 proposes extending into a generic mechanism -for defining threads which could (in future) be paginated both depth-wise and -breadth-wise. Meanwhile, MSC1198 is an alternate proposal for threading, -which separates conversations into high-level "swim lanes" with a new `POST -/rooms/{roomId}/thread` API. +responses to which, which [MSC1849](https://github.com/matrix-org/matrix-doc/pull/1849) +proposes extending into a generic mechanism for defining threads which could +(in future) be paginated both depth-wise and breadth-wise. Meanwhile, +[MSC1198](https://github.com/matrix-org/matrix-doc/issues/1198) is an alternate +proposal for threading, which separates conversations into high-level "swim +lanes" with a new `POST /rooms/{roomId}/thread` API. However, fully generic threading (which could be used to implement forum or email style semantics) runs a risk of being overly complicated to specify and @@ -89,4 +90,4 @@ Too many threading APIs? ## Unstable prefix -Unstable implementations should hook up `org.matrix.label` rather than `m.label`. \ No newline at end of file +Unstable implementations should hook up `org.matrix.label` rather than `m.label`. 
From cb0c68fbea42cf3c882f6b093313c86b617e20b9 Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Mon, 28 Oct 2019 22:00:53 +0000 Subject: [PATCH 03/20] Incorporate reviews --- proposals/2326-label-based-filtering.md | 139 ++++++++++++++++++------ 1 file changed, 103 insertions(+), 36 deletions(-) diff --git a/proposals/2326-label-based-filtering.md b/proposals/2326-label-based-filtering.md index 4c6262bb3d2..f95d0bb48dc 100644 --- a/proposals/2326-label-based-filtering.md +++ b/proposals/2326-label-based-filtering.md @@ -7,35 +7,76 @@ navigate. ## Context -We already have the concept of 'Replies' to define which messages are -responses to which, which [MSC1849](https://github.com/matrix-org/matrix-doc/pull/1849) -proposes extending into a generic mechanism for defining threads which could -(in future) be paginated both depth-wise and breadth-wise. Meanwhile, +We already have the concept of 'Replies' to define which messages are responses +to which, which [MSC1849](https://github.com/matrix-org/matrix-doc/pull/1849) +proposes extending into a generic mechanism for defining threads which could (in +future) be paginated both depth-wise and breadth-wise. Meanwhile, [MSC1198](https://github.com/matrix-org/matrix-doc/issues/1198) is an alternate proposal for threading, which separates conversations into high-level "swim lanes" with a new `POST /rooms/{roomId}/thread` API. However, fully generic threading (which could be used to implement forum or email style semantics) runs a risk of being overly complicated to specify and -implement and could result in feature creep. This is doubly true if you try -to implement retrospective threading (e.g. to allow moderators to split off +implement and could result in feature creep. This is doubly true if you try to +implement retrospective threading (e.g. to allow moderators to split off messages into their own thread, as you might do in a forum or to help manage conversation in a busy chatroom). 
-Therefore, this is a simpler proposal to allow messages in a room to be -filtered based on a given label in order to give basic one-layer-deep -threading functionality. +Therefore, this is a simpler proposal to allow messages in a room to be filtered +based on a given label in order to give basic one-layer-deep threading +functionality. ## Proposal -We let users specify an optional `m.label` field onto events (outside of E2E -contents) which provides a list of freeform text labels for the events they -send. Clients can use these to filter the overlapping conversations in a room -into different topics. The labels could also be used when bridging as a -hashtag to help manage the disconnect which can happen when bridging a -threaded room to an unthreaded one. +We let users specify an optional `m.labels` field onto the events. This field +maps key strings to freeform text labels: -Example: +```json +{ + // ... + "m.labels": { + "somekey": "somelabel" + } +} +``` + +Labels which are prefixed with # are expected to be user-visible and exposed to +the user by clients as a hashtag, letting the user filter their current room by +the various hashtags present within it. Labels which are not prefixed with # are +expected to be hidden from the user by clients (so that they can be used as +e.g. thread IDs bridged from another platform). + +Clients can use these to filter the overlapping conversations in a room into +different topics. The labels could also be used when bridging as a hashtag to +help manage the disconnect which can happen when bridging a threaded room to an +unthreaded one. + +Clients are expected to explicitly set the label on a message if the user's +intention is to respond as part of a given labelled topic. For instance, if the +user is currently filtered to only view messages with a given label, then new +messages sent should use the same label. 
Similarly if the user sends a reply to +a given message, that reply should typically use the same labels as the message +being replied to. + +When a user wants to filter a room to given label(s), it defines a filter for +use with /sync or /messages to limit appropriately. This is done by new `labels` +and `not_labels` fields to the `EventFilter` object, which specifies a list of +labels to include or exclude in the given filter. + +### Encrypted rooms + +In encrypted events, the string used as the key in the map is a SHA256 hash of a +contatenation of the text label and the ID of the room the event is being sent +to. Once encrypted by the client, the resulting `m.room.encrypted` event's +content contains a `m.labels_hashes` property which is an array of these hashes. + +When filtering events based on their label(s), clients are expected to use the +hash of the label(s) to filter in or out instead of the actual label text. + +#### Example + +Consider a label `#fun` on a message sent to a room which ID is +`!someroom:example.com`. Before encryption, the message would be: ```json { @@ -43,11 +84,18 @@ Example: "content": { "body": "who wants to go down the pub?", "msgtype": "m.text", - "m.label": [ "#fun" ] + "m.labels": { + "3204de89c747346393ea5645608d79b8127f96c70943ae55730c3f13aa72f20a": "#fun" + } } } ``` +`3204de89c747346393ea5645608d79b8127f96c70943ae55730c3f13aa72f20a` is the SHA256 +hash of the string `#fun!someroom:example.com`. 
+ +Once encrypted, the event would become: + ```json { "type": "m.room.encrypted", @@ -57,30 +105,37 @@ Example: "device_id": "SOLZHNGTZT", "sender_key": "FRlkQA1enABuOH4xipzJJ/oD8fxiQHj6jrAyyrvzSTY", "session_id": "JPWczbhnAivenK3qRwqLLBQu4W13fz1lqQpXDlpZzCg", - "m.label": [ "#work" ] - }, + "m.labels_hashes": [ + "3204de89c747346393ea5645608d79b8127f96c70943ae55730c3f13aa72f20a" + ] + } } ``` -Labels which are prefixed with # are expected to be user-visible and exposed -to the user as a hashtag, letting the user filter their current room by the -various hashtags present within it. +### Unencrypted rooms -Clients are expected to explicitly set the label on a message if the user's -intention is to respond as part of a given labelled topic. For instance, if -the user is currently filtered to only view messages with a given label, then -new messages sent should use the same label. Similarly if the user sends a -reply to a given message, that reply should typically use the same labels as -the message being replied to. +In unencrypted rooms, the string to use as a key does not matter (as this format +is only kept for consistency with events sent in encrypted rooms) and clients +are free to use any non-empty string they wish (as long as it's unique per label +in the event). -The convention is to use hashtag style human-visible labels prefixed with a #, -but one could also use a unique ID (e.g. thread ID bridged from another -platform) without a # prefix). +When filtering events based on their label(s), clients are expected to use the +actual label text instead of the string key. -When a user wants to filter a room to given label(s), it defines a filter for -use with /sync or /messages to limit appropriately. This is done by new -`labels` and `not_labels` fields to the `EventFilter` object, which specifies -a list of labels to include or exclude in the given filter. 
+#### Example + +```json +{ + "type": "m.room.message", + "content": { + "body": "who wants to go down the pub?", + "msgtype": "m.text", + "m.labels": { + "somekey": "#fun" + } + } +} +``` ## Problems @@ -88,6 +143,18 @@ Do we care about internationalising hashtags? Too many threading APIs? +Using hashes means that servers could be inclined to compute rainbow tables to +read labels on encrypted messages. However, since we're using the room ID as +some kind of salt, it makes it much more expensive to do because it would mean +maintaining one rainbow table for each encrypted room it's in, which would +probably make it not worth the trouble. + ## Unstable prefix -Unstable implementations should hook up `org.matrix.label` rather than `m.label`. +Unstable implementations should hook up `org.matrix.labels` rather than +`m.labels`. When defining filters, they should also use `org.matrix.labels` and +`org.matrix.not_labels` in the `EventFilter` object. + +Additionally, servers implementing this feature should advertise that they do so +by exposing a `label_based_filtering` flag in the `unstable_features` part of +the `/versions` response. From 6627b008ab0029f4dc4d4c0f991d08c685a58ef6 Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Tue, 29 Oct 2019 11:13:26 +0000 Subject: [PATCH 04/20] Describe alternative solutions --- proposals/2326-label-based-filtering.md | 26 +++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/proposals/2326-label-based-filtering.md b/proposals/2326-label-based-filtering.md index f95d0bb48dc..bf3e324a264 100644 --- a/proposals/2326-label-based-filtering.md +++ b/proposals/2326-label-based-filtering.md @@ -149,6 +149,32 @@ some kind of salt, it makes it much more expensive to do because it would mean maintaining one rainbow table for each encrypted room it's in, which would probably make it not worth the trouble. 
+## Alternative solutions + +Instead of using hashes to identify labels in encrypted messages, using random +opaque strings was also considered. Bearing in mind that we need to be able to +use the label identifiers to filter the history of the room server-side (because +we're not expecting clients to know about the whole history of the room, see my +first point above), this solution had the following downsides, all originating +from the fact that nothing would prevent 1000 clients from using each a +different identifier: + +* filtering would have serious performances issues in E2EE rooms, as the server +  would need to return all events it knows about which label identifier is any +  of the 1000 identifiers provided by the client, which is quite expensive to +  do. + +* it would be impossible for a filtered `/messages` (or `/sync`) request to +  include every event matching the desired label because we can't expect a +  client to know about every identifier that has been used in the whole history +  of the room, or about the fact that another client might suddenly decide to +  use another identifier for the same label text, and include those identifiers +  in its filtered request. + +Another proposed solution would be to use peppered hashes, and to store the +pepper in the encrypted event. However, this solution would have the same +downsides as described above. 
+ ## Unstable prefix Unstable implementations should hook up `org.matrix.labels` rather than From 32597a7d5dbd1fb0bc4b5aca15568f3dd0a0a7c0 Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Tue, 29 Oct 2019 11:21:00 +0000 Subject: [PATCH 05/20] Include an unstable prefix for m.labels_hashes --- proposals/2326-label-based-filtering.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/proposals/2326-label-based-filtering.md b/proposals/2326-label-based-filtering.md index bf3e324a264..2ab67151192 100644 --- a/proposals/2326-label-based-filtering.md +++ b/proposals/2326-label-based-filtering.md @@ -178,7 +178,8 @@ downsides as described above. ## Unstable prefix Unstable implementations should hook up `org.matrix.labels` rather than -`m.labels`. When defining filters, they should also use `org.matrix.labels` and +`m.labels`, and `org.matrix.labels_hashes` rather than `m.labels_hashes`. When +defining filters, they should also use `org.matrix.labels` and `org.matrix.not_labels` in the `EventFilter` object. Additionally, servers implementing this feature should advertise that they do so From 6f36f5607ff9901a859e462b875e74e23cb437ee Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Tue, 29 Oct 2019 11:41:18 +0000 Subject: [PATCH 06/20] Incorporate review --- proposals/2326-label-based-filtering.md | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/proposals/2326-label-based-filtering.md b/proposals/2326-label-based-filtering.md index 2ab67151192..1060e949b52 100644 --- a/proposals/2326-label-based-filtering.md +++ b/proposals/2326-label-based-filtering.md @@ -67,7 +67,17 @@ labels to include or exclude in the given filter. In encrypted events, the string used as the key in the map is a SHA256 hash of a contatenation of the text label and the ID of the room the event is being sent -to. Once encrypted by the client, the resulting `m.room.encrypted` event's +to, i.e. `label_key = SHA256(label_text + room_id)`. 
+ +The reason behind using a hash built from the text label and the ID of the room +here instead of e.g. a random opaque string or a peppered hash is to maintain +consistency of the key without having access to the entire history of the room +or exposing the actual text of the label to the server, so that e.g. a new +client joining the room would be able to use the same key for the same label as +any other client. See the ["Alternative solutions"](#alternate-solutions) for +more information on this point. + +Once encrypted by the client, the resulting `m.room.encrypted` event's content contains a `m.labels_hashes` property which is an array of these hashes. When filtering events based on their label(s), clients are expected to use the @@ -92,7 +102,15 @@ Consider a label `#fun` on a message sent to a room which ID is ``` `3204de89c747346393ea5645608d79b8127f96c70943ae55730c3f13aa72f20a` is the SHA256 -hash of the string `#fun!someroom:example.com`. +hash of the string `#fun!someroom:example.com`. Here's an example code +(JavaScript) to compute it: + +```javascript +label_key_unhashed = "#fun" + "!someroom:example.com" +hash = crypto.createHash('sha256'); +hash.write(label_key_unhashed); +label_key = hash.digest("hex"); // 3204de89c747346393ea5645608d79b8127f96c70943ae55730c3f13aa72f20a +``` Once encrypted, the event would become: From 46d412e8c47872ad25c8b130c53aa0c9358c9d4d Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Tue, 29 Oct 2019 11:41:57 +0000 Subject: [PATCH 07/20] Fix link --- proposals/2326-label-based-filtering.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/2326-label-based-filtering.md b/proposals/2326-label-based-filtering.md index 1060e949b52..31e5fdd56b6 100644 --- a/proposals/2326-label-based-filtering.md +++ b/proposals/2326-label-based-filtering.md @@ -74,7 +74,7 @@ here instead of e.g. 
a random opaque string or a peppered hash is to maintain consistency of the key without having access to the entire history of the room or exposing the actual text of the label to the server, so that e.g. a new client joining the room would be able to use the same key for the same label as -any other client. See the ["Alternative solutions"](#alternate-solutions) for +any other client. See the ["Alternative solutions"](#alternative-solutions) for more information on this point. Once encrypted by the client, the resulting `m.room.encrypted` event's From 78c4e1606a707a3449b4a508f7b577bd95a739e3 Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Tue, 29 Oct 2019 12:18:48 +0000 Subject: [PATCH 08/20] Fix copy-paste --- proposals/2326-label-based-filtering.md | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/proposals/2326-label-based-filtering.md b/proposals/2326-label-based-filtering.md index 31e5fdd56b6..7f0f41791e2 100644 --- a/proposals/2326-label-based-filtering.md +++ b/proposals/2326-label-based-filtering.md @@ -172,10 +172,9 @@ probably make it not worth the trouble. Instead of using hashes to identify labels in encrypted messages, using random opaque strings was also considered. 
Bearing in mind that we need to be able to use the label identifiers to filter the history of the room server-side (because -we're not expecting clients to know about the whole history of the room, see my -first point above), this solution had the following downsides, all originating -from the fact that nothing would prevent 1000 clients from using each a -different identifier: +we're not expecting clients to know about the whole history of the room), this +solution had the following downsides, all originating from the fact that nothing +would prevent 1000 clients from using each a different identifier: * filtering would have serious performances issues in E2EE rooms, as the server would need to return all events it knows about which label identifier is any From 05217cdb7fa73498c4b06525567e23444fbf050d Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Wed, 30 Oct 2019 14:20:30 +0000 Subject: [PATCH 09/20] Standardise labels format --- proposals/2326-label-based-filtering.md | 48 +++++-------------------- 1 file changed, 8 insertions(+), 40 deletions(-) diff --git a/proposals/2326-label-based-filtering.md b/proposals/2326-label-based-filtering.md index 7f0f41791e2..b58ae32f8ff 100644 --- a/proposals/2326-label-based-filtering.md +++ b/proposals/2326-label-based-filtering.md @@ -29,14 +29,12 @@ functionality. ## Proposal We let users specify an optional `m.labels` field onto the events. This field -maps key strings to freeform text labels: +lists freeform text labels: ```json { // ... - "m.labels": { - "somekey": "somelabel" - } + "m.labels": [ "somelabel" ] } ``` @@ -65,9 +63,10 @@ labels to include or exclude in the given filter. 
### Encrypted rooms -In encrypted events, the string used as the key in the map is a SHA256 hash of a +In encrypted rooms, the `m.label` field of `m.room.encrypted` events contains, +for each label of the event that's being encrypted, a SHA256 hash of a contatenation of the text label and the ID of the room the event is being sent -to, i.e. `label_key = SHA256(label_text + room_id)`. +to, i.e. `hash = SHA256(label_text + room_id)`. The reason behind using a hash built from the text label and the ID of the room here instead of e.g. a random opaque string or a peppered hash is to maintain @@ -77,9 +76,6 @@ client joining the room would be able to use the same key for the same label as any other client. See the ["Alternative solutions"](#alternative-solutions) for more information on this point. -Once encrypted by the client, the resulting `m.room.encrypted` event's -content contains a `m.labels_hashes` property which is an array of these hashes. - When filtering events based on their label(s), clients are expected to use the hash of the label(s) to filter in or out instead of the actual label text. 
@@ -94,9 +90,7 @@ Consider a label `#fun` on a message sent to a room which ID is "content": { "body": "who wants to go down the pub?", "msgtype": "m.text", - "m.labels": { - "3204de89c747346393ea5645608d79b8127f96c70943ae55730c3f13aa72f20a": "#fun" - } + "m.labels": [ "#fun" ] } } ``` @@ -123,38 +117,13 @@ Once encrypted, the event would become: "device_id": "SOLZHNGTZT", "sender_key": "FRlkQA1enABuOH4xipzJJ/oD8fxiQHj6jrAyyrvzSTY", "session_id": "JPWczbhnAivenK3qRwqLLBQu4W13fz1lqQpXDlpZzCg", - "m.labels_hashes": [ + "m.labels": [ "3204de89c747346393ea5645608d79b8127f96c70943ae55730c3f13aa72f20a" ] } } ``` -### Unencrypted rooms - -In unencrypted rooms, the string to use as a key does not matter (as this format -is only kept for consistency with events sent in encrypted rooms) and clients -are free to use any non-empty string they wish (as long as it's unique per label -in the event). - -When filtering events based on their label(s), clients are expected to use the -actual label text instead of the string key. - -#### Example - -```json -{ - "type": "m.room.message", - "content": { - "body": "who wants to go down the pub?", - "msgtype": "m.text", - "m.labels": { - "somekey": "#fun" - } - } -} -``` - ## Problems Do we care about internationalising hashtags? @@ -195,8 +164,7 @@ downsides as described above. ## Unstable prefix Unstable implementations should hook up `org.matrix.labels` rather than -`m.labels`, and `org.matrix.labels_hashes` rather than `m.labels_hashes`. When -defining filters, they should also use `org.matrix.labels` and +`m.labels`. When defining filters, they should also use `org.matrix.labels` and `org.matrix.not_labels` in the `EventFilter` object. 
Additionally, servers implementing this feature should advertise that they do so From da7776ffd0af91ad253c4224523dd03c29896558 Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Wed, 30 Oct 2019 14:21:09 +0000 Subject: [PATCH 10/20] Mandate case insensitivity in labels --- proposals/2326-label-based-filtering.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/proposals/2326-label-based-filtering.md b/proposals/2326-label-based-filtering.md index b58ae32f8ff..924d94c5b0b 100644 --- a/proposals/2326-label-based-filtering.md +++ b/proposals/2326-label-based-filtering.md @@ -38,6 +38,9 @@ lists freeform text labels: } ``` +The labels are expected to be insensitive to case, therefore clients are +expected to lowercase them before sending them to servers. + Labels which are prefixed with # are expected to be user-visible and exposed to the user by clients as a hashtag, letting the user filter their current room by the various hashtags present within it. Labels which are not prefixed with # are From 158f11a32a5b9d58f8a79929a0e2899808b9b35e Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Fri, 1 Nov 2019 10:42:29 +0000 Subject: [PATCH 11/20] Update unstable feature flag with vendor prefix --- proposals/2326-label-based-filtering.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/2326-label-based-filtering.md b/proposals/2326-label-based-filtering.md index 924d94c5b0b..53cb2755fe4 100644 --- a/proposals/2326-label-based-filtering.md +++ b/proposals/2326-label-based-filtering.md @@ -171,5 +171,5 @@ Unstable implementations should hook up `org.matrix.labels` rather than `org.matrix.not_labels` in the `EventFilter` object. Additionally, servers implementing this feature should advertise that they do so -by exposing a `label_based_filtering` flag in the `unstable_features` part of -the `/versions` response. +by exposing a `org.matrix.label_based_filtering` flag in the `unstable_features` +part of the `/versions` response. 
From 3a8f7168030cdef4f26a3cf2d0597416f2e44e0f Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Thu, 7 Nov 2019 17:26:50 +0000 Subject: [PATCH 12/20] Add security considerations --- proposals/2326-label-based-filtering.md | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/proposals/2326-label-based-filtering.md b/proposals/2326-label-based-filtering.md index 924d94c5b0b..ef0ef1d56b6 100644 --- a/proposals/2326-label-based-filtering.md +++ b/proposals/2326-label-based-filtering.md @@ -164,6 +164,21 @@ Another proposed solution would be to use peppered hashes, and to store the pepper in the encrypted event. However, this solution would have the same downsides as described above. +## Security considerations + +The proposed solution for encrypted rooms, despite being the only one we could +think of when writing this proposal that would make filtering possible while +obscuring the labels to some level, isn't ideal as it still allows servers to +figure out labels by computing [rainbow +tables](https://en.wikipedia.org/wiki/Rainbow_table). + +Because of this, clients might want to limit the use of this feature in +encrypted rooms, for example by enabling it with an opt-in option in the +settings, or showing a warning message to the users. + +It is likely that this solution will be replaced as part of a future proposal +once a more fitting solution is found. 
+
 ## Unstable prefix
 
 Unstable implementations should hook up `org.matrix.labels` rather than

From 61f1396f9cb5e75afac9d74a45777014a3d4b6e0 Mon Sep 17 00:00:00 2001
From: Brendan Abolivier
Date: Thu, 7 Nov 2019 17:39:55 +0000
Subject: [PATCH 13/20] Incorporate review

---
 proposals/2326-label-based-filtering.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/proposals/2326-label-based-filtering.md b/proposals/2326-label-based-filtering.md
index 911bb549998..173f9f4287b 100644
--- a/proposals/2326-label-based-filtering.md
+++ b/proposals/2326-label-based-filtering.md
@@ -67,8 +67,8 @@ labels to include or exclude in the given filter.
 ### Encrypted rooms
 
 In encrypted rooms, the `m.label` field of `m.room.encrypted` events contains,
-for each label of the event that's being encrypted, a SHA256 hash of a
-contatenation of the text label and the ID of the room the event is being sent
+for each label of the event that's being encrypted, a SHA256 hash of the
+concatenation of the text label and the ID of the room the event is being sent
 to, i.e. `hash = SHA256(label_text + room_id)`.
 
 The reason behind using a hash built from the text label and the ID of the room

From d1110a29e2a3a47002e6f25295eb8504f37e0f7a Mon Sep 17 00:00:00 2001
From: Brendan Abolivier
Date: Thu, 14 Nov 2019 15:19:18 +0000
Subject: [PATCH 14/20] Extend the filtering to every endpoint supporting it

---
 proposals/2326-label-based-filtering.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/proposals/2326-label-based-filtering.md b/proposals/2326-label-based-filtering.md
index 173f9f4287b..1c43df1db50 100644
--- a/proposals/2326-label-based-filtering.md
+++ b/proposals/2326-label-based-filtering.md
@@ -60,9 +60,10 @@ a given message, that reply should typically use the same labels as the message
 being replied to.
 
 When a user wants to filter a room to given label(s), it defines a filter for
-use with /sync or /messages to limit appropriately. This is done by new `labels`
-and `not_labels` fields to the `EventFilter` object, which specifies a list of
-labels to include or exclude in the given filter.
+use with `/sync`, `/context`, `/search` or `/messages` to limit appropriately.
+This is done by new `labels` and `not_labels` fields to the `EventFilter`
+object, which specifies a list of labels to include or exclude in the given
+filter.
 
 ### Encrypted rooms
 

From 45225af7b205a83b307d25810d3fa9cb5d1602a5 Mon Sep 17 00:00:00 2001
From: Brendan Abolivier
Date: Thu, 14 Nov 2019 16:34:15 +0000
Subject: [PATCH 15/20] Specify a maximum length for labels

---
 proposals/2326-label-based-filtering.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/proposals/2326-label-based-filtering.md b/proposals/2326-label-based-filtering.md
index 1c43df1db50..932ecd2af64 100644
--- a/proposals/2326-label-based-filtering.md
+++ b/proposals/2326-label-based-filtering.md
@@ -39,7 +39,8 @@ lists freeform text labels:
 ```
 
 The labels are expected to be insensitive to case, therefore clients are
-expected to lowercase them before sending them to servers.
+expected to lowercase them before sending them to servers. A label's length is
+limited to a maximum of 100 characters.
 
 Labels which are prefixed with # are expected to be user-visible and exposed to
 the user by clients as a hashtag, letting the user filter their current room by

From f32520380de65eb2d976ce054dc6fe13bb4061e3 Mon Sep 17 00:00:00 2001
From: Brendan Abolivier
Date: Thu, 14 Nov 2019 18:26:34 +0000
Subject: [PATCH 16/20] Specify interaction with edits

---
 proposals/2326-label-based-filtering.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/proposals/2326-label-based-filtering.md b/proposals/2326-label-based-filtering.md
index 932ecd2af64..ffc88d2da8e 100644
--- a/proposals/2326-label-based-filtering.md
+++ b/proposals/2326-label-based-filtering.md
@@ -66,6 +66,12 @@ This is done by new `labels` and `not_labels` fields to the `EventFilter`
 object, which specifies a list of labels to include or exclude in the given
 filter.
 
+Senders may edit the `m.label` fields in order to change the field associated
+with an event. If an edit removes a label that was previously associated with
+the original event or a past edit of it, neither the original event nor an edit
+of it should be returned by the server when filtering for events with that
+label.
+
 ### Encrypted rooms
 
 In encrypted rooms, the `m.label` field of `m.room.encrypted` events contains,

From a6d12498cfed2cbdbb3184ce47f4aa6d9a399256 Mon Sep 17 00:00:00 2001
From: Brendan Abolivier
Date: Mon, 18 Nov 2019 09:47:03 +0000
Subject: [PATCH 17/20] Fix editing on encrypted events

---
 proposals/2326-label-based-filtering.md | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/proposals/2326-label-based-filtering.md b/proposals/2326-label-based-filtering.md
index ffc88d2da8e..18df8bb9569 100644
--- a/proposals/2326-label-based-filtering.md
+++ b/proposals/2326-label-based-filtering.md
@@ -66,12 +66,18 @@ This is done by new `labels` and `not_labels` fields to the `EventFilter`
 object, which specifies a list of labels to include or exclude in the given
 filter.
 
-Senders may edit the `m.label` fields in order to change the field associated
+Senders may edit the `m.label` fields in order to change the labels associated
 with an event. If an edit removes a label that was previously associated with
 the original event or a past edit of it, neither the original event nor an edit
 of it should be returned by the server when filtering for events with that
 label.
 
+When editing the list of labels associated with an encrypted event, clients must
+set the updated list of labels in the `content` field of the encrypted event in
+addition with the `m.new_content` field of the decrypted event's `content`
+field, so that servers can update the list of labels associated with the
+original event accordingly.
+
 ### Encrypted rooms
 
 In encrypted rooms, the `m.label` field of `m.room.encrypted` events contains,

From a3450a6358cb259127b52849ad2abcffb13102ea Mon Sep 17 00:00:00 2001
From: Brendan Abolivier
Date: Mon, 18 Nov 2019 17:19:45 +0000
Subject: [PATCH 18/20] Add some clarification about edits

---
 proposals/2326-label-based-filtering.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/proposals/2326-label-based-filtering.md b/proposals/2326-label-based-filtering.md
index 18df8bb9569..e748381c0a9 100644
--- a/proposals/2326-label-based-filtering.md
+++ b/proposals/2326-label-based-filtering.md
@@ -78,6 +78,12 @@ addition with the `m.new_content` field of the decrypted event's `content`
 field, so that servers can update the list of labels associated with the
 original event accordingly.
 
+When sending an edit of an event that has labels attached to it, clients are
+expected to provide a list of labels, even if the edit doesn't add or remove any
+label from the list provided in the original event or its latest edit (in this
+case, the list is the same as the one provided in the original event or its
+latest edit).
+
 ### Encrypted rooms
 
 In encrypted rooms, the `m.label` field of `m.room.encrypted` events contains,

From 7a21efddd3e2bb307c423916b3bd62643842950a Mon Sep 17 00:00:00 2001
From: Brendan Abolivier
Date: Mon, 25 Nov 2019 15:53:46 +0000
Subject: [PATCH 19/20] Spell out that hashtag-like labels can have any character in them

---
 proposals/2326-label-based-filtering.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/proposals/2326-label-based-filtering.md b/proposals/2326-label-based-filtering.md
index 18df8bb9569..caea04bb9c2 100644
--- a/proposals/2326-label-based-filtering.md
+++ b/proposals/2326-label-based-filtering.md
@@ -48,6 +48,11 @@ the various hashtags present within it.
 Labels which are not prefixed with # are expected to be hidden from the user by
 clients (so that they can be used as e.g. thread IDs bridged from another
 platform).
+A label can contain any UTF-8 character, regardless of whether it starts with a
+hash or not (i.e. we don't limit the hashtag to e.g. the set of allowed
+characters in a Twitter hashtag, and not-hashtag-like labels also don't have
+that sort of constraints).
+
 Clients can use these to filter the overlapping conversations in a room into
 different topics. The labels could also be used when bridging as a hashtag to
 help manage the disconnect which can happen when bridging a threaded room to an

From 4b7ca5297a148d109caeab144a145713051fcb14 Mon Sep 17 00:00:00 2001
From: Brendan Abolivier
Date: Mon, 25 Nov 2019 16:21:13 +0000
Subject: [PATCH 20/20] Specify how client UIs are supposed to behave

---
 proposals/2326-label-based-filtering.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/proposals/2326-label-based-filtering.md b/proposals/2326-label-based-filtering.md
index caea04bb9c2..3d9fc2f54dc 100644
--- a/proposals/2326-label-based-filtering.md
+++ b/proposals/2326-label-based-filtering.md
@@ -58,8 +58,12 @@ different topics. The labels could also be used when bridging as a hashtag to
 help manage the disconnect which can happen when bridging a threaded room to an
 unthreaded one.
 
+Clients are expected to let users add hashtag-like labels to a message before
+sending it, and to display hashtag-like labels on messages to help user easily
+identify the labels they can make their client filter on.
+
 Clients are expected to explicitly set the label on a message if the user's
-intention is to respond as part of a given labelled topic. For instance, if the
+intention is to respond as part of a given labelled topic. For instance, if the
 user is currently filtered to only view messages with a given label, then new
 messages sent should use the same label. Similarly if the user sends a reply to
 a given message, that reply should typically use the same labels as the message
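
For illustration, the client-side rules touched by the patches above (lowercasing, the 100-character limit, the `SHA256(label_text + room_id)` hash used in encrypted rooms, and the `labels` / `not_labels` filter fields) could be sketched roughly as follows. This is a minimal sketch, not a reference implementation: the helper names, the example room ID, the UTF-8 input encoding and the hex output encoding are all assumptions, since the proposal text shown here only gives the hash formula and the field names.

```python
import hashlib

MAX_LABEL_LENGTH = 100  # maximum label length, in characters, per the proposal


def normalise_label(label: str) -> str:
    """Lowercase a label and enforce the 100-character limit."""
    lowered = label.lower()
    if len(lowered) > MAX_LABEL_LENGTH:
        raise ValueError(f"label longer than {MAX_LABEL_LENGTH} characters")
    return lowered


def hash_label(label_text: str, room_id: str) -> str:
    """Return SHA256(label_text + room_id) as a hex string.

    Assumption: UTF-8 input bytes and hex output; the proposal text shown
    here does not fix either encoding.
    """
    return hashlib.sha256((label_text + room_id).encode("utf-8")).hexdigest()


def build_event_filter(labels=(), not_labels=()):
    """Build the `labels` / `not_labels` part of an EventFilter object."""
    return {"labels": list(labels), "not_labels": list(not_labels)}


room_id = "!example:example.org"  # hypothetical room ID
label = normalise_label("#Work")
encrypted_label = hash_label(label, room_id)
event_filter = build_event_filter(labels=["#fun"], not_labels=[label])
```

Keeping normalisation separate from hashing means the same lowercased label can be sent as-is in an unencrypted room or hashed for the `m.label` field of an `m.room.encrypted` event.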