Skip to content

Commit

Permalink
Clarify terminology around aggregations
Browse files Browse the repository at this point in the history
I've done my best to remove the word "bundle", because I feel like it causes
more confusion than it provides. Instead I have favoured "aggregated child
events" which I think is clearer.

Some general clarification around these parts of the spec.
  • Loading branch information
richvdh committed Jan 31, 2023
1 parent d55af02 commit 3d3f28c
Show file tree
Hide file tree
Showing 7 changed files with 96 additions and 79 deletions.
1 change: 1 addition & 0 deletions changelogs/client_server/newsfragments/1424.clarification
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Clarify the sections of the specification concerning aggregation of child events.
38 changes: 18 additions & 20 deletions content/client-server-api/_index.md
Original file line number Diff line number Diff line change
Expand Up @@ -1979,7 +1979,7 @@ This specification describes the following relationship types:
* [Threads](#threading).
* [References](#reference-relations)

#### Aggregations
#### Aggregations of child events

{{% added-in v="1.3" %}}

Expand All @@ -1993,14 +1993,12 @@ of times that `key` was used by child events.

The actual aggregation format depends on the `rel_type`.

Aggregations are sometimes automatically included by a server alongside the parent
event. This is known as a "bundled aggregation" or "bundle" for simplicity. The
act of doing this is "bundling".

When an event is served to the client through the APIs listed below, a `m.relations` property
is included under `unsigned` if the event has child events which can be aggregated and point
at it. The `m.relations` property is an object keyed by `rel_type` and value being the type-specific
aggregated format for that `rel_type`, also known as the bundle.
When an event is served to the client through the APIs listed below, a
`m.relations` property is included under `unsigned` if the event has child
events which can be aggregated and point at it. The `m.relations` property is
an object keyed by `rel_type` and value being the type-specific aggregated
format for that `rel_type`. This `m.relations` property is known as a "bundled
aggregation".

For example (unimportant fields not included):

Expand Down Expand Up @@ -2036,10 +2034,10 @@ For example (unimportant fields not included):
}
```

Note how the `org.example.possible_annotations` bundle is an array compared to the
`org.example.possible_thread` bundle where the server is summarising the state of
the relationship in a single object. Both are valid ways to aggregate, and their
exact types depend on the `rel_type`.
Note how the `org.example.possible_annotations` aggregation is an array, while in the
`org.example.possible_thread` aggregation where the server is summarising the state of
the relationship in a single object. Both are valid ways to aggregate: the format of an
aggregation depends on the `rel_type`.

{{% boxes/warning %}}
State events do not currently receive bundled aggregations. This is not
Expand All @@ -2066,11 +2064,11 @@ such as `/initialSync`.

While this functionality allows the client to see what was known to the server at the
time of handling, the client should continue to aggregate locally if it is aware of
the relationship type's behaviour. For example, a client might increment a `count`
on a parent event's bundle if it saw a new child event which referenced that parent.
the relationship type's behaviour. For example, a client might internally increment a `count`
in a parent event's aggregation data if it saw a new child event which referenced that parent.

The bundle provided by the server only includes child events which were known at the
time the client would receive the bundle. For example, in a single `/sync` response
The aggregation provided by the server only includes child events which were known at the
time the client would receive the aggregation. For example, in a single `/sync` response
with the parent and multiple child events the child events would have already been
included on the parent's `m.relations` field. Events received in future syncs would
need to be aggregated manually by the client.
Expand All @@ -2080,7 +2078,7 @@ Events from [ignored users](#ignoring-users) do not appear in the aggregation
from the server, however clients might still have events from ignored users cached. Like
with normal events, clients will need to de-aggregate child events sent by ignored users to
avoid them being considered in counts. Servers must additionally ensure they do not
consider child events from ignored users when preparing a bundle for the client.
consider child events from ignored users when preparing an aggregation for the client.
{{% /boxes/note %}}

When a parent event is redacted, the child events which pointed to that parent remain, however
Expand All @@ -2089,7 +2087,7 @@ to de-aggregate or disassociate the event once the relationship is lost. Clients
aggregation or which handle redactions locally should do the same.

It is suggested that clients perform local echo on aggregations — for instance, aggregating
a new child event into a bundle optimistically until the server returns a failure or the client
a new child event into a parent event optimistically until the server returns a failure or the client
gives up on sending the event, at which point the event should be de-aggregated and an
error or similar shown. The client should be cautious to not aggregate an event twice if
it has already optimistically aggregated the event. Clients are encouraged to take this
Expand All @@ -2098,7 +2096,7 @@ likely using the transaction ID as a temporary event ID until a proper event ID

{{% boxes/warning %}}
Due to history visibility restrictions, child events might not be visible to the user
if they are in a section of history the user cannot see. This means any bundles which would
if they are in a section of history the user cannot see. This means any aggregations which would
normally include those events will be lacking them and the client will not be able to
locally aggregate the events either — relating events of importance (such as votes) should
take into consideration history visibility.
Expand Down
28 changes: 15 additions & 13 deletions content/client-server-api/modules/event_replacements.md
Original file line number Diff line number Diff line change
Expand Up @@ -197,16 +197,17 @@ replacement event.

Note that there can be multiple events with an `m.replace` relationship to a
given event (for example, if an event is edited multiple times). These should
be [aggregated](#aggregations) by the homeserver.
be [aggregated](#aggregations-of-child-events) by the homeserver.

The aggregation format of `m.replace` relationships gives the `event_id`,
`origin_server_ts`, and `sender` of the **most recent** replacement event. The
most recent event is determined by comparing `origin_server_ts`; if two or more
replacement events have identical `origin_server_ts`, the event with the
lexicographically largest `event_id` is treated as more recent.

This aggregation is bundled under the `unsigned` property as `m.relations` for any
event that is the target of an `m.replace` relationship. For example:
As with any other aggregation of child events, the `m.replace` aggregation is
included under the `m.relations` property in `unsigned` for any event that is
the target of an `m.replace` relationship. For example:

```json
{
Expand All @@ -224,22 +225,22 @@ event that is the target of an `m.replace` relationship. For example:
}
```

If the original event is
[redacted](#redactions), any
`m.replace` relationship should **not** be bundled with it (whether or not any
subsequent replacements are themselves redacted). Note that this behaviour is
specific to the `m.replace` relationship. See also [redactions of edited
However, if the original event is [redacted](#redactions), any replacement
events are *not* aggregated and `m.replace` is omitted from the aggregation
returned under `m.relations` (whether or not any subsequent replacements are
themselves redacted). Note that this behaviour is specific to the `m.replace`
relationship. See also [redactions of edited
events](#redactions-of-edited-events) below.

##### Server-side replacement of content

Whenever an `m.replace` is to be bundled with an event as above, the server
Whenever an `m.replace` is to be bundled with its parent event as above, the server
should also modify the content of the original event according to the
`m.new_content` of the most recent replacement event (determined as above).

An exception applies to [`GET /_matrix/client/v3/rooms/{roomId}/event/{eventId}`](#get_matrixclientv3roomsroomideventeventid),
which should return the unmodified event (though the relationship should still
be bundled, as described above).
which should return the unmodified event (though the replacement event should still
be included under `m.relations`, as described above).

#### Client behaviour

Expand Down Expand Up @@ -280,8 +281,9 @@ subsequent edits, from the visible timeline. In this situation, homeservers
will return an empty `content` for the original event as with any other
redacted event, and as
[above](#server-side-aggregation-of-mreplace-relationships) the replacement
events will not be bundled with the original event. Note that the subsequent edits are
not actually redacted themselves: they simply serve no purpose within the visible timeline.
events will not be included in the aggregation bundled with the original
event. Note that the subsequent edits are not actually redacted themselves:
they simply serve no purpose within the visible timeline.

#### Edits of replies

Expand Down
11 changes: 6 additions & 5 deletions content/client-server-api/modules/reference_relations.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,11 +19,12 @@ messages.

##### Server-side aggregation of `m.reference`

The aggregation format of `m.reference` relations consists of a single `chunk` property,
which lists all the events which `m.reference` the event (the parent). Currently,
only a single `event_id` field is present on the events in the `chunk`.
The [aggregation](#aggregations-of-child-events) format of `m.reference`
relations consists of a single `chunk` property, which lists all the events
which `m.reference` the event (the parent). Currently, only a single `event_id`
field is present on the events in the `chunk`.

An example `m.reference` would be:
For example, given an event with the following `m.reference` relationship:

```json
{
Expand All @@ -38,7 +39,7 @@ An example `m.reference` would be:
}
```

The [bundle](#aggregations) under `m.relations` would appear similar to the following:
The aggregation would appear similar to the following:

```json
{
Expand Down
47 changes: 30 additions & 17 deletions content/client-server-api/modules/threading.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,12 +11,14 @@ Clients SHOULD render threads differently to regular messages or replies in the
as by providing some context to what is going on in the thread but keeping the full conversation
history behind a disclosure.

Threads are established using a `rel_type` of `m.thread` and reference the *thread root* (the
first event in a thread). It is not possible to create a thread from an event with a `rel_type`,
which includes not being able to nest threads. All conversation in a thread reference the thread
root instead of the most recent message, unlike rich reply chains.
Threads are established using a `rel_type` of `m.thread` and reference the
*thread root* (the first event in a thread). It is not possible to create a
thread from an event which itself is the child of an event relationship (i.e.,
one with an `m.relates_to` property). It is therefore also not possible to nest
threads. All events in a thread reference the thread root instead of the
most recent message, unlike rich reply chains.

As a worked example, the following represents a thread and how it'd be formed:
As a worked example, the following represents a thread and how it would be formed:

```json
{
Expand Down Expand Up @@ -128,11 +130,11 @@ clients is used to create a reply within a thread: clients should render the eve

##### Validation of `m.thread` relationships

Servers SHOULD reject client requests which attempt to start a thread off an event with a
`rel_type`. If the client attempts to target an event which already has an `m.thread`,
`m.reference`, or any other `rel_type` then it should receive a HTTP 400 error response
with appropriate error message, as per the [standard error response](#standard-error-response)
structure.
Servers SHOULD reject client requests which attempt to start a thread off an
event with an `m.relates_to` property. If the client attempts to target an event which itself
has an `m.relates_to` property, then it should receive a HTTP 400 error
response with appropriate error message, as per the [standard error
response](#standard-error-response) structure.

{{% boxes/note %}}
A specific error code is not currently available for this case: servers should use `M_UNKNOWN`
Expand All @@ -141,12 +143,16 @@ alongside the HTTP 400 status code.

##### Server-side aggregation of `m.thread` relationships

Given threads always reference the thread root, an event can have multiple "child" events which
then form the thread itself. These events should be [aggregated](#aggregations) by the server.
Given threads always reference the thread root, an event can have multiple
"child" events which then form the thread itself. These events should be
[aggregated](#aggregations-of-child-events) by the server.

The aggregation for threads includes some information about the user's participation in the thread,
the approximate number of events in the thread (as known to the server), and the most recent event
in the thread (topologically). This is then bundled into the event as `m.thread`:
in the thread (topologically).

As with any other aggregation of child events, the `m.thread` aggregation is
included under the `m.relations` property in `unsigned` for the thread root. For example:

```json
{
Expand All @@ -165,6 +171,11 @@ in the thread (topologically). This is then bundled into the event as `m.thread`
"content": {
"msgtype": "m.text",
"body": "Woo! Threads!"
},
"unsigned": {
"m.relations": {
// ...
}
}
},
"count": 7,
Expand All @@ -178,15 +189,17 @@ in the thread (topologically). This is then bundled into the event as `m.thread`
`latest_event` is the most recent event (topologically to the server) in the thread sent by an
un-[ignored user](#ignoring-users).

Note that any bundled aggregations on `latest_event` should also be present. The server should be
careful to avoid loops, though loops are not currently possible due to `m.thread` not being possible
to target an event with a `rel_type` already.
Note that, as in the example above, child events of the `latest_event` should
themselves be aggregated and included under `m.relations` for that event. The
server should be careful to avoid loops, though loops are not currently
possible due to `m.thread` not being permitted to target an event with an
`m.relates_to` property.

`count` is simply the number of events using `m.thread` as a `rel_type` pointing to the target event.
It does not include events sent by [ignored users](#ignoring-users).

`current_user_participated` is `true` when the authenticated user is either:
1. The `sender` of the event receiving the bundle (they sent the thread root).
1. The `sender` of the thread root event.
2. The `sender` of an event which references the thread root with a `rel_type` of `m.thread`.

#### Querying threads in a room
Expand Down
7 changes: 3 additions & 4 deletions data/api/client-server/threads_list.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -32,8 +32,7 @@ paths:
x-addedInMatrixVersion: "1.4"
summary: Retrieve a list of threads in a room, with optional filters.
description: |-
Paginates over the thread roots in a room, ordered by the `latest_event` of each thread root
in its bundle.
Paginates over the thread roots in a room.
operationId: getThreadRoots
security:
- accessToken: []
Expand Down Expand Up @@ -85,8 +84,8 @@ paths:
chunk:
type: array
description: |-
The thread roots, ordered by the `latest_event` in each event's aggregation bundle. All events
returned include bundled [aggregations](/client-server-api/#aggregations).
The thread roots, ordered by the `latest_event` in each event's aggregated children. All events
returned include bundled [aggregations](/client-server-api/#aggregations-of-child-events).
If the thread root event was sent by an [ignored user](/client-server-api/#ignoring-users), the
event is returned redacted to the caller. This is to simulate the same behaviour of a client doing
Expand Down
43 changes: 23 additions & 20 deletions meta/documentation_style.rst
Original file line number Diff line number Diff line change
Expand Up @@ -44,33 +44,36 @@ Stylistic notes
General
~~~~~~~

Try to write clearly and unambiguously. Remember that many readers will not
have English as their first language.
* Try to write clearly and unambiguously. Remember that many readers will not
have English as their first language.

Prefer British English (colour, -ise) to American English.
* Prefer British English (colour, -ise) to American English.

The word "homeserver" is spelt thus (rather than "home server", "Homeserver",
or (argh) "Home Server"). However, an identity server is two words.
* The word "homeserver" is spelt thus (rather than "home server", "Homeserver",
or (argh) "Home Server"). However, an identity server is two words.

An "identity server" (spelt thus) implements the Identity Service API (also spelt
thus). However, "Application Services" (spelt thus) implement the Application Service
API. Application Services should not be called "appservices" in documentation.
* An "identity server" (spelt thus) implements the Identity Service API (also spelt
thus). However, "Application Services" (spelt thus) implement the Application Service
API. Application Services should not be called "appservices" in documentation.

.. Rationale: "homeserver" distinguishes from a "home server" which is a server
you have at home. "identity server" is clear, whereas "identityserver" is
horrible.
.. Rationale: "homeserver" distinguishes from a "home server" which is a server
you have at home. "identity server" is clear, whereas "identityserver" is
horrible.
Lists should:
* Lists should:

* Be introduced with a colon.
* Be used where they provide clarity.
* Contain entries which start with a capital and end with a full stop.
* Be introduced with a colon.
* Be used where they provide clarity.
* Contain entries which start with a capital and end with a full stop.

When talking about properties in JSON objects, prefer the word "property" to "field",
"member", or various other alternatives. For example: "this property will be set to
X if ...". Also avoid the term "key" unless you are specifically talking about the
*name* of a property - and be mindful of the scope for confusion with cryptographic
keys.
* When talking about properties in JSON objects, prefer the word "property" to "field",
"member", or various other alternatives. For example: "this property will be set to
X if ...". Also avoid the term "key" unless you are specifically talking about the
*name* of a property - and be mindful of the scope for confusion with cryptographic
keys.

* "i.e." (*id est*) is an abbreviation and hence is written with two full
stops. So is "e.g." (*exempli gratia*).

Changes between spec versions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Expand Down

0 comments on commit 3d3f28c

Please sign in to comment.