Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSC3391: API to delete account data #3391

Open
wants to merge 16 commits into
base: main
Choose a base branch
from
Open
101 changes: 101 additions & 0 deletions proposals/3391-account-data-delete.md
turt2live marked this conversation as resolved.
Show resolved Hide resolved
Original file line number Diff line number Diff line change
@@ -0,0 +1,101 @@
# MSC3391: Deleting account data
ShadowJonathan marked this conversation as resolved.
Show resolved Hide resolved
anoadragon453 marked this conversation as resolved.
Show resolved Hide resolved

Currently, in Matrix, there is no semantic way to "remove" or "delete" account data. There is only a
method to create and update it, them being effectively the same operation, but no method to "remove"
it.

This violates the CRUD principle, and disallows "cleaning up" account data, or de-reference certain
keys used for certain actions, as then they will stay around forever.

## Proposal

This proposal aims to:
- Define "`{}`" account data content as effectively "deleted", alongside out-of-band deletion semantics.
- Allow `DELETE` HTTP methods on account data endpoints to perform the deletion.
Comment on lines +13 to +14
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not entirely clear to me that we should have both of these mechanisms. My experience is that, where the protocol allows two ways of doing a thing, they will end up being subtly different somehow and it leads to compatibility problems (eg, one server implementation will forget to implement DELETE because the client they test with only uses PUT).

Given that we don't have any special server-side semantics for PUT {}, can we stick with that situation, and only do the special "deletion" behaviour for an actual DELETE?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this MSC, {} effectively is being equated to "Deleted", reversing that in many aspects would mean that suddenly the server can return either {} or a 404, being the difference between null and "", to grab an example/analogy.

The primary benefit of using DELETE over PUT {} is that backwards compatibility is held, the semantics is made more clear (would you expect a PUT {} to delete the key entirely, such that a 404 is returned), and "alternate ways" of doing the same thing in a way that is confusing is avoided.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an alternative, should we define PUT {} to return http 400, at the minuscule risk of breaking existing clients?


The first, defining `{}` as "deleted", is to codify this defacto "nullification" into the spec as
expected behavior for deletion. This paves the way for better client interpretation of that data,
and would allow backwards compatibility when clients do `PUT` `{}` as account data. (More on this in
the dedicated section)

### Endpoints
ShadowJonathan marked this conversation as resolved.
Show resolved Hide resolved

Account data and room data currently have the following endpoints:

Global Account Data,
[`GET`](https://spec.matrix.org/v1.4/client-server-api/#get_matrixclientv3useruseridaccount_datatype)
and
[`PUT`](https://spec.matrix.org/v1.4/client-server-api/#put_matrixclientv3useruseridaccount_datatype):
```
GET /_matrix/client/v3/user/{userId}/account_data/{type}

PUT /_matrix/client/v3/user/{userId}/account_data/{type}
```

And Room-specific Account Data,
[`GET`](https://spec.matrix.org/v1.4/client-server-api/#get_matrixclientv3useruseridroomsroomidaccount_datatype)
and
[`PUT`](https://spec.matrix.org/v1.4/client-server-api/#put_matrixclientv3useruseridroomsroomidaccount_datatype):
```
GET /_matrix/client/v3/user/{userId}/rooms/{roomId}/account_data/{type}

PUT /_matrix/client/v3/user/{userId}/rooms/{roomId}/account_data/{type}
```

This proposal adds the following two endpoints (with no request body):
```
DELETE /_matrix/client/v3/user/{userId}/account_data/{type}

DELETE /_matrix/client/v3/user/{userId}/rooms/{roomId}/account_data/{type}
```
ShadowJonathan marked this conversation as resolved.
Show resolved Hide resolved

These, respectively, remove account-wide account data, and room-scoped account data.
ShadowJonathan marked this conversation as resolved.
Show resolved Hide resolved

For idempotency reasons, these endpoints always return `200 OK`, with an empty JSON body `{}`.

For any account data key that cannot be set (like `m.fully_read`), the delete API will have a likewise error response.

These endpoints are authenticated, and can be rate-limited.

#### Deleted account data responses

Furthermore, when a client deletes account data, it must expect the `GET` methods above to return a 404 on
the next request.
Copy link
Member

@anoadragon453 anoadragon453 Dec 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to document our particularly complex portion of the work when implementing this MSC in Synapse. Not a blocker or an issue with the feature (edit: and the problem is not unique to this feature either).

After an account data entry has been deleted, that change must be broadcast to all devices. Once that's been done, it would be ideal if we could remove the record of that account data type ever being set in the homeserver (for reduction of user data and (slight) storage benefits).

This gets a little complicated when you try to quantify at what point all devices have a user have seen the delete. One way to do this (what is currently planned for Synapse) is to track the devices that haven't yet seen the delete, and then tick them off the list once they have.

But what does it mean for a device to have seen an account data change? If they have /sync'd the change, we can likely say they've seen the change. We should be careful though, as the device may have error'd in processing the /sync response, and if it retries the request after we've deleted the row, then the response will not include the delete(!). Instead, we should only assume the client has successfully seen the delete when they make a subsequent /sync request with a since token that implies they've processed the delete (Synapse's since/next_batch tokens allow for this as they encode points in streams; other homeserver implementations may not have this property).

And what if a device requests GET /_matrix/client/v3/user/{userId}/account_data/{type} after a delete has occurred? Should we assume that the device has seen the delete? Initially I thought 'yes', but @reivilibre pointed out that this means a request to GET /_matrix/client/v3/user/{userId}/account_data/{type} could mean (if the device was the last to see the delete) that the delete would then not come down /sync - which feels like a massive footgun. As a result, we've decided not to assume a call to GET /_matrix/client/v3/user/{userId}/account_data/{type} means a device has seen a delete, and instead of use /sync requests.


All of this falls squarely into the area of implementation - but it may be useful to other homeserver implementations. None of this is an issue if the implementation chooses to keep "deleted" account data entries in their database forever.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what does it mean for a device to have seen an account data change?

Presumably servers can implement the same logic as for when they need to figure out whether a device has seen a to-device message (so that the server can delete it)? Especially as to-device messages have a better case for robustness than account data.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point - it is indeed very similar. The only difference is that account data has this extra way of being accessed (via GET /_matrix/client/v3/user/{userId}/account_data/{type}). So whether you also count that as seeing the change is the only decision you need to make if you've already settled on a design for deleting old to-device messages.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this falls squarely into the area of implementation

I'm not convinced it does, entirely. It seems important to me that we specify when exactly a client can expect the 'deleted' account data to be served.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be missing something obvious but how much is this problem different from any other change in account data? What if it's a good ol' PUT and some other device crashes at processing the /sync batch that has this change?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ShadowJonathan this thread appears to be unresolved, and I think it's blocking FCP from starting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a result, we've decided not to assume a call to GET /_matrix/client/v3/user/{userId}/account_data/{type} means a device has seen a delete, and instead of use /sync requests.

Can we even tie access tokens to device IDs like this, or do we assume this per access token?

That brings up a whole can of worms for me too, and so i wish to assume that deletion of account data be a best-effort case only.

I think that enumerating and deleting account data only if all access tokens have seen it is not ideal from a performance perspective, since stale access tokens can "block" deletion of data like this, and it'd require a whole tracking system for what-access-token-has-seen-what in implementations, which i'm not aware exists right now.


Ultimately i'd wish to lean on @anoadragon453 for opinion on this; I think that /account_data/ should reflect the current state at any time, and that it does not interact or touch "if the device has seen it or not", since any access token would already be syncing if it sends account_data requests like this, since a client is expected to sync continuously when its online.

However, what about clients that are offline? Do we expect implementations to internally store tombstones to their stream tokens, to notify remaining access tokens of such a delete? Won't that go against the storage savings, or will it increase performance as it only queries a "small" table every time such a sync happens, to search for such tombstones?

What if a client re-uses an old sync token, after it has restarted sync, or used an additional sync? This behaviour is undefined, and with the case of deleting data, even the reference of itself that it has been deleted, it might become problematic.

And i concur with @richvdh, this is not implementation-specific, as then different implementations will have different ways of specifying how and when data is deleted.

Imo, it should be tied to /sync, and /account_data/ should always serve the current state, even if that results in a race condition between the two calls, as to which one returns the "this data has been deleted" state to the client first.


#### Backwards Compatibility

For backwards compatibility reasons, if a client `PUT`s `{}` for account data, that must be seen
equivalent as `DELETE`-ing that account data.

### Sync

Account Data changes are announced through sync; this proposal also aims to change this response slightly after account data deletion.

On incremental syncs (sync with `since`), in paths `account_data.events` and `rooms.join.{room_id}.account_data.events`,
a `{}` for event content must be interpreted as a deletion by the client.

These only occur in incremental syncs. An initial sync (without `sync`) must never contain keys with content `{}`,
even if the delete has just occurred.

## Alternatives

Instead of sending down deletions through `.events` as `{}`, we could use a new `.deleted_events`
to send down the keys of deleted events.

anoadragon453 marked this conversation as resolved.
Show resolved Hide resolved
## Security considerations

No considerable security problems, other than the fact that a user can potentially delete important data.

## Unstable prefix

When implementing this proposal, servers should use the `org.matrix.msc3391` unstable prefix:

```
DELETE /_matrix/client/unstable/org.matrix.msc3391/user/{userId}/account_data/{type}

DELETE /_matrix/client/unstable/org.matrix.msc3391/user/{userId}/rooms/{roomId}/account_data/{type}
```

The `unstable_features` key for this MSC is `org.matrix.msc3391`, it has to be set to `true`.

Any other value or absence of this key will signal the absence of support for this feature.