MSC3089: File tree structures #3089
Draft: turt2live wants to merge 3 commits into old_master from travis/msc/trees
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,235 @@ | ||
# MSC3089: File trees | ||
|
||
Files are currently shared by uploading them to the media repo and putting a reference to that content | |
in a room message. This is fine for most use cases, such as sharing screenshots or short-lived documents, | |
but longer-term, collaborative structures are not quite possible. | |
|
||
This MSC defines an approach for defining data trees in Matrix, using a document hierarchy as an example | ||
for how it could be applied. | ||
|
||
Reading material: | ||
* [MSC1772 - Spaces + Room types](https://github.com/matrix-org/matrix-doc/pull/1772) | ||
* [MSC3088 - Room subtyping](https://github.com/matrix-org/matrix-doc/pull/3088) | ||
* [MSC1767 - Extensible events](https://github.com/matrix-org/matrix-doc/pull/1767) | ||
|
||
Optional but useful reading: | ||
* [MSC1840 - Alternative room types](https://github.com/matrix-org/matrix-doc/pull/1840) | ||
* [MSC2326 - Label-based filtering ("threading")](https://github.com/matrix-org/matrix-doc/pull/2326) | ||
* [MSC2674 - Event relationships](https://github.com/matrix-org/matrix-doc/pull/2674) | ||
* [MSC2676 - Message editing](https://github.com/matrix-org/matrix-doc/pull/2676) | ||
* [MSC2946 - Spaces summary](https://github.com/matrix-org/matrix-doc/pull/2946) | ||
* [MSC2962 - Space group access control](https://github.com/matrix-org/matrix-doc/pull/2962) | ||
* [MSC2753 - Proper peeking](https://github.com/matrix-org/matrix-doc/pull/2753) | ||
* [Spec - Withholding encryption keys](https://spec.matrix.org/unstable/client-server-api/#reporting-that-decryption-keys-are-withheld) | ||
|
||
## Proposal | ||
|
||
*Author's note: This proposal assumes the reader is familiar with the terminology of the reading | ||
materials mentioned above.* | ||
|
||
We introduce a new room subtype (in the MSC3088 sense), `m.data_tree`, to be applied to spaces to denote | |
that they are data-driven trees. The subtype only needs to be applied to a parent space to affect all | |
subspaces of that space. For a file hierarchy, the room names of the spaces are the directory names. Note | |
that this subtype is *optional* and serves only to hide the tree from conversation-focused clients. | |
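
The inheritance rule above can be sketched as a walk up the space hierarchy. This is only an illustration: the `Room` record, its single `parent_id`, and the `purposes` set are hypothetical simplifications, and the idea that the subtype appears as the state key of an MSC3088 `m.room.purpose` state event is taken from the review discussion on this pull request rather than from the proposal text.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

DATA_TREE = "m.data_tree"  # subtype identifier from this proposal


@dataclass
class Room:
    """Hypothetical, simplified view of a space (illustration only)."""
    room_id: str
    parent_id: Optional[str] = None  # assumption: at most one parent space
    # Assumption: the state keys of the room's m.room.purpose events.
    purposes: Set[str] = field(default_factory=set)


def is_data_tree(room_id: str, rooms: Dict[str, Room]) -> bool:
    """Return True if the room, or any ancestor space, carries the subtype."""
    seen: Set[str] = set()
    current: Optional[str] = room_id
    while current is not None and current not in seen:
        seen.add(current)  # guard against cycles in the hierarchy
        room = rooms.get(current)
        if room is None:
            return False  # unknown room: nothing to inherit from
        if DATA_TREE in room.purposes:
            return True
        current = room.parent_id
    return False
```

A client that only cares about conversations could use a check like this to hide a whole subtree from its room list.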
|
||
Spaces used in a tree-like way (with the `m.data_tree` subtype or not) are called "tree spaces" in | ||
this proposal. | ||
|
||
The context of the tree space denotes what it is representing. The 3 major expected types are: | ||
|
||
1. A standalone data tree. This should be annotated with the `m.data_tree` subtype, and would represent | |
a shared directory of sorts, possibly shared publicly. This is similar to sending a share link | |
to a directory in a file syncing service (e.g. Dropbox). | |
2. A data tree as part of a space, but not mirroring the structure of that space. This would also | ||
have the `m.data_tree` subtype, and would best represent a shared drive within that space. | ||
3. The space itself with no subtyping. This usually indicates that the space is structured such that | ||
people can browse files uploaded anywhere for easier exploration. This is expected to be used | ||
in conjunction with case 2. This case would end up potentially replacing the "Files" panel in | ||
many conversational clients. | ||
|
||
A limited example of what this would look like follows, where 📂 denotes an `m.data_tree` space, 📄 denotes | |
a file/leaf (described later), and `+` denotes a space: | |
|
||
``` | ||
+ Acme Co. | |
    + Sales Team | |
        + 📂 Quarterly objectives | |
            + 📂 Q1 2021 | |
                - 📄 Targets | |
                - 📄 End of quarter report | |
            + 📂 Q2 2021 | |
                - 📄 Targets | |
            + 📂 Q3 2021 | |
            + 📂 Q4 2021 | |
    + HR | |
        - 📄 WIP: Time off requests v2 | |
        + 📂 Personnel files | |
        + 📂 Policies | |
            - 📄 Time off requests | |
        + 📂 Contract templates | |
``` | |
|
||
In the example, the sales team has set up a subspace to hold all of their files and folders ("case 2" | ||
from above). Access control would likely be a subset of Acme Co.'s members, limited to the sales team | ||
space specifically. The HR space has a similar structure, though it has decided to use a room which is *not* | |
subtyped to `m.data_tree` to upload some work-in-progress policies. The HR team also has a shared drive | |
which would almost certainly have space-defined access control. | ||
|
||
In both teams' cases, clients would typically not render the 📂 trees as browseable in a room list. The | |
client would likely expose a "View files" button which then takes the user to a file browser of sorts | |
for the user to explore. The WIP policies would likely show up in the "Files" panel of the client, alongside | |
a link to explore the 📂 trees. | |
|
||
Tree spaces may contain non-space rooms under them to help perform access control. For example: | ||
|
||
``` | ||
+ My Folder | |
    - Regular room 1 | |
    - Regular room 2 | |
    + 📂 Subfolder 1 | |
``` | ||
|
||
When this happens, the rooms are treated effectively as more buckets under that parent node. In the above | ||
example this would mean that anything posted to either "Regular room" would be listed under "My Folder" | ||
instead. This is expected to be a rare choice of data structure, though can theoretically be used to | ||
group files within a directory for simpler access control. Note that the "My Folder" space does not need | ||
to be subtyped to have this happen. | ||
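
The bucket behaviour described above can be sketched as follows. The function name and the three lookup dictionaries are hypothetical stand-ins for whatever room state a client already tracks; the only rule taken from the text is that leaves in non-space child rooms are listed under the parent folder, while subspaces remain folders of their own.

```python
from typing import Dict, List


def files_for_folder(
    folder_id: str,
    children: Dict[str, List[str]],
    is_space: Dict[str, bool],
    leaves_by_room: Dict[str, List[str]],
) -> List[str]:
    """List the leaves shown directly under a folder (tree space)."""
    # Leaves posted in the tree space's own room come first.
    listing = list(leaves_by_room.get(folder_id, []))
    for child_id in children.get(folder_id, []):
        if is_space.get(child_id, False):
            continue  # a subspace is its own folder, not a bucket
        # A regular room acts as an extra bucket under the parent node.
        listing.extend(leaves_by_room.get(child_id, []))
    return listing
```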
|
||
Files are represented as room events, either in the tree space's own room or in any non-space room under | |
that tree space. This is done by exposing a generic `m.leaf` type which is intended purely to encourage | |
proper rendering within the extensible events scheme. | |
|
||
This intentionally does not use state events to represent each leaf, as encrypted state events are not | |
currently possible. Other MSCs may wish to optimize the lookup of regular room events, though for now the | |
intention is that clients would parse events themselves. | |
|
||
A file would look something like this (when using extensible events): | ||
|
||
```json5 | ||
{ | ||
"type": "m.file", | ||
"content": { | ||
"m.text": "targets.docx (12 KB)", | ||
"m.file": { | ||
"url": "mxc://example.org/abc123", | ||
"name": "targets.docx", | ||
"mimetype": "application/vnd.openxmlformats-officedocument.wordprocessingml.document", | ||
"size": 12000 | ||
}, | ||
|
||
"m.leaf": {} | ||
} | ||
} | ||
``` | ||
|
||
Note that this would allow non-file types to be included in the tree. Clients would filter out anything | ||
that doesn't make sense for their use case, such as ignoring text-only events. Events missing the `m.leaf` | ||
description would be excluded as non-leaf events. No content for `m.leaf` is currently defined - clients | ||
can interpret labels for files/other types through the extensible events format. | ||
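
A minimal sketch of the filtering described above, assuming events are plain dictionaries shaped like the example. The helper names are hypothetical. Note that `m.leaf` is currently an empty object, so the check has to test for the key's presence rather than its truthiness.

```python
from typing import Any, Dict, Iterable, List

LEAF_KEY = "m.leaf"


def is_leaf(event: Dict[str, Any]) -> bool:
    """An event is a leaf only if its content carries an m.leaf object."""
    content = event.get("content")
    # {} is falsy in Python, so check the type, not the truth value.
    return isinstance(content, dict) and isinstance(content.get(LEAF_KEY), dict)


def file_leaves(events: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Example client policy: keep leaves with an m.file, skip text-only ones."""
    return [e for e in events if is_leaf(e) and "m.file" in e["content"]]
```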
|
||
The `m.leaf` type is intentionally not used as the event type as fundamentally the user is uploading a file, | ||
not a leaf. The leaf is essentially metadata on the event to describe how it could be rendered by clients | ||
behaving in a suitable way. Otherwise, it's deliberate that the event shows up as a regular file upload in | ||
the room. | ||
|
||
Events can be edited to add the `m.leaf` metadata, adding them to the tree. | ||
|
||
Since room events can be encrypted, the `m.leaf` metadata can end up encrypted too. This could | |
potentially make it harder for clients/servers to find *just* leaf events. As a workaround, clients can | |
include the `m.leaf` metadata in the cleartext `content` of the encrypted event so it can be found by servers. | |
Clients MUST still include an encrypted copy in the event content, which clients MUST prefer over the | |
plaintext version. As an example, this could look like (keys will not be accurate): | |
|
||
```json5 | ||
// Encrypted event | ||
{ | ||
"type": "m.file", | ||
"content": { | ||
"algorithm": "m.megolm.v1.aes-sha2", | ||
"ciphertext": "Awga...oEkC", | ||
"device_id": "UCCUUHBQQM", | ||
"sender_key": "Vn+E+aPjvlbf14j1OWCIe5IlkTLZ4Zft628Mw8RysG4", | ||
"session_id": "uXWJgrndwkutoKQVqsTsdamRDKqBAkgBawjeqaB+81s", | ||
"m.leaf": {} | ||
} | ||
} | ||
``` | ||
|
||
```json5 | ||
// Decrypted copy of event | ||
{ | ||
"type": "m.file", | ||
"content": { | ||
"m.text": "targets.docx (12 KB)", | ||
"m.file": { | ||
"url": "mxc://example.org/abc123", | ||
"name": "targets.docx", | ||
"mimetype": "application/vnd.openxmlformats-officedocument.wordprocessingml.document", | ||
"size": 12000 | ||
}, | ||
"m.leaf": { | ||
"com.example.custom_field": true | ||
} | ||
} | ||
} | ||
``` | ||
|
||
Note how the encrypted event excludes the custom field but the decrypted copy does not. This is to ensure | |
there is no unnecessary disclosure of information. Clients MUST NOT trust the `m.leaf` in the encrypted | |
event and MUST only consider the decrypted copy's `m.leaf`. This is to ensure that an `m.leaf` is *always* | |
present on an event that needs it, as some clients might optimize out the `m.leaf` without carrying it over. | |
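
The trust rule above can be sketched as follows. The helper names are hypothetical, and treating any event whose content has a `ciphertext` field as encrypted is an assumption made to match the example; decryption itself is out of scope and is represented by passing in the decrypted content (or `None` when keys are unavailable).

```python
from typing import Any, Dict, Optional

LEAF_KEY = "m.leaf"


def looks_encrypted(event: Dict[str, Any]) -> bool:
    """Assumption: an encrypted event carries a ciphertext in its content."""
    content = event.get("content")
    return isinstance(content, dict) and "ciphertext" in content


def has_cleartext_leaf_hint(event: Dict[str, Any]) -> bool:
    """Search hint only (e.g. for finding candidate events); never trusted."""
    content = event.get("content")
    return isinstance(content, dict) and LEAF_KEY in content


def trusted_leaf(
    event: Dict[str, Any], decrypted_content: Optional[Dict[str, Any]]
) -> Optional[Dict[str, Any]]:
    """Return the m.leaf a client should act on, or None if there is none."""
    if looks_encrypted(event):
        # Only the decrypted copy counts; the cleartext hint is ignored.
        source = decrypted_content
    else:
        source = event.get("content")
    if not isinstance(source, dict):
        return None  # undecryptable or malformed: not a leaf we can trust
    leaf = source.get(LEAF_KEY)
    return leaf if isinstance(leaf, dict) else None
```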
|
||
**TODO: Decide on index versus the above (`m.leaf` accessible by server). Index is below.** | ||
|
||
The client is expected to maintain a "branch" structure in the room state, denoting the active files and | |
where to find those files. This is done through an `m.branch` state event, where the state key is the event | |
ID of the file. An `m.branch` event looks like this: | |
|
||
```json5 | ||
{ | ||
"type": "m.branch", | ||
"state_key": "$event", | ||
"content": { | ||
"active": true | ||
// Inline review comment: "note for mostly myself: we include a file name here to make simple edits easier." | |
||
} | ||
} | ||
``` | ||
|
||
When `active` is not exactly `true`, the file is considered invalid/inactive. Clients should ignore inactive | |
files. Clients should make reasonable efforts to resolve the latest version of a file: an edited file event | |
shouldn't need `m.branch` switching. | |
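
As a rough sketch of the `active` rule, assuming the room state is available as a list of event dictionaries (the function name is hypothetical). The identity check mirrors "exactly `true`": values such as `1` or `"true"` do not count.

```python
from typing import Any, Dict, Iterable, List


def active_file_event_ids(state_events: Iterable[Dict[str, Any]]) -> List[str]:
    """Return the file event IDs whose m.branch entry is active."""
    active: List[str] = []
    for event in state_events:
        if event.get("type") != "m.branch":
            continue
        content = event.get("content")
        # `is True` rejects truthy look-alikes such as 1 or "true".
        if isinstance(content, dict) and content.get("active") is True:
            active.append(event["state_key"])
    return active
```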
|
||
For some common operations: | ||
* Deleting a file would mean redacting the event. | ||
* Updating a file could mean editing it, or redacting and re-sending. | ||
* Changing who can view a file could mean using an encrypted room and withholding keys. | |
* Changing upload permissions would mean altering power levels. | ||
* Renaming a file would mean editing the label. | ||
* Moving a file would mean redacting and re-sending in the right tree. | ||
* Comments/notes on a file could be threads off the file. | ||
* Anonymous browsing would be peeking into the various rooms. | ||
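
To make one of the operations above concrete, here is a sketch of building the content for a rename as an edit, following the `m.replace` / `m.new_content` shape from MSC2676. The function name is hypothetical, and treating `name` inside `m.file` as "the label" is an assumption; the proposal does not pin down which field holds it.

```python
import copy
from typing import Any, Dict


def build_rename_edit(
    original_event_id: str, original_content: Dict[str, Any], new_name: str
) -> Dict[str, Any]:
    """Build edit content that renames a file while keeping it in the tree."""
    new_content = copy.deepcopy(original_content)
    # Do not nest relation data from an earlier edit inside the replacement.
    new_content.pop("m.relates_to", None)
    new_content.pop("m.new_content", None)
    # Assumption: the file's label lives in m.file.name.
    new_content.setdefault("m.file", {})["name"] = new_name

    # Top-level fields act as a fallback for clients without edit support.
    edit = copy.deepcopy(new_content)
    edit["m.new_content"] = new_content
    edit["m.relates_to"] = {"rel_type": "m.replace", "event_id": original_event_id}
    return edit
```

Because the original content is copied, any `m.leaf` metadata carries over into the replacement, so the renamed file stays a leaf.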
|
||
Implementation-wise, the following may be useful: | ||
* Using space summaries to render the directory structure. | ||
* Peeking to get file listings. | ||
* Group access controls to control who can (and can't) upload/view files. | ||
* Encryption to protect files and get finer control over visibility. | ||
* History visibility and join rules to manage publicity of the files. | ||
* Room directory for discovering public file shares. | ||
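
As an illustration of rendering the directory structure, the sketch below builds the tree from `m.space.child` state events (MSC1772) rather than from a space summary response, whose exact format is not assumed here. The function names and the `names` lookup are hypothetical; skipping children whose content lacks `via` reflects how removed children are treated in MSC1772.

```python
from typing import Any, Dict, List, Optional, Set


def child_map(state_by_room: Dict[str, List[Dict[str, Any]]]) -> Dict[str, List[str]]:
    """Map each room ID to the child room IDs declared in its state."""
    children: Dict[str, List[str]] = {}
    for room_id, events in state_by_room.items():
        kids: List[str] = []
        for event in events:
            content = event.get("content")
            if (
                event.get("type") == "m.space.child"
                and isinstance(content, dict)
                and content.get("via")  # no `via` means the child was removed
            ):
                kids.append(event["state_key"])
        children[room_id] = kids
    return children


def render_tree(
    root: str,
    children: Dict[str, List[str]],
    names: Dict[str, str],
    depth: int = 0,
    seen: Optional[Set[str]] = None,
) -> List[str]:
    """Render the hierarchy as indented lines, one per space or room."""
    seen = set() if seen is None else seen
    if root in seen:
        return []  # each room is rendered once; this also breaks cycles
    seen.add(root)
    lines = ["    " * depth + "+ " + names.get(root, root)]
    for child in children.get(root, []):
        lines.extend(render_tree(child, children, names, depth + 1, seen))
    return lines
```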
|
||
## Potential issues | ||
|
||
***TODO*** | ||
|
||
## Alternatives | ||
|
||
***TODO*** | ||
|
||
## Security considerations | ||
|
||
***TODO*** | ||
|
||
## Unstable prefix | ||
|
||
While this MSC is not in a stable version of the specification, implementations should | ||
use `org.matrix.msc3089.` in place of `m.` - this means, for example, `org.matrix.msc3089.leaf` | ||
as an identifier. |
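
A small sketch of the prefix substitution, assuming it should apply only to the identifiers this proposal introduces (`m.leaf`, `m.branch`, `m.data_tree`) and not to identifiers owned by other MSCs such as `m.file`. The constant and function names are hypothetical.

```python
UNSTABLE_PREFIX = "org.matrix.msc3089."

# Assumption: only identifiers introduced by this proposal are rewritten.
MSC3089_IDENTIFIERS = frozenset({"m.leaf", "m.branch", "m.data_tree"})


def unstable(identifier: str) -> str:
    """Swap the `m.` prefix for the unstable one on MSC3089 identifiers."""
    if identifier in MSC3089_IDENTIFIERS:
        return UNSTABLE_PREFIX + identifier[len("m."):]
    return identifier  # identifiers from other proposals are left alone
```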
Review comment: When you say "room subtype" here, should I assume MSC3088 `m.room.purpose` state events with a state key of `m.data_tree`? (There's enough MSCs that mention types floating around that I'm having trouble distinguishing them all...)

Reply: yes, sorry. Will leave this open as a reminder to clarify.