feat(sdk): Introduce the `LinkedChunk` type.

This patch is a work-in-progress. It explores an experimental data structure to store events efficiently. Note: in this comment, I will use the term _store_ to mean _database_ or _storage_.

The biggest constraint is the following: events can be ordered in multiple ways, either in topological order or in sync order. The problem is that, when syncing events (with `/sync`) or when fetching events (with `/messages`), we **don't know** how to order the newly received events relative to the already downloaded events. A reconciliation algorithm must be written (see matrix-org#3058). However, from the “storage” point of view, events must be read, written and re-ordered efficiently.

The simplest approach would be to use an `ordering_index`, for example. Every time a new event is inserted, it takes the position of the last event, incremented by one, and we are done. However, inserting a new event in _the middle_ of existing events would shift all events on one side of the insertion point: given `a`, `b`, `c`, `d`, `e`, `f` with `f` being the most recent event, if `g` needs to be inserted between `b` and `c`, then the ordering positions of `c`, `d`, `e` and `f` need to be shifted. That's not optimal at all, as it would imply a lot of updates in the store.

Example with a relational database:

| ordering_index | event |
|----------------|-------|
| 0              | `a`   |
| 1              | `b`   |
| 2              | `g`   |
| 3              | `c`   |
| …              | …     |

An insertion can be O(n), and it can happen more frequently than one might think. Let's imagine a permalink to an old message: the user opens it, a couple of events are fetched (with `/messages`), and these events must be inserted in the store, thus potentially shifting a lot of existing events.
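To make the cost of the shift concrete, here is a minimal in-memory sketch of the ordering-index approach. The `Row` type and the `insert_at` and `sample_rows` helpers are hypothetical names used only for illustration; they are not part of this patch, and a real store would run `UPDATE` queries instead of mutating a `Vec`.

```rust
/// One row of a hypothetical `events` table ordered by an explicit index.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    ordering_index: u64,
    event: &'static str,
}

/// Inserts `event` at `position` and returns how many existing rows had to
/// be rewritten, i.e. how many updates a store would have to run.
///
/// Assumes `position <= rows.len()`; `Vec::insert` panics otherwise.
fn insert_at(rows: &mut Vec<Row>, position: usize, event: &'static str) -> usize {
    let mut updated = 0;

    // Every row at or after the insertion point is shifted by one.
    for row in rows.iter_mut().skip(position) {
        row.ordering_index += 1;
        updated += 1;
    }

    rows.insert(position, Row { ordering_index: position as u64, event });

    updated
}

/// Builds the table for `a` to `f`, with `f` being the most recent event.
fn sample_rows() -> Vec<Row> {
    ["a", "b", "c", "d", "e", "f"]
        .iter()
        .copied()
        .enumerate()
        .map(|(index, event)| Row { ordering_index: index as u64, event })
        .collect()
}
```

Calling `insert_at(&mut sample_rows(), 2, "g")` rewrites the rows of `c`, `d`, `e` and `f` before `g` can take index 2, which is the shift described above.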
Another example: imagine the SDK has a search API for events; as long as no search result is found, the SDK will back-paginate until reaching the beginning of the room; every time there is a back-pagination, a block of events will be inserted: there are more and more events to shift at each back-pagination.

OK, let's forget the `ordering_index`. Let's use a linked list then? Each event has a _link_ to the _previous_ and to the _next_ event. Inserting an event would cost at worst 3 writes in this case (noted O(3) here): if the previous event exists, it must be updated; if the next event exists, it must be updated; finally, the new event is inserted.

Example with a relational database:

| previous | id      | event | next    |
|----------|---------|-------|---------|
| null     | `id(a)` | `a`   | `id(b)` |
| `id(a)`  | `id(b)` | `b`   | `id(c)` |
| `id(b)`  | `id(c)` | `c`   | null    |

This approach ensures fast _writing_, but terribly slow _reading_. Indeed, reading N events requires N queries in the store. Events aren't contiguous in the store, and cannot be ordered by the database engine (e.g. with `ORDER BY` for an SQL-based database). So it really requires one query per event. That's a no-go.

In the two scenarios above, another problem arises. How to represent a gap? Indeed, when new events are synced (via `/sync`), sometimes the response contains a `limited` flag, which means that the results are _partial_. Let's take the following example: the store contains `a`, `b`, `c`. After a long offline period (during which the room has been pretty active), a sync is started, which provides the following events: `x`, `y`, `z` + the _limited_ flag. The app is killed and reopened later. The event cache store will contain `a`, `b`, `c`, `x`, `y`, `z`. How do we know that there is a hole/a gap between `c` and `x`? This is important information! When `z`, `y` and `x` are displayed, and the user would like to scroll up, the SDK must know that it must back-paginate before providing `c`, `b` and `a`.
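The gap problem can be sketched as follows. This is only an illustration under simplifying assumptions: `SyncTimeline`, `Entry`, `append_sync` and `contiguous_tail` are hypothetical names, events are plain strings, and the store is a flat in-memory list rather than a database.

```rust
/// Hypothetical, simplified view of the events returned by one `/sync`.
struct SyncTimeline {
    events: Vec<&'static str>,
    /// `true` when the results are partial.
    limited: bool,
}

/// What the store remembers: an event, or an explicit marker for a hole.
#[derive(Debug, Clone, PartialEq)]
enum Entry {
    Event(&'static str),
    Gap,
}

/// Appends a sync response, recording a gap first when it is `limited`.
fn append_sync(store: &mut Vec<Entry>, timeline: SyncTimeline) {
    if timeline.limited && !store.is_empty() {
        store.push(Entry::Gap);
    }

    store.extend(timeline.events.into_iter().map(Entry::Event));
}

/// Returns the most recent events that can be displayed without
/// back-paginating, i.e. everything after the last gap, oldest first.
fn contiguous_tail(store: &[Entry]) -> Vec<&'static str> {
    let mut tail: Vec<&'static str> = store
        .iter()
        .rev()
        .map_while(|entry| match entry {
            Entry::Event(event) => Some(*event),
            Entry::Gap => None,
        })
        .collect();

    tail.reverse();
    tail
}
```

With `a`, `b`, `c` in the store and a limited sync providing `x`, `y`, `z`, the store becomes `a`, `b`, `c`, gap, `x`, `y`, `z`, and `contiguous_tail` stops at the gap instead of silently continuing with `c`. Without the marker, that information would be lost once the app is restarted.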
So the data structure we use must also represent gaps. This information is also crucial for the events reconciliation algorithm.

What about a mix between the two? Here is _Linked Chunk_.

A _linked chunk_ is like a linked list, except that each node is either a _Gap_ or an _Items_. A _Gap_ contains nothing, it's just a gap. An _Items_ contains _several_ events. A node is called a _Chunk_. A _chunk_ has a maximum size, which is called a _capacity_. When a chunk is full, a new chunk is created and linked appropriately. Inside a chunk, an ordering index is used to order events. At this point, it becomes a trade-off to find the appropriate chunk size to balance the performance between reading and writing. Nonetheless, if the chunk size is 50, then reading events is 50 times more efficient with a linked chunk than with a linked list, and writing events is at worst O(49), compared to the O(n - 1) of the ordering index.

Example with a relational database. The first table is `events`, the second table is `chunks`.

| chunk_id | event_index | event |
|----------|-------------|-------|
| `$0`     | 0           | `a`   |
| `$0`     | 1           | `b`   |
| `$0`     | 2           | `c`   |
| `$0`     | 3           | `d`   |
| `$2`     | 0           | `e`   |
| `$2`     | 1           | `f`   |
| `$2`     | 2           | `g`   |
| `$2`     | 3           | `h`   |

| chunk_id | type  | previous | next |
|----------|-------|----------|------|
| `$0`     | items | null     | `$1` |
| `$1`     | gap   | `$0`     | `$2` |
| `$2`     | items | `$1`     | null |

Reading the last chunk consists of reading all events where the `chunk_id` is `$2`, for example, which gives events `e`, `f`, `g` and `h`. We can sort them easily by using the `event_index` column. The previous chunk, `$1`, is a gap. The chunk before it, `$0`, contains events `a`, `b`, `c` and `d`.

Being able to read events by chunk clearly limits the amount of reading and writing in the store. It is also close to how this store will really be used in practice. It also makes it possible to represent gaps. We can replace a gap by a new chunk pretty easily, with few writes.
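The shape of the two tables above can be sketched in memory as follows. This is a rough illustration, not the API introduced by this patch: `LinkedChunkSketch`, `Content`, `Chunk` and their methods are hypothetical names, a chunk id is simply an index into a `Vec`, and events are plain strings.

```rust
/// Content of a chunk: a hole in the timeline, or up to `capacity` events.
#[derive(Debug, PartialEq)]
enum Content {
    Gap,
    Items(Vec<&'static str>),
}

/// One row of the `chunks` table, with links to its neighbours.
#[derive(Debug)]
struct Chunk {
    previous: Option<usize>,
    next: Option<usize>,
    content: Content,
}

struct LinkedChunkSketch {
    capacity: usize,
    /// Chunks indexed by id; chunks are never removed in this sketch.
    chunks: Vec<Chunk>,
    /// Id of the most recent chunk.
    last: usize,
}

impl LinkedChunkSketch {
    /// Creates a linked chunk holding a single empty items chunk.
    fn new(capacity: usize) -> Self {
        let first = Chunk { previous: None, next: None, content: Content::Items(Vec::new()) };
        Self { capacity, chunks: vec![first], last: 0 }
    }

    /// Links a new chunk after the current last one and returns its id.
    fn push_chunk(&mut self, content: Content) -> usize {
        let id = self.chunks.len();
        self.chunks[self.last].next = Some(id);
        self.chunks.push(Chunk { previous: Some(self.last), next: None, content });
        self.last = id;
        id
    }

    /// Records a gap after the most recent chunk.
    fn push_gap(&mut self) -> usize {
        self.push_chunk(Content::Gap)
    }

    /// Appends an event, creating a new items chunk when the last chunk is
    /// full or is a gap.
    fn push_event(&mut self, event: &'static str) {
        let has_room = match &self.chunks[self.last].content {
            Content::Items(items) => items.len() < self.capacity,
            Content::Gap => false,
        };

        if has_room {
            if let Content::Items(items) = &mut self.chunks[self.last].content {
                items.push(event);
            }
        } else {
            self.push_chunk(Content::Items(vec![event]));
        }
    }

    /// Reads all events of one chunk at once; `None` for a gap or an
    /// unknown id.
    fn items(&self, id: usize) -> Option<&[&'static str]> {
        match &self.chunks.get(id)?.content {
            Content::Items(items) => Some(items.as_slice()),
            Content::Gap => None,
        }
    }
}
```

With a capacity of 4, pushing `a` to `d`, then a gap, then `e` to `h` produces chunks `0` (items), `1` (gap) and `2` (items), mirroring `$0`, `$1` and `$2` in the tables. Appending only ever touches the last chunk and, when a new chunk is created, one link.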
A summary:

| Data structure | Reading           | Writing         |
|----------------|-------------------|-----------------|
| Ordering index | “O(1)”[^1] (fast) | O(n - 1) (slow) |
| Linked list    | O(n) (slow)       | O(3) (fast)     |
| Linked chunk   | O(n / capacity)   | O(capacity - 1) |

This patch contains a draft implementation of a linked chunk. It will strictly contain only the API required by the `EventCache`; in other words, it _is not_ designed as a generic data structure type.

[^1]: O(1) because it's simply one query to run; the database engine does the sorting for us in a very efficient way, particularly if the `ordering_index` is an unsigned integer.