Trellis Architecture
The Trellis API unites the concepts of a key-value store and the interaction models of an LDP server. This makes it possible for a particular implementation to scale horizontally while supporting a standards-based model for managing resources.
In this context, the relevant methods in the Trellis API are:
/* Retrieve a resource */
CompletionStage<Resource> get(IRI);
/* Update a resource with the provided data */
CompletionStage<Void> replace(Metadata, Dataset);
/* Delete a resource */
CompletionStage<Void> delete(Metadata);
That is, get and replace are the means by which resources are retrieved and manipulated. The semantics of get and replace are also idempotent and scoped to a single resource. Even the non-idempotent HTTP methods (POST and PATCH) are decomposed in the HTTP layer to the much simpler replace method of the resource service.
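As a rough illustration of that decomposition, the sketch below reduces PATCH and POST to a read followed by a single-key write. It is not the Trellis implementation: `InMemoryService`, `patch`, and `post` are hypothetical names, IRIs are plain strings, a resource body is modeled as a set of triple strings rather than a `Dataset`, and containment bookkeeping is left out entirely.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

/** Sketch only: simplified stand-ins, not the Trellis API types. */
final class DecompositionSketch {

    /** Hypothetical key-value view of a resource service: IRI string -> set of triples. */
    static final class InMemoryService {
        private final Map<String, Set<String>> store = new ConcurrentHashMap<>();

        CompletionStage<Set<String>> get(String iri) {
            return CompletableFuture.completedFuture(store.getOrDefault(iri, Set.of()));
        }

        CompletionStage<Void> replace(String iri, Set<String> triples) {
            store.put(iri, Set.copyOf(triples));
            return CompletableFuture.completedFuture(null);
        }
    }

    /** PATCH: read the current state, apply the change, write the full new state back. */
    static CompletionStage<Void> patch(InMemoryService svc, String iri,
                                       UnaryOperator<Set<String>> change) {
        return svc.get(iri)
                  .thenCompose(current -> svc.replace(iri, change.apply(new TreeSet<>(current))));
    }

    /** POST: mint an identifier for the new child, then do a plain replace on that key. */
    static CompletionStage<String> post(InMemoryService svc, String container,
                                        Set<String> triples) {
        String prefix = container.endsWith("/") ? container : container + "/";
        String child = prefix + UUID.randomUUID();
        return svc.replace(child, triples).thenApply(done -> child);
    }
}
```

In this sketch the service layer only ever sees `get` and `replace` on one key at a time; anything non-idempotent (minting an identifier, computing a diff) happens before the write is issued.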
By relying on replace for the manipulation of all resources, an implementation can treat a resource IRI as an opaque key, independent of any implied hierarchy. This means that, in a distributed context, the data of some resource /foo may be stored on one set of servers while /foo/bar is stored on an entirely different set of servers.
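One way to picture the "opaque key" idea is a placement function that looks only at a hash of the whole IRI string and never at its path segments. The sketch below is an assumption made for illustration, not how any particular Trellis backend assigns data to servers; `OpaqueKeyPlacement`, `nodeFor`, and the node names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.CRC32;

/** Sketch only: hash-based placement that ignores the path hierarchy of the IRI. */
final class OpaqueKeyPlacement {

    /** Picks a node from the checksum of the full IRI; "/foo" and "/foo/bar" are unrelated keys. */
    static String nodeFor(String iri, List<String> nodes) {
        CRC32 crc = new CRC32();
        crc.update(iri.getBytes(StandardCharsets.UTF_8));
        // getValue() is a non-negative long, so the modulo is a valid list index.
        return nodes.get((int) (crc.getValue() % nodes.size()));
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("node-a", "node-b", "node-c");
        System.out.println("/foo     -> " + nodeFor("/foo", nodes));
        System.out.println("/foo/bar -> " + nodeFor("/foo/bar", nodes));
    }
}
```

Because the function never parses the path, nothing ties the placement of a child to the placement of its parent, which is what makes a single-key `replace` cheap to distribute.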
Making this architectural choice is the cornerstone of Trellis' ability to scale horizontally. It also introduces some restrictions on the behaviors a client can expect.
A single HTTP GET request from a client will likely be accessing data from multiple sources. Trellis resources should be considered eventually consistent in the sense that, under load, there are no transactional guarantees that a response is consistent with respect to the response headers and body. In other words, the headers of a response may be out of date by the time the body of the resource has been fetched.
There is no support for recursion in Trellis. In the context of a hierarchical datastore, that means recursive PUT and recursive DELETE are not available. For instance, given an empty Container at the server root, if a client were to create a resource at /foo/bar/baz, none of the intermediate containers would be created. If /foo/bar is subsequently created as a Container, one would find a </foo/bar> ldp:contains </foo/bar/baz> . triple, but a client will need to explicitly create such a container.
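The behavior described above can be modeled with a small in-memory store, shown here as a sketch under simplifying assumptions: `NonRecursiveStore` and its methods are hypothetical, paths are absolute with no trailing slash, the root is assumed to exist, and containment is derived from each stored resource's parent key rather than written into the parent.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Sketch only: a store in which writing a key never creates its ancestors. */
final class NonRecursiveStore {

    /** Maps each stored resource path to the path of its parent. */
    private final Map<String, String> parents = new TreeMap<>();

    /** Stores exactly one resource; intermediate containers are not created. */
    void put(String path) {
        int slash = path.lastIndexOf('/');
        String parent = slash <= 0 ? "/" : path.substring(0, slash);
        parents.put(path, parent);
    }

    /** The root is assumed to exist; everything else must have been put explicitly. */
    boolean exists(String path) {
        return "/".equals(path) || parents.containsKey(path);
    }

    /** Direct children of a container, as they would appear in ldp:contains triples. */
    List<String> contains(String container) {
        return parents.entrySet().stream()
                .filter(entry -> entry.getValue().equals(container))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        NonRecursiveStore store = new NonRecursiveStore();
        store.put("/foo/bar/baz");
        System.out.println(store.exists("/foo/bar"));   // false: no intermediate container
        store.put("/foo/bar");
        System.out.println(store.contains("/foo/bar")); // [/foo/bar/baz]
    }
}
```

Note that `/foo` still does not exist after both writes, so the hierarchy stays disconnected from the root until a client creates it.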
Similarly, given a hierarchy of resources, starting at /foo, a DELETE command issued for /foo will only affect /foo. It will not trigger the deletion of any child or other descendant resources; a client will need to explicitly delete those resources.
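A client that wants to remove a whole subtree therefore has to walk it itself. The sketch below shows one possible approach, a depth-first walk that deletes children before their parent. `LdpClient`, `deleteTree`, and `FakeServer` are hypothetical names; a real client would obtain the children from the `ldp:contains` triples of each container and issue one HTTP DELETE per resource.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Sketch only: client-driven deletion of a resource and its descendants. */
final class ExplicitDelete {

    /** Hypothetical minimal client: list direct children, delete one resource. */
    interface LdpClient {
        List<String> children(String iri);
        void delete(String iri);
    }

    /** Deletes descendants first, then the resource itself; returns the deletion order. */
    static List<String> deleteTree(LdpClient client, String iri) {
        List<String> deleted = new ArrayList<>();
        for (String child : client.children(iri)) {
            deleted.addAll(deleteTree(client, child));
        }
        client.delete(iri); // affects only this one resource
        deleted.add(iri);
        return deleted;
    }

    /** In-memory stand-in for a server, keyed by IRI with a list of direct children. */
    static final class FakeServer implements LdpClient {
        final Map<String, List<String>> containment = new TreeMap<>();

        @Override
        public List<String> children(String iri) {
            return List.copyOf(containment.getOrDefault(iri, List.of()));
        }

        @Override
        public void delete(String iri) {
            containment.remove(iri);
        }
    }
}
```

The walk is not atomic: if another client adds a child while it is running, that child survives, so concurrent writers still need to coordinate among themselves.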
In order to properly implement a recursive PUT or DELETE, a server would require strong consistency and atomicity guarantees from the underlying datastore. This is typically not a problem in a single-node (especially RDBMS) context, but it can be problematic for distributed systems where consistency cannot be taken for granted. To support the general case of recursive PUT or DELETE over a distributed datastore, an extensive locking regime would need to be in effect for every such operation. Operations that would ordinarily be very efficient would then become considerable bottlenecks for systems expecting even modest levels of concurrency.
Therefore, Trellis does not support recursion of any sort.
This decision is also in line with the principles of REST, where idempotent operations on one resource are constrained to that one resource. Any deviation Trellis makes from that principle relates to existing LDP requirements on the behavior of various container types, and those operations are typically handled in an asynchronous fashion anyway.
Another implication of this is that a client can easily create a disconnected graph of resources. If that is a concern, then a client should only use POST to create resources. DELETE operations will require some level of coordination by the client, especially if it is running in a multi-threaded or multi-processor environment.