Persistence Layer Implementation

Aaron Coburn edited this page May 26, 2020 · 5 revisions

Trellis has been designed to make it straightforward to implement new persistence layers. Two persistence layers already exist: one builds on a triplestore, the other on a relational database.

The two main interfaces that a persistence layer needs to implement are org.trellisldp.api.ResourceService and org.trellisldp.api.AuditService. These can be joined into a single class or kept separate; either way is fine.

The methods that the ResourceService implementation needs to consider are:

CompletableFuture<Resource> get(IRI identifier);

There are some special resource types in Resource.SpecialResources for referring to missing or previously deleted resources. That is, an implementation should return SpecialResources.MISSING_RESOURCE for any ::get request for a non-existent resource. If the requested resource existed previously and has since been deleted, an implementation may return SpecialResources.DELETED_RESOURCE. However, since an implementation would need to keep track of deleted resource metadata for this, it is also quite acceptable for an implementation to simply return SpecialResources.MISSING_RESOURCE in this case.
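The lookup behavior described above can be sketched with an in-memory store. This is a minimal illustration, not Trellis code: `SketchResource`, its `MISSING` and `DELETED` sentinels, and `SketchLookupStore` are hypothetical stand-ins for `Resource`, `Resource.SpecialResources` and a real `ResourceService` implementation.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for org.trellisldp.api.Resource; not Trellis API.
class SketchResource {
    // Sentinels playing the role of MISSING_RESOURCE and DELETED_RESOURCE.
    static final SketchResource MISSING = new SketchResource("urn:sketch:missing");
    static final SketchResource DELETED = new SketchResource("urn:sketch:deleted");

    final String identifier;

    SketchResource(String identifier) {
        this.identifier = identifier;
    }
}

// Hypothetical in-memory store that illustrates only the ::get contract.
class SketchLookupStore {
    final Map<String, SketchResource> resources = new ConcurrentHashMap<>();
    // Optional bookkeeping of deleted identifiers; an implementation may omit it.
    final Set<String> tombstones = ConcurrentHashMap.newKeySet();

    CompletableFuture<SketchResource> get(String identifier) {
        SketchResource found = resources.get(identifier);
        if (found != null) {
            return CompletableFuture.completedFuture(found);
        }
        // Nothing stored: answer with one of the two sentinels.
        SketchResource sentinel = tombstones.contains(identifier)
                ? SketchResource.DELETED
                : SketchResource.MISSING;
        return CompletableFuture.completedFuture(sentinel);
    }
}
```

A store that does not track deletions would simply leave `tombstones` empty, so every unknown identifier resolves to the missing sentinel.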

CompletableFuture<Void> create(IRI identifier, IRI interactionModel, Dataset dataset, IRI container, Binary binary);

The ::create method will typically accept requests with null values for container and binary. A null binary value should be expected for all RDF-based resources. For LDP-NR resources, the binary value will carry metadata about the binary content.

It should be expected that the interactionModel value will be drawn from the LDP namespace.
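An implementation may want to sanity-check these arguments before writing anything. The following is a minimal sketch under stated assumptions: `SketchCreateArguments` and its methods are hypothetical helpers, the interaction model is passed as a plain string rather than an IRI object, and treating a non-null binary on an RDF-based resource as a mismatch is a choice made for this example rather than something Trellis prescribes.

```java
// Hypothetical helper; class and method names are illustrative, not Trellis API.
class SketchCreateArguments {
    static final String LDP = "http://www.w3.org/ns/ldp#";
    static final String NON_RDF_SOURCE = LDP + "NonRDFSource";

    static boolean inLdpNamespace(String interactionModel) {
        return interactionModel != null && interactionModel.startsWith(LDP);
    }

    // Returns a description of the first problem found, or null if the
    // arguments look consistent with each other.
    static String check(String interactionModel, Object binary) {
        if (!inLdpNamespace(interactionModel)) {
            return "interaction model is not in the LDP namespace";
        }
        boolean isBinaryResource = NON_RDF_SOURCE.equals(interactionModel);
        if (isBinaryResource && binary == null) {
            return "LDP-NR resource without binary metadata";
        }
        if (!isBinaryResource && binary != null) {
            // Assumption of this sketch: RDF-based resources carry no binary.
            return "RDF-based resource with unexpected binary metadata";
        }
        return null;
    }
}
```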

CompletableFuture<Void> replace(IRI identifier, IRI interactionModel, Dataset dataset, IRI container, Binary binary);

The ::replace method is almost exactly the same as the ::create method, though it operates on pre-existing resources.

CompletableFuture<Void> delete(IRI identifier, IRI container);

The ::delete method will be passed a container value representing the resource's parent (or null if there is no such relationship). An implementation can choose to completely remove all trace of this resource, if desired; alternatively, an implementation may keep track of deleted resources so that subsequent requests return Resource.SpecialResources.DELETED_RESOURCE instead of Resource.SpecialResources.MISSING_RESOURCE.
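The two deletion strategies can be compared in a small sketch. Everything here is hypothetical (`SketchDeletingStore`, its `State` enum and the `keepTombstones` flag are not Trellis API); the `State` values stand in for a regular resource and the two special resources.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory store; names are illustrative, not Trellis API.
class SketchDeletingStore {
    enum State { PRESENT, MISSING, DELETED }

    // identifier -> container identifier ("" when there is no container)
    private final Map<String, String> parents = new ConcurrentHashMap<>();
    private final Set<String> tombstones = ConcurrentHashMap.newKeySet();
    private final boolean keepTombstones;

    SketchDeletingStore(boolean keepTombstones) {
        this.keepTombstones = keepTombstones;
    }

    void put(String identifier, String container) {
        parents.put(identifier, container == null ? "" : container);
        tombstones.remove(identifier);
    }

    // Mirrors the shape of ::delete(identifier, container); container may be null.
    // It is unused here because this sketch derives containment from child rows.
    CompletableFuture<Void> delete(String identifier, String container) {
        return CompletableFuture.runAsync(() -> {
            boolean existed = parents.remove(identifier) != null;
            if (existed && keepTombstones) {
                tombstones.add(identifier);
            }
        });
    }

    State state(String identifier) {
        if (parents.containsKey(identifier)) {
            return State.PRESENT;
        }
        return tombstones.contains(identifier) ? State.DELETED : State.MISSING;
    }
}
```

With `keepTombstones` set to false the store forgets the resource entirely, which matches the "remove all trace" option.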

Managing containment and membership triples

It is the responsibility of the persistence layer to keep track of containment and membership triples. This can be handled either synchronously or asynchronously. The ::create and ::replace methods will include a container attribute. This is a hint to the persistence layer to record this resource in the containment triples of the container resource. When this value is null, it can be understood that the resource has no container resource.

The containment and membership triples are part of the output when the Resource.stream method is invoked. If these triples are generated dynamically, the persistence layer will need some type of query mechanism. If the triples are stored on writes, then some form of (async) coordination will be required in the persistence layer: either a dedicated queue or a separate, coordinating thread.
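For the store-on-write approach, one possible shape of the coordination is a single worker thread that applies all containment updates in order. This is only a sketch: `SketchContainmentIndexer` and its methods are hypothetical, and a real persistence layer would write to its backing store rather than to an in-memory map.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical coordinator; not part of Trellis.
class SketchContainmentIndexer {
    // One daemon worker thread serializes every containment update.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "containment-indexer");
        t.setDaemon(true);
        return t;
    });
    private final Map<String, Set<String>> childrenByContainer = new ConcurrentHashMap<>();

    // Called from create/replace; a null container means nothing to record.
    CompletableFuture<Void> recordChild(String container, String child) {
        if (container == null) {
            return CompletableFuture.completedFuture(null);
        }
        return CompletableFuture.runAsync(() -> {
            childrenByContainer
                    .computeIfAbsent(container, k -> ConcurrentHashMap.<String>newKeySet())
                    .add(child);
        }, worker);
    }

    Set<String> children(String container) {
        return childrenByContainer.getOrDefault(container, Collections.emptySet());
    }

    void shutdown() {
        worker.shutdown();
    }
}
```

The returned future lets a caller decide whether to wait for the index update (synchronous behavior) or continue immediately (asynchronous behavior).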

Basic Containers

Supporting LDP Basic containment is not complicated. If triples are generated dynamically, it is common for resources to keep track of their parent resource. For instance, given a container, this type of query would be useful (the container IRI would be bound to the ? placeholder):

SELECT id FROM resources WHERE container=?;
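The same lookup can be sketched in memory to show how the query result turns into containment triples. The class and field names below are hypothetical; the only externally defined term is the `ldp:contains` predicate from the LDP vocabulary.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory analogue of the "resources" table; not Trellis API.
class SketchBasicContainment {
    static final String LDP_CONTAINS = "http://www.w3.org/ns/ldp#contains";

    // resource id -> container id (null when the resource has no container)
    final Map<String, String> containerOf = new LinkedHashMap<>();

    // Equivalent of: SELECT id FROM resources WHERE container=?
    List<String> childrenOf(String container) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> entry : containerOf.entrySet()) {
            if (container.equals(entry.getValue())) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }

    // One containment triple per child, serialized in N-Triples style.
    List<String> containmentTriples(String container) {
        List<String> triples = new ArrayList<>();
        for (String child : childrenOf(container)) {
            triples.add("<" + container + "> <" + LDP_CONTAINS + "> <" + child + "> .");
        }
        return triples;
    }
}
```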

Direct Containers

Direct containers are somewhat more complicated than basic containers. When these resources are created or modified, the implementation will need to inspect the incoming Dataset, looking for user-managed triples with these predicates:

  • ldp:membershipResource
  • ldp:isMemberOfRelation
  • ldp:hasMemberRelation

Direct containers come in two flavors: those with an ldp:isMemberOfRelation property and those with ldp:hasMemberRelation. All will have an ldp:membershipResource property.

Each of these properties should be stored as metadata on the resource. It is possible that the ldp:membershipResource value will contain a hashURI, so it is typically a good idea to store both the original user-supplied value and a "cleaned" version of that resource IRI.
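Keeping both forms of the membership resource could look like the sketch below. `SketchMembershipResource` is a hypothetical holder, and "cleaning" is taken here to mean dropping everything from the first `#` onward, which is an assumption of this example.

```java
// Hypothetical holder for both forms of an ldp:membershipResource value.
class SketchMembershipResource {
    final String original; // value exactly as supplied by the user
    final String cleaned;  // same IRI without its fragment, for lookups

    SketchMembershipResource(String original) {
        this.original = original;
        this.cleaned = stripFragment(original);
    }

    // Drops the fragment (the "#..." suffix) if one is present.
    static String stripFragment(String iri) {
        int hash = iri.indexOf('#');
        return hash < 0 ? iri : iri.substring(0, hash);
    }
}
```

The cleaned value is the one to compare against resource identifiers when looking up containers; the original value is the one to emit in membership triples.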

For direct containers that use ldp:hasMemberRelation, the queries fetching membership triples for a resource might look like

SELECT id, membershipResource, hasMemberRelation
   FROM resources
   WHERE member=? AND type='DirectContainer' AND hasMemberRelation IS NOT NULL;

This will find all of the direct containers pointing to the resource in question. Then, for each result, a containment-style query should be issued for all the resources whose parent matches the id value from the query above (SQL-based backends can combine both steps in a single INNER JOIN). For each of these results, the triples will take the form:

<membershipResource value> <hasMemberRelation value> <child resourceIRI> .

For direct containers with ldp:isMemberOfRelation, the query is considerably easier. Given that a resource ought to know the identifier for the parent, the query would be similar to:

SELECT membershipResource, isMemberOfRelation
  FROM resources
  WHERE id=? AND type='DirectContainer' AND isMemberOfRelation IS NOT NULL;

Here, if there is a query result, it yields a triple of the following form on the child resource:

<child resourceIRI> <isMemberOfRelation value> <membershipResource value> .
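Both flavors of direct container can be sketched over an in-memory list of rows that mirrors the columns used in the queries above. The `SketchRow` and `SketchDirectMembership` types are hypothetical, and the nested loops stand in for what a SQL backend would express as a join.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical row of the "resources" table; not Trellis API.
class SketchRow {
    final String id;
    final String container;          // parent identifier, or null
    final String type;               // e.g. "DirectContainer"
    final String member;             // cleaned ldp:membershipResource, or null
    final String membershipResource; // original ldp:membershipResource, or null
    final String hasMemberRelation;  // or null
    final String isMemberOfRelation; // or null

    SketchRow(String id, String container, String type, String member,
              String membershipResource, String hasMemberRelation, String isMemberOfRelation) {
        this.id = id;
        this.container = container;
        this.type = type;
        this.member = member;
        this.membershipResource = membershipResource;
        this.hasMemberRelation = hasMemberRelation;
        this.isMemberOfRelation = isMemberOfRelation;
    }
}

class SketchDirectMembership {
    final List<SketchRow> rows = new ArrayList<>();

    // Triples shown on `resource`, produced by direct containers that name it
    // as their membership resource and use ldp:hasMemberRelation.
    List<String> hasMemberTriples(String resource) {
        List<String> triples = new ArrayList<>();
        for (SketchRow dc : rows) {
            if (resource.equals(dc.member) && "DirectContainer".equals(dc.type)
                    && dc.hasMemberRelation != null) {
                for (SketchRow child : rows) {
                    if (dc.id.equals(child.container)) {
                        triples.add(triple(dc.membershipResource, dc.hasMemberRelation, child.id));
                    }
                }
            }
        }
        return triples;
    }

    // Triples shown on `child` when its parent is a direct container that
    // uses ldp:isMemberOfRelation.
    List<String> isMemberOfTriples(SketchRow child) {
        List<String> triples = new ArrayList<>();
        for (SketchRow parent : rows) {
            if (parent.id.equals(child.container) && "DirectContainer".equals(parent.type)
                    && parent.isMemberOfRelation != null) {
                triples.add(triple(child.id, parent.isMemberOfRelation, parent.membershipResource));
            }
        }
        return triples;
    }

    static String triple(String subject, String predicate, String object) {
        return "<" + subject + "> <" + predicate + "> <" + object + "> .";
    }
}
```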

Indirect Containers

Indirect containers are considerably more complex and resource-intensive. In addition to the properties used by direct containers, indirect containers include one more property:

  • ldp:insertedContentRelation

If the value of ldp:insertedContentRelation is ldp:MemberSubject, then the indirect container will behave exactly like a direct container. One can take this into consideration for query efficiency. It is also worth noting that indirect containers are less likely to use ldp:isMemberOfRelation. In fact, using that property with a literal value is undefined, and using it with an out-of-domain IRI (e.g. http://example.com) does not make much sense. The existing Trellis persistence layers do not support indirect containers with inverted (ldp:isMemberOfRelation) relationships.

In order to support indirect containers, a persistence layer will need to be prepared to query across all the user-managed triples in all the child resources of the indirect container. Some form of secondary index or async processing is useful here.

As with direct containers, finding the membership triples for a resource involves a query to the indirect container that points to the resource in question:

SELECT id, membershipResource, hasMemberRelation, insertedContentRelation
   FROM resources
   WHERE member=? AND type='IndirectContainer' AND hasMemberRelation IS NOT NULL
     AND insertedContentRelation IS NOT NULL;

Then, for each result, one would need to find all contained resources for that id value. For each contained resource, one would then need to inspect the user-managed triples, looking for any triple whose predicate equals the insertedContentRelation value. The objects of those triples will be used to create membership triples of this form:

<membershipResource value> <hasMemberRelation value> <object of matching child triple> .
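The steps above can be sketched in memory as follows. All type and field names are hypothetical, user-managed triples are reduced to predicate and object pairs, and object terms are assumed to be stored already serialized in N-Triples syntax so that both IRIs and literals pass through unchanged.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical metadata of one indirect container; not Trellis API.
class SketchIndirectContainer {
    final String id;
    final String member;             // cleaned ldp:membershipResource
    final String membershipResource; // original ldp:membershipResource
    final String hasMemberRelation;
    final String insertedContentRelation;

    SketchIndirectContainer(String id, String member, String membershipResource,
                            String hasMemberRelation, String insertedContentRelation) {
        this.id = id;
        this.member = member;
        this.membershipResource = membershipResource;
        this.hasMemberRelation = hasMemberRelation;
        this.insertedContentRelation = insertedContentRelation;
    }
}

class SketchIndirectMembership {
    static final String MEMBER_SUBJECT = "http://www.w3.org/ns/ldp#MemberSubject";

    final List<SketchIndirectContainer> containers = new ArrayList<>();
    // child id -> parent id
    final Map<String, String> containerOf = new LinkedHashMap<>();
    // child id -> user-managed triples as {predicate IRI, object in N-Triples syntax}
    final Map<String, List<String[]>> userTriples = new LinkedHashMap<>();

    List<String> membershipTriples(String resource) {
        List<String> out = new ArrayList<>();
        for (SketchIndirectContainer ic : containers) {
            if (!resource.equals(ic.member) || ic.hasMemberRelation == null
                    || ic.insertedContentRelation == null) {
                continue;
            }
            for (Map.Entry<String, String> entry : containerOf.entrySet()) {
                if (!ic.id.equals(entry.getValue())) {
                    continue;
                }
                String child = entry.getKey();
                if (MEMBER_SUBJECT.equals(ic.insertedContentRelation)) {
                    // Behaves like a direct container: the child itself is the member.
                    out.add(triple(ic, "<" + child + ">"));
                    continue;
                }
                for (String[] t : userTriples.getOrDefault(child, Collections.emptyList())) {
                    if (ic.insertedContentRelation.equals(t[0])) {
                        out.add(triple(ic, t[1]));
                    }
                }
            }
        }
        return out;
    }

    private static String triple(SketchIndirectContainer ic, String objectTerm) {
        return "<" + ic.membershipResource + "> <" + ic.hasMemberRelation + "> " + objectTerm + " .";
    }
}
```

Scanning every child's triples on each read is what makes this container type expensive, which is why a secondary index or asynchronous precomputation is attractive.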

The trellis-jdbc component provides an RDBMS-based persistence layer for Trellis, and it implements these queries here.

The trellis-triplestore persistence layer implements these queries here.