der edited this page Mar 28, 2013 · 4 revisions

A collection of usage notes and question/answers that might be helpful for future development, deployment or usage.

What’s the motivation for the registry?

The design notes define a set of capabilities for the registry but don’t explain why those are needed. We do need some good material on the motivation and usage of the registry, but in the meantime here’s a brief placeholder.

At its heart, the purpose of the registry is to enable coordinated publishing and use of data across organizations. A key part of coordination is to use agreed shared terms for things, where terms includes codes, code lists, spatial object identifiers, namespaces and vocabulary elements. In particular, in many domains we need a notion of completeness – these terms, and only these terms are valid for this usage. This is needed both for error checking on publication and so that consuming applications can be certain they can cover all cases. While this sounds opposed to the whole open world nature of RDF and Linked Data, it is not that black & white. There’s no implication that everything be closed, simply that there are cases where closed world validation is useful and indeed necessary to create a trusted infrastructure.

So the purpose of the registry is to allow some set of organizations to allocate and publish terms, without clashing, and to discover and check the status of terms in use.

In a Linked Data world we would like all our terms to be identified by URIs and we’d like those URIs to dereference returning useful data describing the term. Which means that in such a world a registry has to create and manage a shared URI namespace and provide access to published RDF associated with terms.

It’s possible that this could be achieved with just a PURL-like service, where the registry simply redirects parts of a URI namespace to some published data but doesn’t itself hold any of that data. The linked data registry goes further than this for a couple of reasons.

Firstly, a key requirement of the registry is to hold metadata about terms (their status, who published them, etc.) and historical records of them. That metadata is about the disposition of the registry organization to the term, not the term itself. Clearly all that metadata should itself be Linked Data, so the registry has to also be a repository of at least metadata. Since there’s basically no difference between data and metadata, once we have an API for registration which includes the ability to publish and access the metadata, we already have something that can act as a repository or publishing platform.

Secondly, for many organizations the cost of publishing reference data as Linked Data is too great if they don’t already have a suitable infrastructure of their own. By allowing such organizations to publish terms directly into the repository we greatly reduce the barrier to use of URIs for term identifiers.

Looking at it the other way, by starting with a set of requirements for a registry (including holding metadata and optionally some actual data) we’ve ended up with a very general design. It is a platform that provides a hierarchical collection of Linked Data resources, along with management of those resources including metadata, versioning and history, and which allows for sharing of namespaces between organizations. Think what you could do with a platform like that, whether or not you call it a registry.

Deprecation of terms referenced from multiple registers

As noted in Issue 13 there are situations where the same term is used in different registers. How can deprecation of such terms be handled?

As discussed in the comments there, the status of a term is relative to a particular register. So the same term might be deprecated for use in one situation but valid for use in another. In cases where all uses of a term should be withdrawn, the registry API provides the ?entity=<..> command for locating all references to a given entity. Tools may use this facility to locate, and change the status of, all uses of a term.
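
As a rough illustration of how a tool might form such a lookup request, here is a minimal Python sketch. Only the `entity` query parameter comes from the text above; the registry address, the example entity URI and the helper name `entity_lookup_url` are hypothetical, and the sketch assumes the query can simply be appended to a registry URL. It only builds the URL and does not issue the request or interpret the response.

```python
from urllib.parse import urlencode


def entity_lookup_url(base_url: str, entity_uri: str) -> str:
    """Build a URL asking the registry for references to entity_uri.

    base_url is a hypothetical registry address; only the 'entity'
    query parameter is taken from the registry notes.
    """
    # urlencode percent-encodes the entity URI so that it travels
    # as a single query value.
    return f"{base_url}?{urlencode({'entity': entity_uri})}"


# Hypothetical registry address and term URI, for illustration only.
lookup = entity_lookup_url(
    "http://registry.example.org/",
    "http://example.org/def/widget",
)
print(lookup)
```

A tool would then fetch this URL and work through the returned references, updating the status of each one as required.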

Consequences of proxying

Using a DelegatedRegister or a NamespaceForward with code 200 should be done with care. These are primarily useful in situations where the service being mapped to is designed to be accessed through the registry instance, for example because its data URIs are based in the registry namespace.

Among points to watch out for are:

  • When proxy-style forwarding to an Elda service, the registry address will act like a staging site and the page URLs will be rewritten to point to the registry address. See Issue 22.
  • When proxy-style forwarding to pages which embed services limited by an API key, those services may break. For example, LDA endpoints often embed OpenSpaces map services which use an API key tied to the referrer. Either the API key needs to be suitable for use from the registry namespace, or a redirect rather than proxy mode should be used.

Multilingual content

RegisterItems and entities may be labelled and described using text in different languages using standard RDF support for language-tagged literals.

The registry UI will choose which language version to show for any given label or text based on the Accept-Language header in the request (which for a browser is typically set as part of defining the user’s locale).
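
To make the idea concrete, the following Python sketch shows one way a label could be selected from an Accept-Language header. It is not the registry’s actual implementation: the function names, the fallback to a default language, and the rule of trying the primary subtag (for example "fr" for "fr-CA") are all assumptions made for illustration, and label language tags are assumed to be lower case.

```python
from typing import Dict, List


def parse_accept_language(header: str) -> List[str]:
    """Return the language tags in an Accept-Language header, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a tag without an explicit q value has weight 1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(weighted)]


def choose_label(labels: Dict[str, str], header: str, default_lang: str = "en") -> str:
    """Pick the label whose language best matches the header (assumed policy)."""
    for tag in parse_accept_language(header):
        if tag == "*":
            break
        # Try the full tag first, then its primary subtag ("fr-ca" -> "fr").
        for candidate in (tag, tag.split("-")[0]):
            if candidate in labels:
                return labels[candidate]
    # Fall back to the default language, then to any available label.
    return labels.get(default_lang) or next(iter(labels.values()))


# Hypothetical labels for one entity, keyed by language tag.
labels = {"en": "River", "fr": "Rivière", "de": "Fluss"}
print(choose_label(labels, "fr-CA,fr;q=0.9,en;q=0.5"))
```

The fallback order matters when a client asks for a language that has no label; here the sketch prefers the default language over an arbitrary one.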

Note that when generating multilingual content for registration, care must be taken with character set encodings. In particular, the text/turtle format specifies that text should be encoded using utf-8. If manually editing data files, ensure that your editor is set to use utf-8 encoding.
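
When data files are generated by a script rather than edited by hand, the same care applies. The Python sketch below writes a small Turtle document with language-tagged labels and states the encoding explicitly instead of relying on the platform default. The entity URI, the labels and the helper names are hypothetical examples, not content from any real register.

```python
import tempfile
from pathlib import Path

# Hypothetical entity with labels in three languages.
TURTLE = '''@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<http://example.org/def/river>
    rdfs:label "River"@en, "Rivière"@fr, "Fluss"@de .
'''


def write_turtle(path: Path, turtle_text: str) -> None:
    # Pass the encoding explicitly; the platform default is not always utf-8.
    path.write_text(turtle_text, encoding="utf-8")


def read_turtle(path: Path) -> str:
    # Decoding as utf-8 raises UnicodeDecodeError if the file contains
    # bytes that are not valid utf-8.
    return path.read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "labels.ttl"
    write_turtle(target, TURTLE)
    round_tripped = read_turtle(target)
    raw_bytes = target.read_bytes()

print(round_tripped == TURTLE)
```

Reading the file back as utf-8 before registering it is a cheap check that accented characters were not written in some other encoding.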