Add Docker Compose template for SPARQL endpoint and RDF browser. #97

Merged
merged 5 commits into from
Sep 26, 2022
1 change: 1 addition & 0 deletions docker/.env.dist
@@ -0,0 +1 @@
DBA_PASSWORD=changeme
14 changes: 14 additions & 0 deletions docker/README.md
@@ -0,0 +1,14 @@
Simple template for a Docker Compose setup based on [Virtuoso](https://github.com/openlink/virtuoso-opensource/), providing an RDF store, SPARQL Endpoint, and RDF browser (among other available features) for a quick knowledge base deployment.

# Localize

1. Replace `rdf/example.ttl` with your knowledge base
2. Add your namespaces to `initdb.d/setup.sql`
3. Adapt `docker-compose.yml`
4. Copy `.env.dist` to `.env` and change `DBA_PASSWORD`

# Start to use

1. Run `docker compose up --build`
2. Test the SPARQL endpoint at [`http://localhost:8890/sparql`](http://localhost:8890/sparql)
3. Test the RDF browser at [`http://localhost:8080/ExInstance`](http://localhost:8080/ExInstance)
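
As a quick check of the SPARQL endpoint from a script rather than a browser, something like the following could be used. This is a minimal sketch, assuming the default port mapping from `docker-compose.yml` and that the endpoint answers with the `application/sparql-results+json` media type; the helper names `build_query_url` and `run_select` are illustrative and not part of this PR.

```python
import json
import urllib.parse
import urllib.request

# Endpoint URL taken from the README; adjust it if the port mapping was changed.
ENDPOINT = "http://localhost:8890/sparql"


def build_query_url(query, endpoint=ENDPOINT):
    """Return a GET URL carrying the query in the 'query' parameter (SPARQL 1.1 Protocol)."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def run_select(query, endpoint=ENDPOINT):
    """Send a SELECT query and return the list of result bindings (hypothetical helper)."""
    request = urllib.request.Request(
        build_query_url(query, endpoint),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["results"]["bindings"]
```

With the containers running, `run_select("SELECT * { ?s ?p ?o } LIMIT 5")` should return a few bindings from the loaded `example.ttl`.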
29 changes: 29 additions & 0 deletions docker/docker-compose.yml
@@ -0,0 +1,29 @@
services:

  virtuoso:
    image: openlink/virtuoso-opensource-7
    environment:
      - DBA_PASSWORD=${DBA_PASSWORD}
      - VIRT_DATABASE_ERRORLOGLEVEL=3
      - VIRT_SPARQL_DEFAULTGRAPH=http://example.org
      - VIRT_SPARQL_DEFAULTQUERY=select distinct * {?s ?p ?o.} LIMIT 100
      - VIRT_PARAMETERS_DIRSALLOWED=., /usr/local/virtuoso-opensource/share/virtuoso/vad, /rdf
      - VIRT_PLUGINS_-=-
    volumes:
      - ./rdf:/rdf:ro
      - ./initdb.d:/opt/virtuoso-opensource/initdb.d:ro
    ports:
      - "127.0.0.1:8890:8890"
    restart: unless-stopped

  lodview:
    image: ghcr.io/konradhoeffner/lodview:22.05
    environment:
      - LodViewendpoint=http://virtuoso:8890/sparql
      - LodViewIRInamespace=http://example.org/
      - LodViewhomeUrl=/ExInstance
    ports:
      - "127.0.0.1:8080:8080"
    depends_on:
      - virtuoso
    restart: unless-stopped
9 changes: 9 additions & 0 deletions docker/initdb.d/setup.sql
@@ -0,0 +1,9 @@
log_message('Setup: Activate CORS');
update DB.DBA.HTTP_PATH set HP_OPTIONS = serialize(vector('browse_sheet', '', 'noinherit', 'yes', 'cors', '*', 'cors_restricted', 0)) where HP_LPATH = '/sparql';
log_message('Setup: Declare namespaces');
DB.DBA.XML_SET_NS_DECL ('ex', 'http://example.org/', 2);

Collaborator

@TallTed, is there a way to load the namespaces from Turtle files instead? This seems like a rather hokey way to do it.

Member

Any Turtle file may contain its own namespaces.

Those loaded via the (SQL) DB.DBA.XML_SET_NS_DECL() function are just the presets that are used when no declaration is found in a SPARQL query, as shown for the DBpedia instance.

Does that make this seem less hokey?

Collaborator

Unfortunately no. I understand that it is just setting default namespaces, but it would be much simpler for a user if a Turtle or SPARQL file could simply be provided to do so. After all, that's likely where they'll come from anyway.

But this certainly isn't a show stopper. I was just wondering if it could be simplified.

Contributor Author

I'm not sure that would even be possible to implement, because in a real-world situation you often load multiple files, which may have conflicting namespaces, and then it is undefined which one would take precedence.

Contributor

> may have conflicting namespaces

Pointer?
In practice - do prefix declarations (they aren't namespaces - that's XML!) get redeclared?

http://prefix.cc/

Contributor Author

@KonradHoeffner Sep 26, 2022

Sorry, I'm not sure what you mean by "pointer", but let me clarify what I mean and how I understand and use the terms. Feel free to correct me if I use the wrong ones:

  • "prefix", for example "owl": the abbreviation of a namespace
  • "namespace", for example "http://www.w3.org/2002/07/owl#": the unabbreviated form a prefix is mapped to

DB.DBA.XML_SET_NS_DECL is used in setup.sql to define default prefix to namespace mappings, in case a SPARQL query does not declare the prefix.

When we load N-Triples files in setup.sql, we have no other choice but to do it this way, because N-Triples files don't contain prefixes.
However, when we load Turtle or RDF/XML, the question is whether the prefixes could be taken from those files instead.
My argument against that, besides the fact that Virtuoso does not support this as far as I know, is that it does not make sense to implement it because of the following case:

Example

  1. setup.sql contains ld_dir_all ('/rdf/', '*.ttl', 'http://example.org');
  2. the rdf directory contains a.ttl and b.ttl
  3. a.ttl maps the empty prefix to http://example.org/ontology/
  4. b.ttl maps the empty prefix to http://example.org/resource/

Now the namespace mapping for the empty prefix in Virtuoso would be undefined (maybe unpredictable is a better word).
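
The load-order dependence in this example can be illustrated outside Virtuoso. The following is a minimal sketch that assumes a naive "last declaration wins" policy and uses a simple regular expression rather than a real Turtle parser; `collect_prefixes` is a hypothetical helper, and this is not how Virtuoso handles prefixes.

```python
import re

# Matches "@prefix p: <iri>" as well as the SPARQL-style "PREFIX p: <iri>".
# Deliberately naive: it ignores comments and string literals.
PREFIX_RE = re.compile(
    r"^\s*@?prefix\s+([^\s:]*):\s*<([^>]*)>", re.IGNORECASE | re.MULTILINE
)


def collect_prefixes(documents):
    """Fold the prefix declarations of several Turtle texts into one map.

    Later declarations overwrite earlier ones, so the result depends on
    the order in which the documents are processed.
    """
    mapping = {}
    for text in documents:
        for prefix, iri in PREFIX_RE.findall(text):
            mapping[prefix] = iri
    return mapping


a_ttl = "@prefix : <http://example.org/ontology/> ."
b_ttl = "@prefix : <http://example.org/resource/> ."

print(collect_prefixes([a_ttl, b_ttl])[""])  # b.ttl processed last
print(collect_prefixes([b_ttl, a_ttl])[""])  # a.ttl processed last
```

The two calls print different IRIs for the empty prefix, so unless the order in which `*.ttl` files are processed is specified, the resulting mapping is not predictable.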

Collaborator

@dbooth-boston Sep 26, 2022

I don't think that's a good reason for disallowing it. The same problem already exists if the user does this:

DB.DBA.XML_SET_NS_DECL ('ex', 'http://example.org/ontology/', 2);
DB.DBA.XML_SET_NS_DECL ('ex', 'http://example.org/resource/', 2);

The tool could warn if a prefix is redefined differently, but should not warn if it is redefined to have the same value, because it's common to define the same prefixes (the same way) in different files.

Consider this a feature enhancement suggestion. :)
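
The policy suggested here (report a prefix only when it is rebound to a different IRI, stay silent on identical redeclarations) could be sketched as follows. This is an illustration of the idea only, with a hypothetical `merge_prefixes` helper; it is not an existing Virtuoso feature.

```python
def merge_prefixes(declarations):
    """Merge (prefix, iri) pairs in the order given.

    Returns the final mapping and a list of conflicts. A conflict is
    recorded only when a prefix is rebound to a *different* IRI;
    repeating the same declaration is accepted silently.
    """
    mapping = {}
    conflicts = []
    for prefix, iri in declarations:
        previous = mapping.get(prefix)
        if previous is not None and previous != iri:
            conflicts.append((prefix, previous, iri))
        mapping[prefix] = iri  # last declaration wins
    return mapping, conflicts


same = [("ex", "http://example.org/"), ("ex", "http://example.org/")]
different = [
    ("ex", "http://example.org/ontology/"),
    ("ex", "http://example.org/resource/"),
]

print(merge_prefixes(same)[1])       # no conflicts
print(merge_prefixes(different)[1])  # one conflict for "ex"
```

A loader could turn each recorded conflict into a log warning while still applying the last declaration.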

Contributor Author

I would argue that at least in this case there can be a defined behavior because the order of statements is deterministic, while in the "*.ttl" example it is either undefined or at least less clear. But I have to pass this feature enhancement suggestion to @TallTed, because I'm not a Virtuoso developer :-)
I don't think there is a succinct way to implement this in the setup.sql script itself.

Contributor

So it's only the empty prefix? (and ex: !)

For all others, what I see out there is strong consistency across datasets.

> one would take precedence.

A PREFIX in Turtle can redefine a prefix mid-file. So when files are loaded in sequence or concatenated, the last prefix declaration wins. Prefixes don't affect the data, only the presentation.
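
The mid-file redefinition described here can be shown with a toy expander. This is a rough sketch and not a Turtle parser: it assumes one prefix declaration or one whitespace-separated triple per line, and `expand_triples` and `SAMPLE` are made up for illustration.

```python
import re

PREFIX_LINE = re.compile(r"^\s*@?prefix\s+([^\s:]*):\s*<([^>]*)>", re.IGNORECASE)

SAMPLE = """\
PREFIX ex: <http://example.org/ontology/>
ex:a ex:p ex:b .
PREFIX ex: <http://example.org/resource/>
ex:a ex:p ex:b .
"""


def expand_triples(turtle_text):
    """Expand prefixed names using whichever declaration is in effect at that line."""
    prefixes = {}
    triples = []
    for line in turtle_text.splitlines():
        declaration = PREFIX_LINE.match(line)
        if declaration:
            # A later declaration simply replaces the earlier binding.
            prefixes[declaration.group(1)] = declaration.group(2)
            continue
        tokens = [token for token in line.split() if token != "."]
        if len(tokens) == 3:
            expanded = []
            for token in tokens:
                prefix, _, local = token.partition(":")
                expanded.append(prefixes[prefix] + local)
            triples.append(tuple(expanded))
    return triples


for triple in expand_triples(SAMPLE):
    print(triple)
```

The same text `ex:a ex:p ex:b .` expands to different IRIs before and after the redeclaration. The stored triples contain only full IRIs, which is why a prefix table matters only for how names are written and displayed.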

log_message('Setup: Load data');
ld_dir_all ('/rdf/', '*.ttl', 'http://example.org');
rdf_loader_run();
log_message('Setup: Finished');

20 changes: 20 additions & 0 deletions docker/rdf/example.ttl
@@ -0,0 +1,20 @@
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX ex: <http://example.org/>

ex:ExClass
    a owl:Class ;
    rdfs:comment "an example class"@en ;
    rdfs:label "example class"@en .

ex:ExInstance
    a ex:ExClass ;
    ex:exProperty 5 ;
    rdfs:comment "an example instance."@en ;
    rdfs:label "example instance"@en .

ex:exProperty
    a owl:DatatypeProperty ;
    rdfs:domain ex:ExClass ;
    rdfs:label "Beispielproperty"@de ,
        "example property"@en .