Installation Instructions

Stéphane Campinas edited this page May 6, 2014 · 8 revisions

Getting Started

The following needs to be installed:

  • Tomcat6 or Tomcat7;
  • Maven3;
  • Java 1.6.0_*;
  • a SPARQL endpoint with SPARQL1.1 support.
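Before building, it can help to confirm that the command-line tools are on the PATH. The following is a minimal sketch, not part of the SPARQLed distribution: the function names check_tool and check_prerequisites are hypothetical, and the command names java and mvn are assumed to be how Java and Maven are installed on your system.

```shell
#!/bin/sh
# Report whether a single command is available on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found:   $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Check every tool given as an argument; return non-zero if any is missing.
check_prerequisites() {
  status=0
  for tool in "$@"; do
    check_tool "$tool" || status=1
  done
  return "$status"
}

# Example (uncomment to run); assumes the commands are named java and mvn:
# check_prerequisites java mvn && java -version && mvn -version
```

This only checks that the commands exist. The versions reported by java -version and mvn -version still need to be compared by hand with the list above.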

The following command creates the file sparql-summary-assembly.jar in the directory target/. This jar is used to compute the Data Graph Summary (DGS) and to run other operations against a SPARQL endpoint:

mvn --projects sparql-summary package assembly:assembly --also-make

A detailed explanation of the sparql-summary-assembly.jar command line interface is found in the README file, in the sparql-summary module.

Computing the Data Graph Summary

The DGS implementation in this distribution is SPARQL-based.

A high-performance, Hadoop-based DGS implementation is also available. Please inquire if you are interested.

The following command creates the DGS over your dataset in the named graph http://www.acme.org, which is stored in a SPARQL endpoint located at http://url/to/sparql/endpoint/. The output is a gzip-compressed N-Triples file, /path/to/<data>-summary.nt.gz. It is strongly advised to give the JVM enough memory to prevent an OutOfMemoryError (OOME), e.g., -Xmx2048m.

java -cp target/sparql-summary-assembly.jar org.sindice.summary.Pipeline \
                                            --type HTTP \
                                            --repository http://url/to/sparql/endpoint/ \
                                            --domain http://www.acme.org \
                                            --outputfile /path/to/<data>-summary.nt.gz
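The memory advice above is a JVM option, so it goes before -cp rather than among the Pipeline arguments. The sketch below only prints the resulting command so it can be reviewed first. The function name summary_cmd and the output file name acme-summary.nt.gz are hypothetical, and the heap size should be adjusted to your dataset.

```shell
# Print (without running) the Pipeline command with an explicit JVM heap size.
# Arguments: heap size, endpoint URL, named graph, output file.
summary_cmd() {
  heap=$1
  endpoint=$2
  graph=$3
  output=$4
  printf 'java -Xmx%s -cp target/sparql-summary-assembly.jar org.sindice.summary.Pipeline --type HTTP --repository %s --domain %s --outputfile %s\n' \
    "$heap" "$endpoint" "$graph" "$output"
}

# Show the command for the example endpoint and graph used on this page.
# "acme-summary.nt.gz" is a made-up file name for illustration.
summary_cmd 2048m http://url/to/sparql/endpoint/ http://www.acme.org /path/to/acme-summary.nt.gz
```

Once the printed line looks right, it can be pasted into a terminal from the directory that contains target/.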

A SPARQL endpoint must then be loaded with the computed DGS NTriples, under the named graph http://sindice.com/analytics.
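How the N-Triples are loaded depends on the triple store. As one possible approach, the sketch below prints a curl command that uploads the file through the SPARQL 1.1 Graph Store HTTP Protocol. It assumes that your store implements that protocol and accepts the application/n-triples content type (some stores expect text/plain instead). The Graph Store URL and the function names are hypothetical.

```shell
# Percent-encode ':' and '/' so a graph URI can be passed as a query parameter.
# This covers simple http:// URIs only; it is not a general URL encoder.
encode_graph_uri() {
  printf '%s' "$1" | sed -e 's/:/%3A/g' -e 's,/,%2F,g'
}

# Print the command that would upload a gzipped N-Triples file into a named graph.
# $1: Graph Store endpoint (hypothetical URL), $2: named graph URI, $3: .nt.gz file
load_summary_cmd() {
  printf "gunzip -c '%s' | curl --fail -X POST -H 'Content-Type: application/n-triples' --data-binary @- '%s?graph=%s'\n" \
    "$3" "$1" "$(encode_graph_uri "$2")"
}

# Example: print the upload command for the DGS named graph.
load_summary_cmd http://url/to/graph-store http://sindice.com/analytics /path/to/acme-summary.nt.gz
```

A POST to a Graph Store endpoint merges the triples into the target graph, so an outdated summary should be removed from the graph first if you recompute the DGS.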

Using Virtuoso

To compute the DGS against a Virtuoso server, run the org.sindice.summary.Pipeline class with the VIRTUOSO type:

java -cp target/sparql-summary-assembly.jar org.sindice.summary.Pipeline \
                                            --type VIRTUOSO \
                                            --repository jdbc:virtuoso://<hostname>:1111 \
                                            --user UID \
                                            --pass PWD \
                                            --domain http://www.acme.org \
                                            --outputfile /path/to/<data>-summary.nt.gz

In the repository URL, 1111 is the Virtuoso Server port. You can find yours in the virtuoso.ini file under the [Parameters] section. UID is a valid username to connect to the Virtuoso instance at <hostname>, and PWD the associated password.

Since computing the DGS is demanding, you may want to increase Virtuoso's query timeout by changing the following parameter in the virtuoso.ini file:

MaxQueryExecutionTime=X

where X is the timeout in seconds. The Virtuoso server must be restarted after changing that file.
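The edit can also be scripted. The following is a minimal sketch, assuming the parameter is already present in the file as a line of the form MaxQueryExecutionTime = value. The function name set_max_query_time is hypothetical, and it is worth backing up virtuoso.ini before running it.

```shell
# Set MaxQueryExecutionTime in an ini file to the given number of seconds.
# Only lines that already define the parameter are rewritten; any trailing
# comment on such a line is dropped.
set_max_query_time() {
  ini_file=$1
  seconds=$2
  tmp=$(mktemp) || return 1
  # Keep everything up to and including '=', then replace the rest of the line.
  sed "s/^\([[:space:]]*MaxQueryExecutionTime[[:space:]]*=\).*/\1 ${seconds}/" "$ini_file" > "$tmp" \
    && cat "$tmp" > "$ini_file"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Example (uncomment to run); the path to virtuoso.ini depends on your installation:
# set_max_query_time /path/to/virtuoso.ini 600
```

Writing back with cat instead of mv keeps the owner and permissions of the original file.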

Configuring the Webapp

Stop Tomcat

sudo service tomcat6 stop

Edit Tomcat Properties

Set the sindice.home property to the path of a folder where the webapp can write configuration files and log output.

Edit the file /etc/default/tomcat6 and update the JAVA_OPTS property:

JAVA_OPTS="-Dsindice.home=/path/to/sindice/home"

Give the tomcat6 user ownership of that folder, so that the deployed webapp can write to the sindice.home folder:

sudo chown -R tomcat6:tomcat6 /path/to/sindice/home
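A quick way to confirm the permissions is to try creating a file in the folder as the user that runs Tomcat. The sketch below is one way to do that; the function name can_write_home is hypothetical, and it checks access for whichever user runs it.

```shell
# Succeed only if the directory exists and the current user can create a file in it.
can_write_home() {
  dir=$1
  if [ ! -d "$dir" ]; then
    echo "not a directory: $dir" >&2
    return 1
  fi
  probe_file="$dir/.write-test.$$"
  # Try to create an empty probe file, then clean it up again.
  if ( : > "$probe_file" ) 2>/dev/null; then
    rm -f "$probe_file"
    return 0
  fi
  echo "cannot write to: $dir" >&2
  return 1
}

# Example (uncomment to run):
# can_write_home /path/to/sindice/home && echo "sindice.home is writable"
```

To test the tomcat6 account rather than your own, put the function and the example call in a script and run that script with sudo -u tomcat6.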

Create the .war file

Edit the default XML config file of SPARQLed, recommendation-servlet/src/main/resources/default-config.xml.

You have to set the URL of the SPARQL endpoint(s) where your data and the DGS graph are stored. Here we assume that everything is inside a single endpoint, which is accessible through HTTP.

Information about your data is set under the proxy tag. Information about the DGS is set under the recommender tag.

<backend>HTTP</backend>
<backendArgs>http://path/to/sparql/endpoint</backendArgs>

You can use other types of endpoint such as the Sesame Native repository. To do so, set the backend as NATIVE, and set the backendArgs to the Native repository path.

The sparqled.war file is created using the following command, and is located in the directory recommendation-servlet/target/:

mvn --projects recommendation-servlet --also-make package

Launching the SPARQLed Webapp

With the SPARQL endpoint ready and the configuration files correctly set up, we are now ready to launch the webapp!

Clean Previously Installed Webapp

Remove any previously deployed SPARQLed webapp, in order to avoid unexpected issues:

rm -rf /path/to/sindice/home/sparqled
rm -rf $CATALINA_BASE/webapps/sparqled*

Run Tomcat

Copy the sparqled.war file to the webapps folder of the Tomcat base directory and start Tomcat:

sudo cp recommendation-servlet/target/sparqled.war $CATALINA_BASE/webapps/

sudo service tomcat6 start
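The stop, clean, copy and start steps can be collected into one function for repeated deployments. The following is a sketch only: the names run and redeploy_sparqled and the DRY_RUN variable are hypothetical, and the cleanup removes the sparqled folder and sparqled.war explicitly instead of using the sparqled* glob.

```shell
# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Redeploy: stop Tomcat, remove the old deployment, copy the new .war, start Tomcat.
# $1: sindice.home folder, $2: Tomcat base directory (CATALINA_BASE).
# Must be called from the root of the source tree, where the .war was built.
redeploy_sparqled() {
  sindice_home=$1
  base=$2
  run sudo service tomcat6 stop &&
    run rm -rf "$sindice_home/sparqled" &&
    run rm -rf "$base/webapps/sparqled" "$base/webapps/sparqled.war" &&
    run sudo cp recommendation-servlet/target/sparqled.war "$base/webapps/" &&
    run sudo service tomcat6 start
}

# Example (uncomment to run); with DRY_RUN=1 the commands are printed, not executed:
# DRY_RUN=1
# redeploy_sparqled /path/to/sindice/home "$CATALINA_BASE"
```

Each step runs only if the previous one succeeded, so a failed copy does not restart Tomcat with a missing webapp.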

Using SPARQLed

If everything went well, the SPARQLed webapp is available at:

http://localhost:8080/sparqled/
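Tomcat may need some time to unpack and start the webapp, so the URL can fail for a short while after the service starts. The sketch below polls the URL until it answers. The function names probe and wait_for_webapp are hypothetical, curl is assumed to be installed, and the default of 30 attempts at 2-second intervals is an arbitrary choice.

```shell
# Return success if the URL answers with a non-error HTTP status.
probe() {
  curl --silent --fail --output /dev/null "$1"
}

# Poll a URL until it responds or the attempts run out.
# $1: URL, $2: number of attempts (default 30), $3: seconds between attempts (default 2)
wait_for_webapp() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if probe "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (uncomment to run):
# wait_for_webapp http://localhost:8080/sparqled/ && echo "SPARQLed is up"
```

If the function gives up, the Tomcat and SPARQLed logs are the next place to look.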

You can monitor the webapp by looking at these logs:

tail -f $CATALINA_BASE/logs/catalina.out
tail -f /path/to/sindice/home/log/sparqled/sparqled.log

The page Capabilities of SPARQLed contains a full overview of what you can do.