Installation Instructions
The following needs to be installed:
- Tomcat6 or Tomcat7;
- Maven3;
- Java 1.6.0_*;
- a SPARQL endpoint with SPARQL1.1 support.
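As a quick sanity check before building, a small shell helper can report which command-line tools are not yet on the PATH. This is only a sketch: the function name `missing_tools` is made up for illustration, and it checks for presence only, not for the versions listed above.

```shell
# Hypothetical helper: print every command from the argument list
# that cannot be found on the PATH (one name per line).
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example call (prints nothing when both tools are installed):
# missing_tools java mvn
```

An empty output means every listed tool was found; Tomcat and the SPARQL endpoint still have to be checked separately.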
The following command creates the file sparql-summary-assembly.jar in the directory target/. This jar is used to compute the Data Graph Summary (DGS) and to perform other operations on a SPARQL endpoint:
mvn --projects sparql-summary package assembly:assembly --also-make
A detailed explanation of the sparql-summary-assembly.jar command line interface is found in the README file, in the sparql-summary module.
The DGS implementation in this distribution is SPARQL-based.
A high-performance, Hadoop-based DGS implementation is also available. Please inquire if you are interested.
The following command creates the DGS of your dataset, stored in the named graph http://www.acme.org of the SPARQL endpoint located at http://url/to/sparql/endpoint/. The output is a gzip-compressed N-Triples file, /path/to/<data>-summary.nt.gz. It is strongly advised to give the JVM enough memory to prevent an OutOfMemoryError, e.g., -Xmx2048m.
java -cp target/sparql-summary-assembly.jar org.sindice.summary.Pipeline \
  --type HTTP \
  --repository http://url/to/sparql/endpoint/ \
  --domain http://www.acme.org \
  --outputfile /path/to/<data>-summary.nt.gz
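Before loading the output into an endpoint, it can be worth checking that the file is a valid gzip archive and actually contains triples. The helper below is a minimal sketch; the name `count_triples` is hypothetical, and it simply counts non-empty, non-comment lines, relying on N-Triples holding one triple per line.

```shell
# Hypothetical helper: check that the summary is a valid gzip archive and
# print the number of non-empty, non-comment lines (one triple per line).
count_triples() {
  gzip -t "$1" || return 1
  gzip -dc "$1" | grep -cv -e '^[[:space:]]*$' -e '^[[:space:]]*#'
}

# Example call:
# count_triples "/path/to/<data>-summary.nt.gz"
```

A count of 0, or an error from gzip, suggests that the computation was interrupted or that the named graph given with --domain is empty.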
A SPARQL endpoint must then be loaded with the computed DGS N-Triples, under the named graph http://sindice.com/analytics.
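How the summary gets loaded depends on the endpoint. As one possible sketch, if the endpoint implements the SPARQL 1.1 Graph Store HTTP Protocol, the file can be posted with curl. The function name `load_dgs` and the graph store URL are assumptions for illustration; many stores ship their own bulk loaders, which may be a better fit for large summaries.

```shell
# Hypothetical helper: POST the decompressed N-Triples to a Graph Store
# HTTP Protocol URL, targeting the named graph http://sindice.com/analytics
# (the graph IRI is percent-encoded in the query string).
load_dgs() {
  store_url=$1     # assumed graph store URL of your endpoint
  summary_file=$2  # gzip-compressed N-Triples produced by the Pipeline
  # Some older stores may expect text/plain instead of application/n-triples.
  gzip -dc "$summary_file" |
    curl --silent --show-error --fail \
      -H 'Content-Type: application/n-triples' \
      --data-binary @- \
      "$store_url?graph=http%3A%2F%2Fsindice.com%2Fanalytics"
}

# Example call (the URL is a placeholder):
# load_dgs http://url/to/graph/store "/path/to/<data>-summary.nt.gz"
```

The function returns a non-zero status when the endpoint is unreachable or answers with an HTTP error.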
To compute the DGS against a Virtuoso instance, use the CLI of the org.sindice.summary.Pipeline class with the following command line:
java -cp target/sparql-summary-assembly.jar org.sindice.summary.Pipeline \
  --type VIRTUOSO \
  --repository jdbc:virtuoso://<hostname>:1111 \
  --user UID \
  --pass PWD \
  --domain http://www.acme.org \
  --outputfile /path/to/<data>-summary.nt.gz
In the repository URL, 1111 is the Virtuoso Server port. You can find yours in the virtuoso.ini file under the [Parameters] section. UID is a valid username to connect to the Virtuoso instance at <hostname>, and PWD the associated password.
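The port lookup can be scripted. The sketch below assumes that the key is named ServerPort, as in a stock virtuoso.ini; the function name `virtuoso_port` is hypothetical. Restricting the search to the [Parameters] section matters because other sections may define a key with the same name.

```shell
# Hypothetical helper: print the ServerPort value found in the
# [Parameters] section of the given virtuoso.ini file.
virtuoso_port() {
  awk -F'=' '
    # Track whether the current section header is [Parameters].
    /^\[/ { in_params = ($0 ~ /^\[Parameters\]/); next }
    # Inside that section, print the value of the ServerPort key and stop.
    in_params && $1 ~ /^[ \t]*ServerPort[ \t]*$/ {
      gsub(/[ \t]/, "", $2); print $2; exit
    }
  ' "$1"
}

# Example call:
# virtuoso_port /path/to/virtuoso.ini
```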
Since the DGS computation is demanding, you may want to increase the query timeout of Virtuoso by changing the following parameter in the virtuoso.ini file:
MaxQueryExecutionTime=X
where X is the time in seconds. The Virtuoso server must be restarted after changing that file.
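The edit can also be done from the shell. This is a minimal sketch that assumes the file already contains a MaxQueryExecutionTime line; the function name `set_max_query_time` is hypothetical.

```shell
# Hypothetical helper: rewrite an existing MaxQueryExecutionTime line in
# the given virtuoso.ini so that it carries the new value (in seconds).
# A file without such a line is left unchanged.
set_max_query_time() {
  ini_file=$1
  seconds=$2
  tmp_file=$(mktemp) || return 1
  # Write the edited copy first, then overwrite the original in place so
  # that its ownership and permissions are preserved.
  sed "s/^[[:space:]]*MaxQueryExecutionTime[[:space:]]*=.*/MaxQueryExecutionTime=$seconds/" \
    "$ini_file" >"$tmp_file" && cat "$tmp_file" >"$ini_file"
  status=$?
  rm -f "$tmp_file"
  return "$status"
}

# Example call (usually needs write access to the file, e.g. via sudo):
# set_max_query_time /path/to/virtuoso.ini 3600
```

The new value only takes effect once the Virtuoso server has been restarted.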
Stop Tomcat before changing its configuration:
sudo service tomcat6 stop
Set the sindice.home property to the path of a folder where the webapp can write configuration files and log output.
Edit the file /etc/default/tomcat6 and update the JAVA_OPTS property:
JAVA_OPTS="-Dsindice.home=/path/to/sindice/home"
Give the tomcat6 user ownership of that folder, so that the deployed webapp can write to the sindice.home folder:
sudo chown -R tomcat6:tomcat6 /path/to/sindice/home
Edit the default XML config file of SPARQLed, recommendation-servlet/src/main/resources/default-config.xml.
You have to set the URL of the SPARQL endpoint(s) where your data and the DGS graph reside. Here we assume that everything is inside a single endpoint that is accessible through HTTP.
Information about your data is set under the proxy tag. Information about the DGS is set under the recommender tag.
<backend>HTTP</backend>
<backendArgs>http://path/to/sparql/endpoint</backendArgs>
You can use other types of endpoint such as the Sesame Native repository. To do so, set the backend as NATIVE, and set the backendArgs to the Native repository path.
The sparqled.war file is created using the following command, and is located in the directory recommendation-servlet/target/:
mvn --projects recommendation-servlet --also-make package
With the SPARQL endpoint ready and the configuration files correctly set up, we are now ready to launch the webapp!
Remove a previously deployed SPARQLed webapp, in order to avoid unexpected issues:
rm -rf /path/to/sindice/home/sparqled
rm -rf $CATALINA_BASE/webapps/sparqled*
Copy the sparqled.war file to the Tomcat webapps directory and start Tomcat:
sudo cp recommendation-servlet/target/sparqled.war $CATALINA_BASE/webapps/
sudo service tomcat6 start
If everything went well, the SPARQLed webapp is available at:
http://localhost:8080/sparqled/
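Availability can also be checked from the command line. The sketch below assumes that curl is installed; the function name `webapp_up` is hypothetical.

```shell
# Hypothetical helper: succeed only if the URL can be reached and answers
# with an HTTP status below 400 (curl --fail rejects 4xx and 5xx replies).
webapp_up() {
  curl --silent --fail --output /dev/null --max-time 5 "$1"
}

# Example call:
# webapp_up http://localhost:8080/sparqled/ && echo "SPARQLed is up"
```

Tomcat may need a few seconds to unpack and deploy the war file, so a failure right after startup is not necessarily a problem.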
You can monitor the webapp by looking at these logs:
tail -f $CATALINA_BASE/logs/catalina.out
tail -f /path/to/sindice/home/log/sparqled/sparqled.log
The page Capabilities of SPARQLed contains a full overview of what you can do.