forked from ontop/ontop
Lubm3xData
skomlaebri edited this page Nov 29, 2013
·
5 revisions
This page describes how we loaded the data generated by LUBM into Quest.
At the moment Quest uses the OWL API to load the TBox and ABoxes. This is very inefficient for large ABoxes. We need a lighter mechanism that does little parsing and can stream triples.
Solution:
- Generate all LUBM data files.
- Transform and merge all the data into a simple triple file (e.g., N-Triples).
- Create a new ABox assertion streamer that reads the file line by line with very simple parsing.
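The "line by line with very simple parsing" idea can be illustrated with a minimal shell sketch. This is not Quest's streamer; `stream_triples` is a hypothetical helper name, and it assumes one triple per line with a space before the terminating dot and no leading whitespace, which is how Jena writes N-Triples.

```shell
#!/bin/bash
# Hypothetical helper: stream N-Triples from stdin, one statement per line,
# and print subject, predicate and object separated by tabs.
stream_triples() {
  local line subject predicate object
  while IFS= read -r line || [ -n "$line" ]; do
    # Skip blank lines and comment lines.
    case "$line" in
      ''|'#'*) continue ;;
    esac
    # Subject and predicate never contain spaces; the object is the rest.
    subject=${line%% *}
    line=${line#* }
    predicate=${line%% *}
    object=${line#* }
    # Drop the " ." that terminates the statement.
    object=${object% .}
    printf '%s\t%s\t%s\n' "$subject" "$predicate" "$object"
  done
}
```

For example, `stream_triples < University0-99-clean2.nt | head` would print the first few statements without loading the whole file into memory.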
The LUBM data files are generated with the traditional LUBM data generator tool, using the command:
java -cp classes/ edu.lehigh.swat.bench.uba.Generator -univ 1000 -onto http://www.lehigh.edu/~zhp2/2004/0401/univ-bench.owl
Transforming and merging (Source)
To do this we will use Jena, in particular the command rdfcat.
- Setting up Jena. Download Jena and set up your environment as follows:
- Add the following to your .bashrc file:
export JENAROOT=~/Documents/OBDA/related_software/Jena-2.6.4
export PATH=$JENAROOT/bin:$PATH
- Execute the command:
chmod u+x $JENAROOT/bin/*
Then convert every generated file to N-Triples and append the output to a single file:
find . -type f -name "University*.owl" -exec rdfcat -out N-TRIPLE -x {} \; >> University0-99.nt
We also need to remove imports and other non-data triples with the commands:
cat University0-99.nt | grep -v http://www.w3.org/2002/07/owl#Ont > University0-99-clean.nt
cat University0-99-clean.nt | grep -v http://www.w3.org/2002/07/owl#imports > University0-99-clean2.nt
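The same filtering can also be done in a single pass with no intermediate file. The sketch below is only an illustration: `clean_nt` is a hypothetical helper name, and it uses `grep -F` so the URLs are matched as fixed strings rather than regular expressions.

```shell
#!/bin/bash
# Hypothetical helper: read N-Triples on stdin and drop every line that
# mentions an owl:Ont... term (e.g. owl:Ontology) or owl:imports.
clean_nt() {
  grep -v -F -e 'http://www.w3.org/2002/07/owl#Ont' \
             -e 'http://www.w3.org/2002/07/owl#imports'
}
```

Usage would look like `clean_nt < University0-99.nt > University0-99-clean2.nt`.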
To merge each university into a single nt file we used the following bash script:
#!/bin/bash
echo "Generating nt files"
for i in {0..99}
do
  echo "Doing uni $i"
  # ${i}_ keeps the underscore out of the variable name
  find . -type f -name "University${i}_*.owl" -exec rdfcat -out N-TRIPLE -x {} \; >> uni$i.nt
done
To clean all the files we used the following script:
#!/bin/bash
echo "Cleaning nt files"
for i in {0..99}
do
  echo "Doing uni $i"
  cat university-data-$i.nt | grep -v http://www.w3.org/2002/07/owl#Ont | grep -v http://www.w3.org/2002/07/owl#imports > university-data-$i.nt.tmp
  rm university-data-$i.nt
  mv university-data-$i.nt.tmp university-data-$i.nt
done
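After cleaning, a quick check that no header triples survived can be useful. The following is a sketch only: `is_clean` is a hypothetical helper name, and it simply looks for the two URL prefixes that the cleaning step filters out.

```shell
#!/bin/bash
# Hypothetical check: succeed (exit 0) only if the given file is readable
# and contains no ontology-header or import triples.
is_clean() {
  # Report unreadable or missing files as an error, not as "clean".
  [ -r "$1" ] || return 2
  ! grep -q -F -e 'http://www.w3.org/2002/07/owl#Ont' \
               -e 'http://www.w3.org/2002/07/owl#imports' "$1"
}
```

It could be run over the per-university files with `for f in university-data-*.nt; do is_clean "$f" || echo "check $f"; done`.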