Squirrel searches and collects Linked Data
$ make build dockerize
$ docker-compose build
$ docker-compose up
mvn clean package shade:shade -U -DskipTests
- If you have a new version of Squirrel, e.g. version 0.3.0, you can execute:
mvn install:install-file -DgroupId=org.aksw.simba -DartifactId=squirrel -Dpackaging=jar -Dversion=0.3.0 -Dfile="target\original-squirrel.jar" -DgeneratePom=true -DlocalRepositoryPath=repository
- If you want to use the Web-Components, have a look at the Dependencies in this file
docker build -t squirrel .
- Execute a .yml file with docker-compose -f <file> up/down
All yml files in the root folder crawl real, existing data portals with the help of the HtmlScraper:
- docker-compose.yml: file-sink based, without web
- docker-compose-sparql.yml: sparql-sink based (JENA), without web
- docker-compose-sparql-web.yml: sparql-sink based (JENA), with web, including the visualization of the crawled graph
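The choice between the compose files listed above can be sketched as a small shell helper. This is only an illustration: compose_file_for is a hypothetical function, not part of Squirrel, and the sink/web argument names are assumptions. The sketch prints the docker-compose command instead of running it, so it works without Docker installed.

```shell
#!/bin/sh
# Map a sink type and a web flag to one of the compose files in the root folder.
# compose_file_for is a hypothetical helper, not something Squirrel ships.
compose_file_for() {
    sink="$1"   # "file" or "sparql"
    web="$2"    # "yes" or "no"
    case "$sink:$web" in
        file:no)    echo "docker-compose.yml" ;;
        sparql:no)  echo "docker-compose-sparql.yml" ;;
        sparql:yes) echo "docker-compose-sparql-web.yml" ;;
        *)          echo "no compose file for sink=$sink web=$web" >&2; return 1 ;;
    esac
}

# Print (rather than run) the resulting command.
file="$(compose_file_for sparql yes)" && echo "docker-compose -f $file up"
```

Replace the final echo with the command itself once the printed line looks right.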
You can use a SPARQL-based triple store to store the crawled data. If you want to use it, you have to do the following:
So far, the necessary datasets in the database are not created automatically, so you have to create them by hand:
- Run Squirrel as explained above
- Enter localhost:3030 in your browser's address line
- Go to manage datasets
- Click add new dataset
- For Dataset name, enter contentset
- For Dataset type select Persistent – dataset will persist across Fuseki restarts
- Repeat from step 4 (Click add new dataset), this time with "Metadata" as the Dataset name
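The manual steps above could probably also be scripted against Fuseki's HTTP administration endpoint (POST to /$/datasets with dbName and dbType). The following is a minimal sketch under several assumptions: Fuseki is reachable at localhost:3030, the admin endpoint accepts unauthenticated requests, and dbType=tdb corresponds to the "Persistent" option in the UI. create_dataset_cmd is a hypothetical helper; it only prints the curl commands so they can be reviewed first.

```shell
#!/bin/sh
# Assumption: Fuseki listens here and needs no admin credentials.
FUSEKI_URL="${FUSEKI_URL:-http://localhost:3030}"

# Print the curl command that would create one persistent (TDB) dataset.
# create_dataset_cmd is a hypothetical helper, not part of Squirrel or Fuseki.
create_dataset_cmd() {
    printf "curl -X POST '%s/\$/datasets' --data 'dbName=%s&dbType=tdb'\n" "$FUSEKI_URL" "$1"
}

# The two dataset names used in the manual steps.
for name in contentset Metadata; do
    create_dataset_cmd "$name"
done
```

If the printed commands look right for your setup, they can be run by piping the script's output to sh.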
The Squirrel-Webservice and the SquirrelWebObject are now included in this project, which makes it a multi-module Maven project. For that reason, there are two pom.xml files in the root layer:
- pom.xml: the module bundle POM. If you execute mvn clean package, this file is used. As a consequence, all submodules, including squirrel, are compiled and packaged.
- squirrel-pom.xml: the POM for squirrel
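As a rough illustration of the two POM files, Maven's -f (--file) option selects a POM other than the default pom.xml. Whether squirrel-pom.xml builds cleanly on its own is an assumption here, and build_cmd is a hypothetical helper that only prints the command for each target.

```shell
#!/bin/sh
# build_cmd is a hypothetical helper; it prints the Maven command, it does not run it.
build_cmd() {
    case "$1" in
        # Default pom.xml: builds all modules.
        all)      echo "mvn clean package" ;;
        # Assumption: squirrel-pom.xml can be built standalone via -f.
        squirrel) echo "mvn -f squirrel-pom.xml clean package" ;;
        *)        echo "unknown target: $1" >&2; return 1 ;;
    esac
}

build_cmd all
build_cmd squirrel
```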
If you want to run Squirrel with the Webservice, make sure that you already have the current Webservice Docker image. If not, execute:
- mvn clean package (only necessary if you want to compile each subproject (module) for itself)
- (SquirrelWebObject\install.bat) SquirrelWebService\buildImage.bat