Triplestore indexer #28

Merged: 9 commits, Feb 9, 2017

@dannylamb dannylamb commented Feb 1, 2017

Islandora/documentation#394

This adds a small camel connector to index RDF from CLAW into a triplestore.

To Test

Assuming you're starting from scratch here (it's easier than getting lost in Karaf)

  • Pull islandora-deprecated/claw_vagrant#8 ("Install changes for triplestore indexer") into your claw_vagrant repo.
    • $ cd /path/to/claw_vagrant
    • $ git pull https://github.com/dannylamb/claw_vagrant triplestore-indexer
  • Update alpaca.sh to say git clone -b triplestsore-indexer https://github.com/dannylamb/Alpaca.git instead of cloning from origin/master. Be mindful of the spelling of that branch name: I derped when pushing it up to GitHub.
  • vagrant up
  • When it's finished, uninstall the islandora module (you will have to disable islandora_collection too)
  • Log into the box with vagrant ssh
  • Pull in the changes from Islandora/documentation#34 ("Don't re-add drupal to document root on a re-provisioning"), which will enable GET requests on FedoraResource entities.
    • $ cd /var/www/html/drupal/web/modules/contrib/islandora
    • $ git pull https://github.com/dannylamb/islandora.git triplestore-indexer
  • Then re-install islandora and islandora_collection modules.

Once installed:

  • Visit http://localhost:8000/admin/content/fedora_resource and create a new Fedora resource
    screenshot from 2017-02-01 14-50-42
  • Visit http://localhost:8080/bigdata/#query and issue a select ?s ?p ?o where { ?s ?p ?o } query to get all triples. You should see the triples from the rdf mapping for the bundle you created.
    screenshot from 2017-02-01 14-51-02
  • Visit your resource in Drupal and update it
    screenshot from 2017-02-01 14-51-37
  • Go back to the triplestore and run the query again. At a minimum, modified dates and vclock should have updated.
    screenshot from 2017-02-01 14-51-53
  • Visit your resource in Drupal again, and delete it.
    screenshot from 2017-02-01 14-52-12
  • Back in the triplestore, the triples should be gone and the query will return no results.
    screenshot from 2017-02-01 14-52-30
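
The triplestore checks above can also be scripted instead of clicked through. The following is a minimal sketch, not part of this PR. It assumes the triplestore exposes a SPARQL endpoint under Blazegraph's default "kb" namespace; the steps above only give the workbench URL, so adjust SPARQL_ENDPOINT to your install. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Blazegraph's default "kb" namespace. The thread only mentions
# the workbench at http://localhost:8080/bigdata/#query.
SPARQL_ENDPOINT = "http://localhost:8080/bigdata/namespace/kb/sparql"

# The same query used in the manual test steps.
ALL_TRIPLES = "select ?s ?p ?o where { ?s ?p ?o }"


def build_request(query, endpoint=SPARQL_ENDPOINT):
    """Build a GET request that asks for SPARQL JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_triples(payload):
    """Turn a SPARQL JSON results document into (s, p, o) tuples."""
    bindings = json.loads(payload)["results"]["bindings"]
    return [(b["s"]["value"], b["p"]["value"], b["o"]["value"]) for b in bindings]


def fetch_triples(query=ALL_TRIPLES, endpoint=SPARQL_ENDPOINT):
    """Run the query against a live triplestore (the vagrant box must be up)."""
    with urllib.request.urlopen(build_request(query, endpoint)) as response:
        return parse_triples(response.read().decode("utf-8"))
```

With the box running, fetch_triples() should return a non-empty list after creating a resource and an empty list after deleting it, assuming nothing else has been indexed.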

@DiegoPino DiegoPino left a comment

@dannylamb
Contributor Author

@DiegoPino I have updated the install scripts to rely on the features.xml provided by Alpaca. We no longer have to add the feature repos for camel and activemq by hand. islandora-deprecated/claw_vagrant#8

@dannylamb
Contributor Author

@Islandora-CLAW/committers @Natkeeran This is ready for review, with install instructions. These changes cover three PRs, so if something's not clear and you're having problems with installing it, tag me.

@whikloj
Member

whikloj commented Feb 7, 2017

I'm doing it the hard way, I'm building Alpaca inside and replacing the islandora-indexer-triplestore feature. We'll see if I can do it without getting lost.

@whikloj
Member

whikloj commented Feb 8, 2017

Ok that didn't do anything, I'll try by the instructions here.

@dannylamb
Contributor Author

dannylamb commented Feb 8, 2017

For the record, you'd have to

Basically, you're resetting karaf, getting rid of any old config, and then re-installing the recompiled Alpaca. Afterwards, you gotta re-touch up the config.

@DiegoPino DiegoPino left a comment

property placeholders in blueprint should load their values from the provided islandora.alpaca.indexing.triplestore.cfg file. Does not seem to be a good idea to have settings/URL hardcoded in blueprint properties if we are already providing a .cfg file for this.

@@ -8,18 +8,23 @@
http://camel.apache.org/schema/blueprint http://camel.apache.org/schema/blueprint/camel-blueprint.xsd">

<!-- OSGI blueprint property placeholder -->
-<cm:property-placeholder id="properties" persistent-id="ca.islandora.indexing.triplestore" update-strategy="reload" />
+<cm:property-placeholder id="properties" persistent-id="ca.islandora.alpaca.indexing.triplestore" update-strategy="reload" >

Since we have already an islandora.alpaca.indexing.triplestore.cfg file that defines these properties' values, we should load them from here instead of hardcoding. See http://camel.apache.org/properties.html#Properties-Usinga.cfgor.propertiesFileForBlueprintPropertyPlaceholders

Contributor Author

@dannylamb dannylamb Feb 8, 2017

@DiegoPino The properties placeholder element in blueprint.xml is what loads the config file, and the 'hardcoded' values you're referring to are the defaults in case someone modifies a config file and removes an entry. It also is what bridges the gap and allows me to refer to values from config within routes, and not just in the blueprint file.
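
As a rough analogy only (this is not how Blueprint implements it), the fallback behaviour described above amounts to overlaying the config file's entries on a set of defaults, so a key removed from the file falls back to its default. The property names and values in this sketch are hypothetical and are not taken from the PR.

```python
def parse_cfg(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve(defaults, cfg_text):
    """Overlay entries from the config file on top of the defaults."""
    merged = dict(defaults)  # copy so the defaults themselves stay untouched
    merged.update(parse_cfg(cfg_text))
    return merged
```

For example, resolve({"example.url": "http://localhost/default"}, "") keeps the default, while a cfg text containing example.url=http://example.org overrides it.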

http://camel.apache.org/schema/blueprint http://camel.apache.org/schema/blueprint/camel-blueprint.xsd">

<!-- OSGI blueprint property placeholder -->
<cm:property-placeholder id="properties" persistent-id="ca.islandora.indexing.triplestore" update-strategy="reload" >

@whikloj
Member

whikloj commented Feb 8, 2017

@dannylamb Trying a vagrant up with alpaca.sh checking out your branch threw an error.

==> default: FAILURE: Build failed with an exception.
==> default: * What went wrong:
==> default: Execution failed for task ':islandora-indexing-triplestore:install'.
==> default: > Could not publish configuration 'archives'
==> default:    > Cannot publish artifact 'ca.islandora.indexing.triplestore-configuration.cfg (ca.islandora.alpaca:islandora-indexing-triplestore:0.1.1-SNAPSHOT)' (/home/ubuntu/Alpaca/islandora-indexing-triplestore/build/cfg/main/ca.islandora.indexing.triplestore.cfg) as it does not exist.
==> default: 
==> default: 
==> default: * Try:
==> default: Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.

@dannylamb
Contributor Author

@whikloj ok, looks like that config file needs a .alpaca. in the middle of it. Moment.

@DiegoPino

@dannylamb ok, I got the same issue as @whikloj . Naming for id of placeholder was not the same on both and differed from the actual config file, so I kinda extrapolated to whole stuff (wrongly). If you fix that I'm 👍

@dannylamb
Contributor Author

@whikloj @DiegoPino My last test properly deployed the config file, and this PR addresses that here. Is your Alpaca on your vagrant box cloned from my branch and repo?

Update alpaca.sh to say git clone -b triplestsore-indexer https://github.com/dannylamb/Alpaca.git instead of cloning from origin/master.

@DiegoPino

DiegoPino commented Feb 8, 2017

@dannylamb yeah, that is the correct file:

<configfile finalname="/etc/ca.islandora.alpaca.indexing.triplestore.cfg">mvn:ca.islandora.alpaca/islandora-indexing-triplestore/${project.version}/cfg/configuration</configfile>

But the tests are referencing at
https://github.com/Islandora-CLAW/Alpaca/pull/28/files/2a851f9773e789d287bdeb219889e8204c4e8770#diff-6c7fc65ffa4096d22a789cff61e01f56R11,

<cm:property-placeholder id="properties" persistent-id="ca.islandora.indexing.triplestore" update-strategy="reload" >

I can be getting this all wrong of course so please excuse me if so.
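
The mismatch being discussed can be checked mechanically. Karaf maps a file named etc/<pid>.cfg to the configuration PID <pid>, so the placeholder's persistent-id needs to equal the deployed file name minus its .cfg extension. The sketch below uses plain regular expressions rather than a real XML parser, and its function names are hypothetical.

```python
import posixpath
import re


def persistent_id(blueprint_xml):
    """Extract the persistent-id attribute from a blueprint snippet."""
    match = re.search(r'persistent-id="([^"]+)"', blueprint_xml)
    return match.group(1) if match else None


def pid_from_configfile(features_xml):
    """Derive the PID implied by a <configfile finalname="..."> entry."""
    match = re.search(r'finalname="([^"]+)"', features_xml)
    if not match:
        return None
    name = posixpath.basename(match.group(1))
    return name[: -len(".cfg")] if name.endswith(".cfg") else name


def ids_match(blueprint_xml, features_xml):
    """True when the blueprint reads the PID that the feature deploys."""
    pid = persistent_id(blueprint_xml)
    return pid is not None and pid == pid_from_configfile(features_xml)
```

Fed the two snippets quoted above, ids_match returns False: the feature deploys ca.islandora.alpaca.indexing.triplestore.cfg while the blueprint asks for ca.islandora.indexing.triplestore.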

@whikloj
Member

whikloj commented Feb 8, 2017

Yeah I also see the correct file path in the karaf features.xml.

@whikloj
Member

whikloj commented Feb 8, 2017

@ruebot
Member

ruebot commented Feb 8, 2017

vagrant up

@whikloj
Member

whikloj commented Feb 8, 2017

Still doesn't work, removed and tried reloading inside vagrant. Still didn't work.

@dannylamb
Contributor Author

@whikloj I haven't pushed the changes up yet. Give me a minute.

@dannylamb
Contributor Author

I'm doing another fresh vagrant up on my end to verify before setting folks loose on this again.

@dannylamb
Contributor Author

And another one. vagrant uping again.

@ruebot
Member

ruebot commented Feb 8, 2017

@dannylamb build came up clean for me now. I'll proceed with testing further.

@Natkeeran
Contributor

Natkeeran commented Feb 8, 2017

@dannylamb I'll give it a try as well.

Followed the steps. It does not index it for me! I cannot see any triples related to the resource that was created.

@ruebot
Member

ruebot commented Feb 8, 2017

All test procedures work for me. I'm comfortable merging.

@whikloj
Member

whikloj commented Feb 9, 2017

I see objects created, I see JSON-LD REST responses, but nothing in the triplestore.

Karaf says

2017-02-09 15:14:52,807 | ERROR | ing-triplestore] | TriplestoreIndexer               | 187 - ca.islandora.alpaca.islandora-indexing-triplestore - 0.1.1.SNAPSHOT | Error indexing http://localhost:8000/fedora_resource/1 in triplestore: HTTP operation failed invoking http://localhost:8000/fedora_resource/1?_format=jsonld with statusCode: 403
org.apache.camel.http.common.HttpOperationFailedException: HTTP operation failed invoking http://localhost:8000/fedora_resource/1?_format=jsonld with statusCode: 403
        at org.apache.camel.component.http4.HttpProducer.populateHttpOperationFailedException(HttpProducer.java:284)
        at org.apache.camel.component.http4.HttpProducer.process(HttpProducer.java:192)
...

Do I need to make REST requests available to all users?
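
One way to narrow down a 403 like this is to repeat the request from the log outside Karaf. The sketch below sends an unauthenticated GET, which may or may not match how the indexer itself authenticates, so treat the result as a hint rather than a diagnosis. The function names are hypothetical.

```python
import urllib.error
import urllib.parse
import urllib.request


def jsonld_url(resource_url):
    """Append _format=jsonld, matching the URL shown in the Karaf log."""
    parts = urllib.parse.urlsplit(resource_url)
    query = urllib.parse.parse_qsl(parts.query) + [("_format", "jsonld")]
    return urllib.parse.urlunsplit(
        parts._replace(query=urllib.parse.urlencode(query))
    )


def status_of(url):
    """Return the HTTP status of an anonymous GET, including error statuses."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises on 4xx/5xx; the status code is still what we want.
        return error.code
```

With the box up, status_of(jsonld_url("http://localhost:8000/fedora_resource/1")) shows whether an anonymous client gets the same 403.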

@whikloj
Member

whikloj commented Feb 9, 2017

Oh wait the configuration changes did not take effect. I might have this.

@whikloj
Member

whikloj commented Feb 9, 2017

Ok, following the instructions, this PR, islandora-deprecated/claw_vagrant#8 and Islandora/documentation#34 all work.

I'm merging all 3. 👍

@whikloj whikloj merged commit 6c38bc4 into Islandora:master Feb 9, 2017