Add SkohubProvider #29
Comments
It should be pretty easy to implement. I haven't figured out how to get a machine-readable form of the list of schemes (on the web at https://skohub.io/hbz/vocabs-edu/heads/master/, maybe @acka47 can help), but all the other data is available as JSON-LD. Examples:
The format is almost JSKOS already, with some differences:
If I understand correctly, this is not an API, but rather flat files. So we can't use things like |
I am not sure what a Cocoda provider actually is but I assume that this issue is (at least in part) about loading a complete SKOS scheme from SkoHub Vocabs. SkoHub Vocabs supports both hash and slash URIs and a vocabulary using hash URIs can be directly loaded from one URL, see as an example https://skohub.io/acka47/nwbib-spatial/heads/master/nwbib.de/spatial.json (this is only for testing, the canonical version of the concept scheme resides at https://nwbib.de/spatial). |
@acka47 My question was rather whether it is possible to get the list of all available schemes (for example https://skohub.io/hbz/vocabs-edu/heads/master/) in a machine-readable format / JSON-LD. |
@stefandesu This is currently not possible. I am not sure about your goal yet: Do you want to get all vocabularies for a git repo that is connected to SkoHub Vocabs (like https://skohub.io/hbz/vocabs-edu/heads/master/) or would you even like to have a list for the whole SkoHub Vocabs instance (here: https://skohub.io)? |
@acka47 Thanks for your reply! It seems like I hadn't fully grasped SkoHub Vocabs yet. So there are different SkoHub Vocabs instances (like https://skohub.io), each instance has a number of connected repos (like https://skohub.io/hbz/vocabs-edu/heads/master/ or https://skohub.io/acka47/nwbib-spatial/heads/master/), and each of those repos can contain multiple vocabularies. In this case, I think getting all possible vocabularies for all repos in a SkoHub Vocabs instance would be overkill. For a start, it should be sufficient if a list of vocabularies has to be provided as well when configuring access to SkoHub Vocabs. 👍 We already have a configuration property for this, and it is also necessary for Skosmos access, for example. |
Right now, afaik, there are two instances running on a server: skohub.io and https://vocabs.openeduhub.de/ but there also is the option to easily run SkoHub Vocabs with a single connected repo using Docker and GitHub pages (see 3 & 4 in our SWIB20 workshop) so that anybody with a GitHub account can easily set up an instance.
Right
Does this mean that we don't need to add anything to SkoHub Vocabs for now? Anyway, I've opened an issue for adding structured data to the vocabulary list for a repo: skohub-io/skohub-vocabs#110 What kind of vocabulary list do you require for configuring access in Cocoda? Can you point me to a spec? |
Yes, no need to add anything for now, and thanks!
So this repo offers uniform access (= comparable methods and data in JSKOS format) to different sources of data. For example, we have a wrapper for Skosmos so that we can access Skosmos instances. This particular issue is about adding a wrapper for SkoHub Vocabs. As soon as that wrapper is available, configuration could be as easy as a JSON object like this:
{
  "provider": "SkoHubVocabs",
  "api": "https://skohub.io/hbz/vocabs-edu/heads/master/",
  "schemes": [
    { "uri": "https://w3id.org/class/esc/scheme" }
  ]
}
I guess if you're interested, you could look at the docs for I would also suggest to put this provider/wrapper into a separate module instead of including it with cocoda-sdk. |
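For illustration, here is a minimal Python sketch of how a wrapper might turn the configuration object above into flat-file URLs. It assumes the URL pattern that the examples in this thread suggest (repo base URL, then host and path of the scheme URI, then `.json`); the helper name `scheme_json_url` is hypothetical and not part of cocoda-sdk or SkoHub Vocabs.

```python
from urllib.parse import urlsplit

def scheme_json_url(api_base: str, scheme_uri: str) -> str:
    """Guess the flat-file URL for a scheme (assumed pattern, see above)."""
    parts = urlsplit(scheme_uri)
    # Keep only host and path of the scheme URI; protocol and fragment are dropped.
    return f"{api_base.rstrip('/')}/{parts.netloc}{parts.path}.json"

# The configuration object from the comment above.
config = {
    "provider": "SkoHubVocabs",
    "api": "https://skohub.io/hbz/vocabs-edu/heads/master/",
    "schemes": [
        {"uri": "https://w3id.org/class/esc/scheme"},
    ],
}

urls = [scheme_json_url(config["api"], s["uri"]) for s in config["schemes"]]
print(urls)
```

Since the list of schemes is given explicitly, no machine-readable index of the repo is needed for this step.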
If the vocabulary is reasonably small, we could also load the full JSON-LD file. Skohub-vocabs seems (I've not found the specific part of its code) to generate a https://skohub.io/dini-ag-kim/hcrt/heads/master/w3id.org/kim/hcrt/scheme.json for the vocabulary http://bartoc.org/en/node/20057 (I've added this URL as API in BARTOC). The JSON-LD is structured with a context document defined here. I'd prefer not to convert to RDF and back to JSON-LD but to reuse the existing JSON; this requires skohub-vocabs to not change the context document in a way that's not backwards compatible. |
This is only the description of the scheme with a list of the topConcepts and their labels but there is a lot missing:
To get this information, you'll have to dereference each concept. Note that this only holds for SKOS vocabularies with slash URIs. As one would expect, you will find all the information for all concepts in one file if the vocab uses hash URIs, see e.g. https://skohub.io/acka47/nwbib-spatial/heads/master/nwbib.de/spatial.json (which is a fork for testing purposes of the NWBib spatial classification that uses hash URIs). |
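The difference between hash and slash URIs can be sketched in a few lines of Python. This is only an illustration of the general principle that a fragment is never sent to the server, so all concepts of a hash-URI vocabulary share one document; the `example.org` URIs and the helper names are hypothetical.

```python
from urllib.parse import urldefrag

def document_for(concept_uri: str) -> str:
    """Return the URI of the document that has to be fetched for a concept."""
    url, _fragment = urldefrag(concept_uri)  # strips "#..." if present
    return url

def group_by_document(concept_uris):
    """Group concept URIs by the document they would be loaded from."""
    groups = {}
    for uri in concept_uris:
        groups.setdefault(document_for(uri), []).append(uri)
    return groups

# Hypothetical vocabularies, one per URI style.
hash_uris = ["https://example.org/vocab#a", "https://example.org/vocab#b"]
slash_uris = ["https://example.org/vocab/a", "https://example.org/vocab/b"]

print(len(group_by_document(hash_uris)))   # one shared document
print(len(group_by_document(slash_uris)))  # one document per concept
```

A provider for slash URIs therefore needs one request per concept, while a hash-URI vocabulary could be loaded in a single request and queried locally.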
Probably, we will use an external context at some point but its content will be backwards compatible. |
Ok, then we have vocabulary information and top concepts ( |
As you are referring to Actually, I think it should be considered best practice (although we haven't done it in the past for some vocabs) not to use |
Well then I'm confused
|
Sorry I did confuse you but I just wanted to make clear that not every
Dereference the schema URI with
Add a
Dereference the schema URI with |
Ok, a first draft of Skohub provider without search is at the skohub branch. |
Skohub integration is now ready to be tested in Cocoda Dev! I've included the following vocabularies so far:
Note that vocabularies that use hash URIs are not yet supported. |
To-Dos:
|
Thanks for moving this forward. I just want to let you know that @awagner-mainz has started working on a module that adds a reconciliation endpoint to a vocab published with SkoHub: https://github.com/mpilhlt/skohub-reconcile Furthermore, at hbz, we are in the process of hiring a SkoHub developer who will, amongst other things, help move the reconciliation module forward in 2023/24. This basically means that Cocoda will be able to connect to SkoHub Vocabs via an API that is already supported by Cocoda (Reconciliation API), at least when people have configured their vocab to include the endpoint, which will make things a lot easier, I guess. |
As far as I can tell, I fixed all known issues and everything should work as expected now. (For the three listed vocabularies above.) I've also added everything that's needed to include them via BARTOC (tested only locally so far, but I don't see a reason why it shouldn't work with our main instance). The only thing that is missing is support for vocabularies that use hash URIs. As far as I can see, that's going to be a very different implementation from what we have now. (It's probably a simple implementation, but we might as well create a whole separate provider for it instead of having two totally different code paths in our one provider. 🤔) @nichtich: Can we postpone hash URI support? @acka47: Are there many vocabularies that use hash URIs? Do you have other ones apart from https://nwbib.de/spatial? |
Works great, thanks! I've added Skohub URIs of kdsf-ffk, esc, and isced-2013 to BARTOC so support of the vocabularies will be configured there. As far as I understand, the Skohub "API" base URI is equal to the vocabulary URI, right?
Yes. Given the issue with hash URIs and the upcoming development of SkoHub, I'd mark support of Skohub in cocoda-sdk as "experimental". The search functionality works but it is more of a hack as it plugs into an internal implementation of Skohub. A stable reconciliation API would be better in the long term. |
Yes, I will document that somewhere. This is required because we're de-referencing the URIs, so we need to know which of the URIs is the Skohub vocabulary URI. |
Currently not, but only some legacy vocabs. Re. nwbib-spatial, there also exists an always up-to-date turtle file with the whole vocab at https://nwbib.de/spatial.ttl. In case indexing a SKOS file is a better way to integrate a vocab... |
Picking up on this as I recently got separate requests from two people (@bokahama & @timtomch) who would like to use Cocoda on SKOS Vocabs published with SkoHub. Could they use the "experimental" version for this? With regard to a stable reconciliation API, @sroertgen is currently working on it, see skohub-io/skohub-reconcile#11, but I think it will be a few weeks before this can be used in production. We will discuss whether we could use one of these as first use cases for the endpoint. |
Sure! I mean it is already available in the release version of Cocoda, so there's nothing different about this compared to "stable" providers, only that there might still be bugs or missing features (in particular missing support for hash URIs). Some more notes here: https://gbv.github.io/cocoda-sdk/SkohubProvider.html @bokahama and @timtomch, please let me know here if you are running into issues. 👍 |
Skohub is a static site generator for SKOS vocabularies. Its result can be browsed in machine-readable form (see https://blog.lobid.org/2019/09/27/presenting-skohub-vocabs.html) and could be wrapped as a Provider.