Add SKOS lookup function in Fix #415
Comments
In today's meeting we decided to:
Further considerations:
|
As required, here is my use case. In RPB data we only have notations for RPB subject, e.g. I can create the correct concept URI with Fix, resulting in:
{
"subject":[
{
"id":"http://purl.org/lobid/rpb#n584060",
"label":"Platzhalter Schlagwortlabel",
"type":[
"Concept"
],
"source":{
"id":"http://purl.org/lobid/rpb",
"label":"Systematik der Rheinland-Pfälzischen Bibliographie"
}
},
{
"id":"http://purl.org/lobid/rpb#n584070",
"label":"Platzhalter Schlagwortlabel",
"type":[
"Concept"
],
"source":{
"id":"http://purl.org/lobid/rpb",
"label":"Systematik der Rheinland-Pfälzischen Bibliographie"
}
}
]
}
As you can see, for the label I added a generic "Platzhalter Schlagwortlabel" for now, as I cannot (yet) look up labels in a SKOS file. In the future, I'd be happy to do something like this in the Fix:
Where I basically specify what content should be added to the new "label" field by indicating:
|
In general, I think we should implement this like the existing lookup, so something like:
|
Are you sure that |
My view would be that essentially, we want to support one additional file format, TTL, in addition to CSV and TSV. Since we'd probably implement this based on an RDF model anyway, we might as well support other SKOS RDF serializations (though I'm not even sure I like that idea, I'd prefer to stick to actual use cases, and we use TTL files). But a generic RDF lookup would be quite a different thing. For that, something like |
In principle, yes, but But I'm unsure myself. I just think we might regret it if we overwhelmed [Come to think of it, maybe we shouldn't even have added local maps to it. |
I agree with @blackwinter: while it's in principle possible to make a Map out of RDF files, it may get complicated. And since there are Further considerations, it may be better to go with an RDF store from the beginning. |
I don't think it helps to talk about RDF here. Spreadsheets are also much more powerful than simple dictionary lookups, yet we don't have generic spreadsheet support, we only use TSV or CSV files as simple dictionaries. Same is our plan for SKOS as I understand it: we want to use it as a simple dictionary. |
Hm, but if you look at the scenarios @TobiasNx provided, these are not simple dictionaries? I mean, yeah, you can somehow break everything down to key-value structures, but they may not fit all purposes, e.g. "give me A, but A shall not have B and must be of Concept C". See also Semantic Reasoner. I mean, it's about |
No problem with RDF, and I even imagined implementing this based on an RDF model, using Jena. My point is how this will be used. I think it should provide a simple way to look up values in a SKOS-TTL instead of a TSV or CSV. It should not require dealing with RDF concepts. Something like Another option from my point of view would be to add support for reading RDF data in Metafacture. We could then write a small 'preprocessing' workflow that transforms the RDF data into a lookup TSV and use that, instead of adding lookup support for SKOS. |
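The 'preprocessing' option mentioned above could be sketched roughly as follows. This is only an illustration in Python, not Metafacture code: the tuples are a hand-written, hypothetical stand-in for a parsed SKOS TTL file (a real workflow would read them with an RDF library such as Jena), and `skos_to_tsv` is a made-up helper name.

```python
import csv
import io

# Hypothetical stand-in for a parsed SKOS file:
# (subject, predicate, object, language) tuples.
SAMPLE_TRIPLES = [
    ("https://w3id.org/kim/hochschulfaechersystematik/n4",
     "prefLabel", "Mathematik, Naturwissenschaften", "de"),
    ("https://w3id.org/kim/hochschulfaechersystematik/n4",
     "prefLabel", "Mathematics, Natural Sciences", "en"),
]


def skos_to_tsv(triples, predicate, language):
    """Flatten one predicate/language pair into 'id<TAB>label' lines."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for subject, pred, obj, lang in triples:
        # Keep only the labels that belong in this particular lookup table.
        if pred == predicate and lang == language:
            writer.writerow([subject, obj])
    return out.getvalue()


if __name__ == "__main__":
    print(skos_to_tsv(SAMPLE_TRIPLES, "prefLabel", "de"), end="")
```

The resulting TSV could then be used with the existing lookup, at the cost of one generated file per predicate and language.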
I want to point out one advantage of a genuine SKOS lookup: we can use one In OERSI we have e.g.:
with some kind of SKOS-lookup this could be:
|
I agree with this statement as long as you say "one ConceptScheme" instead of "one ttl-file". As noted before, even with SkoHub Vocabs one Concept scheme can be spread over many files, which totally makes sense when you have a big vocab. |
I have updated the initial post so that the functions are Fix now: I also gave an idea of the function I had in mind:
|
Wouldn't it make sense to use And are language tags required in SKOS? Even if they were, I think having a default like 'If there is only one language, use that if no target language is given' would be nice. |
Language tags in SKOS are optional:
source: https://www.w3.org/TR/2009/NOTE-skos-primer-20090818/#seclabel |
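The default suggested above ("if there is only one language, use that if no target language is given") could look roughly like this. It is a minimal sketch under assumptions: `pick_label` is a hypothetical helper, and the labels of one concept are assumed to be available as `(text, language)` pairs, with `None` standing for a label without a language tag.

```python
def pick_label(labels, target_language=None):
    """Choose a label from (text, language) pairs of a single concept.

    Returns None when nothing matches or when the choice is ambiguous.
    """
    if target_language is not None:
        # Explicit request: only a label in that language is acceptable.
        for text, lang in labels:
            if lang == target_language:
                return text
        return None
    # No language requested: fall back only if there is exactly one language
    # (an untagged label counts as its own "language", None).
    languages = {lang for _, lang in labels}
    if len(languages) == 1:
        return labels[0][0]
    return None


if __name__ == "__main__":
    only_german = [("Mathematik, Naturwissenschaften", "de")]
    print(pick_label(only_german))
```

Returning nothing in the ambiguous case is just one possible choice; raising an error or requiring the parameter would be equally defensible.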
@sroertgen as you are here: I know you extensively use SKOS files for normalizing data in an ETL process. Are your use cases addressed in this issue or do you see something we should keep in mind? |
Yes, that is pretty much what we did in WLO. We used |
Works like fix function 'lookup', also using a Map. The Map is built dynamically by querying an RDF model.
Implementation against further tests from metafacture/metafacture-core#415 (comment).
- adapt some falsely Fix
- reuse test file "hcrt.ttl"
- one test tagged as "todo" because it needs introduction of new parameter
- reformat hcrt.ttl
- enable integration test
- add test
See metafacture/metafacture-core#415.
@TobiasNx I've added an optional parameter select, which takes "subject" or "object" as value. See your lookupRdfDefinedPropertyToSubject/test.fix how to use it. |
@dr0i you still cannot differentiate between different objects, right? |
What do you mean? Can you add a new scenario ? |
No, I did not look properly and perhaps I do not understand the option. The use case was the old
Incoming Element other scenario
Incoming Element |
In the diff of the last commit (metafacture/metafacture-fix@765c224) I gave an example:
|
Enabled Starting documentation here: mandatory: O) the
(getting S) the
P) another language version (
Following the optional parameters (a likely redundant explanation - it's already noted in the mandatory section): optional |
In the Destatis-Fächerklassifikation Vocab there are now English prefLabels, and in order to add them with metamorph/fix we need to use a separate mapping file for each language to get the prefLabels we want, like https://gitlab.com/oersi/oersi-etl/-/blob/master/data/maps/subject-labels.tsv
For an English version we would need an additional list that would have to be maintained.
But since we have SkoHub Vocabs/SKOS ttl files, it would be nice to use them for the lookup so that we do not need to create and update additional lists.
For the lookup, a ttl file should be the target, e.g.: https://github.com/dini-ag-kim/hochschulfaechersystematik/blob/master/hochschulfaechersystematik.ttl
(Other SKOS serializations could follow)
Something like the following mock code would be nice:
Idea for Fix function:
file= could be a URL or a local file,
match= is id by default,
match= and matchLanguage= are optional,
target= and targetLanguage= are always needed.
Use case 1:
Find matching subject and return object of targeted predicate.
in:
https://w3id.org/kim/hochschulfaechersystematik/n4
skos_lookup("path", file="https://raw.githubusercontent.com/dini-ag-kim/hochschulfaechersystematik/master/hochschulfaechersystematik.ttl", target="prefLabel", targetLanguage="de")
out:
Mathematik, Naturwissenschaften
Use case 2:
Find matching object value in selected predicate and return its subject.
in:
Mathematics, Natural Sciences
skos_lookup("path", file="https://raw.githubusercontent.com/dini-ag-kim/hochschulfaechersystematik/master/hochschulfaechersystematik.ttl",match="prefLabel", matchLanguage="en", target="id")
out:
https://w3id.org/kim/hochschulfaechersystematik/n4
Use case 3:
Find matching object value in selected predicate and return object of targeted and connected predicate.
This could also be interesting if we have SKOS files with hiddenLabels or altLabels.
in:
Mathematics, Natural Sciences
skos_lookup("path", file="https://raw.githubusercontent.com/dini-ag-kim/hochschulfaechersystematik/master/hochschulfaechersystematik.ttl", match="prefLabel", matchLanguage="en", target="prefLabel", targetLanguage="de")
out:
Mathematik, Naturwissenschaften
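To make the intended semantics of the three use cases concrete, here is a rough sketch in Python. It is not the Fix implementation: the tuples are a hand-written, hypothetical stand-in for the parsed ttl file (containing only the two labels quoted above), and the Python function merely mirrors the parameter names of the mock code.

```python
N4 = "https://w3id.org/kim/hochschulfaechersystematik/n4"

# Hypothetical stand-in for the parsed SKOS file:
# (subject, predicate, object, language) tuples.
TRIPLES = [
    (N4, "prefLabel", "Mathematik, Naturwissenschaften", "de"),
    (N4, "prefLabel", "Mathematics, Natural Sciences", "en"),
]


def skos_lookup(value, triples, target, target_language=None,
                match="id", match_language=None):
    """Return the first target value for the concept matched by value, else None."""
    if match == "id":
        # Default: the incoming value is the concept URI itself.
        subjects = [value]
    else:
        # Otherwise find the concepts whose `match` predicate has this value.
        subjects = [s for s, p, o, lang in triples
                    if p == match and o == value
                    and (match_language is None or lang == match_language)]
    for subject in subjects:
        if target == "id":
            return subject
        for s, p, o, lang in triples:
            if (s == subject and p == target
                    and (target_language is None or lang == target_language)):
                return o
    return None


if __name__ == "__main__":
    # Use case 1: id -> German prefLabel
    print(skos_lookup(N4, TRIPLES, target="prefLabel", target_language="de"))
    # Use case 2: English prefLabel -> id
    print(skos_lookup("Mathematics, Natural Sciences", TRIPLES,
                      match="prefLabel", match_language="en", target="id"))
    # Use case 3: English prefLabel -> German prefLabel
    print(skos_lookup("Mathematics, Natural Sciences", TRIPLES,
                      match="prefLabel", match_language="en",
                      target="prefLabel", target_language="de"))
```

In this sketch an unmatched value simply yields nothing; how the real function should behave in that case (keep, drop, or default) is left open here.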
Code review: @fsteeg
Functional review: @TobiasNx @acka47