T2: Export of controlled vocabulary restrictions #228

Open
rousso opened this issue Oct 11, 2024 · 1 comment
@rousso
Contributor

rousso commented Oct 11, 2024

This task aims to enforce the use of controlled vocabularies within the Semantic Data Specification, ensuring that all enumerations and associated classes in the Conceptual Model are accurately represented in OWL and SHACL shapes, thereby maintaining data consistency and integrity.
The implementer is expected to:

  • Identify suitable representations for vocabulary restrictions: Define how these vocabulary restrictions are modelled in both SHACL and OWL. Provide examples of alternatives, such as sh:in to enumerate the allowed values or a constraint on skos:inScheme to require membership in a SKOS concept scheme.
  • Implement export restrictions on vocabularies: Implement the export of controlled vocabulary restrictions into the OWL and SHACL artefacts using the defined representations.
  • Demonstrate restrictions on vocabularies in practice: Create a Jupyter notebook that shows how the vocabulary restrictions are used while validating data or proposing values for a property.
  • Document controlled vocabulary validation strategies: Develop and provide examples of RDF validation approaches that enforce controlled vocabulary restrictions. The documentation should consider edge cases, such as deprecated vocabulary terms, SKOS statuses (i.e. euvoc:status other than at:CURRENT) and hierarchical term relations (broader/narrower terms), and discuss the advantages and disadvantages of each approach in the context of ePO and the use of Named Authority Lists (NAL).
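
As an illustration of the status edge case mentioned in the last item, the following is a minimal sketch in plain Python, not a SHACL engine and not part of model2owl. It flags values whose euvoc:status differs from at:CURRENT. The triples, the `ex:` names, the `at:DEPRECATED` status and the function name `non_current_values` are hypothetical, and treating a value without any status as acceptable is an assumption made only for this sketch.

```python
# Triples are (subject, predicate, object) tuples of prefixed-name strings.
CURRENT = "at:CURRENT"


def non_current_values(triples, path):
    """Return (focus node, value, status) for values whose status is not at:CURRENT.

    Assumption: values without a euvoc:status triple are not reported.
    """
    status = {s: o for s, p, o in triples if p == "euvoc:status"}
    return sorted(
        (s, o, status[o])
        for s, p, o in triples
        if p == path and o in status and status[o] != CURRENT
    )


# Hypothetical data: two values, one of them no longer current.
triples = [
    ("ex:lot1", "ex:hasCountry", "ex:countryA"),
    ("ex:lot2", "ex:hasCountry", "ex:countryB"),
    ("ex:countryA", "euvoc:status", "at:CURRENT"),
    ("ex:countryB", "euvoc:status", "at:DEPRECATED"),
]
flagged = non_current_values(triples, "ex:hasCountry")
print(flagged)
```

Whether such values should be rejected or only reported as warnings is one of the trade-offs the documentation is expected to discuss.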

Note

This ticket replaces model2owl#192

@rousso rousso added this to the 3.0.0 milestone Oct 13, 2024
@rousso rousso changed the title Export of controlled vocabulary restrictions T3: Export of controlled vocabulary restrictions Oct 13, 2024
@cristianvasquez cristianvasquez changed the title T3: Export of controlled vocabulary restrictions T2: Export of controlled vocabulary restrictions Oct 22, 2024
@gkostkowski
Collaborator

gkostkowski commented Dec 20, 2024

The description below presents the solution design for controlled vocabularies (CV) that was discussed during an early feedback meeting. The core, restrictions and SHACL shapes parts were approved at that meeting. The CV constraint level is a new part that addresses the requirement to set the restrictiveness of SHACL shapes per enumeration. The solution covers the CV representation in UML (input) and in RDF (model2owl artefacts), the scope, and the model2owl configuration.

Representation of a CV in the CM

A controlled vocabulary (CV) is represented in a UML diagram (conceptual model) as an enumeration (uml:Enumeration).
Two cases can be distinguished for a CV representation:

  1. referencing a CV: the conceptual model refers to an element representing a CV defined externally and identified by its URI.
    image
  2. defining a CV: the conceptual model defines a CV that is used in the model.
    image

The second case (definition) is already supported by model2owl. No changes or improvements are planned for that functionality as no relevant use case is foreseen. The functionality will be disabled by default but can be enabled if needed using proper configuration parameters.

Scope

Affected artefacts:

  • SHACL shapes
  • OWL restrictions

Nothing will be generated for OWL core, as any referenced CV is expected to be defined externally.
In the OWL restrictions artefact, an axiom stating skos:Concept as the rdfs:range of every object property associated with a CV (enumeration) will be generated. This is already supported by model2owl, and the existing implementation of the corresponding transformation rule will be retained.

SHACL shapes

The designed SHACL shapes affect properties that refer to CVs in the model. Two alternative restriction modes will be supported: permissive and restrictive. The modes differ in the degree of constraint.

Permissive

Permissive shapes enforce minimal restrictions for a CV. They are suitable when a CV is referenced in the model but its external definition cannot be used during validation to inspect its values (e.g. because no such RDF specification exists).

:permissive-shape a sh:PropertyShape ;
    sh:path :property ;
    sh:class skos:Concept .
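
To make the effect of the permissive shape concrete, here is a minimal sketch in plain Python that emulates the `sh:class skos:Concept` check over a list of triples. It is not a SHACL engine and not part of model2owl: the data, the `ex:` names and the function name `permissive_violations` are hypothetical, every subject that uses the property is treated as a focus node, and subclass inference (which a real `sh:class` check takes into account) is ignored.

```python
# Triples are (subject, predicate, object) tuples of prefixed-name strings.
RDF_TYPE = "rdf:type"
SKOS_CONCEPT = "skos:Concept"


def permissive_violations(triples, path):
    """Return (focus node, value) pairs whose value is not typed skos:Concept."""
    concepts = {s for s, p, o in triples if p == RDF_TYPE and o == SKOS_CONCEPT}
    return sorted((s, o) for s, p, o in triples if p == path and o not in concepts)


# Hypothetical data: ex:conceptA is typed as a concept, ex:unknown is not.
data = [
    ("ex:term1", ":property", "ex:conceptA"),
    ("ex:term2", ":property", "ex:unknown"),
    ("ex:conceptA", "rdf:type", "skos:Concept"),
]
violations = permissive_violations(data, ":property")
print(violations)
```

Note that the check only looks at the type of the value. It says nothing about which vocabulary the concept belongs to.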

Restrictive

Restrictive shapes check whether a value used in the data belongs to the CV referred to in the model. They therefore require the CV definition (SKOS) to be available during validation.

:restrictive-shape a sh:PropertyShape ;
    sh:path :property ;
    sh:node [
        a sh:NodeShape ;
        sh:property [
            sh:path skos:inScheme ;
            sh:hasValue ns1:cv1 ;
        ] ;
    ] .

Here ns1:cv1 refers to the controlled vocabulary modelled as an enumeration.
It is assumed that a controlled vocabulary is defined using the SKOS vocabulary. The snippet below shows a hypothetical, minimal definition of ns1:cv1 that is compliant with the above shape:

@prefix ns1: <http://example.com/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

ns1:concept1 a skos:Concept ;
    skos:inScheme ns1:cv1 .

ns1:cv1 a skos:ConceptScheme .

It is assumed that such an external formal definition exists. However, it is not required by model2owl.
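
The following minimal sketch in plain Python emulates what the restrictive shape checks, and shows why the CV definition has to be loaded together with the data. It is not a SHACL engine and not part of model2owl. The function name `restrictive_violations`, the `ex:` focus nodes, `ns1:other` and `ns1:cv2` are hypothetical, and every subject that uses the property is treated as a focus node.

```python
# Triples are (subject, predicate, object) tuples of prefixed-name strings.
IN_SCHEME = "skos:inScheme"


def restrictive_violations(triples, path, scheme):
    """Return (focus node, value) pairs whose value is not in the given scheme."""
    members = {s for s, p, o in triples if p == IN_SCHEME and o == scheme}
    return sorted((s, o) for s, p, o in triples if p == path and o not in members)


# The SKOS definition of the vocabulary, maintained outside the model.
vocabulary = [
    ("ns1:concept1", "rdf:type", "skos:Concept"),
    ("ns1:concept1", "skos:inScheme", "ns1:cv1"),
    ("ns1:other", "skos:inScheme", "ns1:cv2"),
]
# The instance data being validated.
data = [
    ("ex:a", ":property", "ns1:concept1"),
    ("ex:b", ":property", "ns1:other"),
]

with_vocabulary = restrictive_violations(data + vocabulary, ":property", "ns1:cv1")
without_vocabulary = restrictive_violations(data, ":property", "ns1:cv1")
print(with_vocabulary)
print(without_vocabulary)
```

When the vocabulary triples are missing from the validated graph, every value is reported, including the valid one. This is the practical reason to choose the permissive level for CVs that have no usable RDF definition.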

CV constraint level

Due to the diversity of CVs that a project can use, fine-grained control is needed over the kind of SHACL shape generated for each CV. The recommended way is to include that information in the model itself (and not in the model2owl configuration).
To make this possible, a constraint level can be defined for every UML enumeration used in the model. It should be defined as a tag whose key is set by the user in the configuration (the cvConstraintLevelProperty parameter).
Two constraint levels are possible. They determine the kind of SHACL shape generated for an enumeration:

  • permissive: a permissive SHACL shape will be generated for the related UML enumeration
  • restrictive: a restrictive SHACL shape will be generated for the related UML enumeration

model2owl configuration

One configuration parameter, cvConstraintLevelProperty, is foreseen in config-parameters.xsl:

  • a compact URI representing the constraint level property, to be used as the key of a UML enumeration tag. Such a tag will be interpreted by model2owl, and its value will determine the kind of SHACL shape generated for the described enumeration, that is:
    • permissive: a permissive SHACL shape will be generated for the related UML enumeration
    • restrictive: a restrictive SHACL shape will be generated for the related UML enumeration
  • XML representation: <xsl:variable name="cvConstraintLevelProperty" select="'<property>'"/> where <property> is a compact URI (e.g. epo:constraintLevel)

A tag whose key matches cvConstraintLevelProperty is not materialized in the model2owl artefacts. It is used only to control which SHACL shapes are generated.
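
The decision logic described here can be sketched in plain Python as follows. model2owl itself is implemented in XSLT, so this is only an illustration of how a tag value maps to one of the two shape templates shown earlier, not the actual transformation. The function name `shape_for` and its parameters are hypothetical, and raising an error for an unrecognised level is an assumption, since the design does not state the fallback behaviour.

```python
PERMISSIVE = "permissive"
RESTRICTIVE = "restrictive"


def shape_for(shape_name, path, level, scheme=None):
    """Return Turtle for a property shape matching the given constraint level."""
    header = f"{shape_name} a sh:PropertyShape ;\n    sh:path {path} ;\n"
    if level == PERMISSIVE:
        return header + "    sh:class skos:Concept .\n"
    if level == RESTRICTIVE:
        if scheme is None:
            raise ValueError("a restrictive shape needs the CV (concept scheme) URI")
        return (
            header
            + "    sh:node [\n"
            + "        a sh:NodeShape ;\n"
            + "        sh:property [\n"
            + "            sh:path skos:inScheme ;\n"
            + f"            sh:hasValue {scheme} ;\n"
            + "        ] ;\n"
            + "    ] .\n"
        )
    # Assumption: unknown tag values are rejected rather than silently defaulted.
    raise ValueError(f"unknown constraint level: {level!r}")


# Tag values as they might be read from two UML enumerations (hypothetical input).
permissive_ttl = shape_for(":permissive-shape", ":property", "permissive")
restrictive_ttl = shape_for(":restrictive-shape", ":property", "restrictive", "ns1:cv1")
print(permissive_ttl)
print(restrictive_ttl)
```

The tag itself never appears in the returned Turtle, which mirrors the statement that it is not materialized in the artefacts.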

Example

UML (input)

[screenshot: 2024-12-20_14-47]

[screenshot: 2024-12-20_14-48]

config-parameters.xsl (main configuration file)

<xsl:variable name="cvConstraintLevelProperty" select="'epo:constraintLevel'"/>

namespaces.xml (configuration)

<?xml version="1.0" encoding="UTF-8"?>
<prefixes xmlns="http://publications.europa.eu/ns/">
    <prefix name="" value="http://data.europa.eu/a4g/ontology#"/>
    ...
    <prefix name="at-voc" value="http://publications.europa.eu/resource/authority/"/>
</prefixes>

Generated SHACL shapes (output)

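@prefix : <http://data.europa.eu/a4g/ontology#> .
@prefix at-voc: <http://publications.europa.eu/resource/authority/> .
@prefix core-shape: <http://data.europa.eu/a4g/data-shape/core/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .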

core-shape:epo-ContractTerm-epo-hasReservedExecution a sh:PropertyShape ;
    rdfs:isDefinedBy <http://data.europa.eu/a4g/data-shape/core/> ;
    sh:name "Has reserved execution" ;
    sh:path :hasReservedExecution ;
    sh:class skos:Concept .

core-shape:epo-ContractTerm-epo-hasEInvoicingPermission a sh:PropertyShape ;
    rdfs:isDefinedBy <http://data.europa.eu/a4g/data-shape/core/> ;
    sh:name "Has eInvoicing permission" ;
    sh:path :hasEInvoicingPermission ;
    sh:node [
        a sh:NodeShape ;
        sh:property [
            sh:path skos:inScheme ;
            sh:hasValue at-voc:permission ;
        ] ;
    ] .
