Semantics of the dcat:bbox attribute could (should?) be more explicit #1392
Comments
@JoepvanGenuchten , thanks for raising this issue. I think there are two separate aspects here: one is about the range, and one about the actual semantics of this property.
|
@andrea-perego is correct: the semantics are relative to the domain the property is used in. There needs to be some out-of-band statement (i.e. not in the DCAT model) about how any domain intends to use any generic property; for example, how to interpret the concept of a "box" is probably domain-specific. Another example is dcterms:conformsTo: in DCAT it relates to the dataset, whereas in most other usages it seems to relate to the information object itself, not the object it is describing. NB the semantics of "conformance" itself caused much debate, and was eventually delegated to the domain to interpret. |
@rob-metalinkage @andrea-perego thank you for your responses. This clarifies a lot, and also helps me refine my own comment here. tldr:
in more detail (and reverse order ;-)):
1.a) I see that the argument here is that the exact meaning can be derived from the domain and the range. Fair enough, that is indeed one of the strengths of OWL. I am still worried about possible misinterpretations, and I think a more elaborate definition could go a long way (right now it says "The geographic bounding box of a resource."; see also the next point).
1.b) About the naming of the dcat:bbox attribute (and upon further reading, the same can be said for dcat:centroid): when making a semantic model, we aim to give any rdf:Property (be it a datatype property or an object property) a clear name (label, URI) that references the meaning or significance of the relationship. To take an example of where this is obviously done right: in RDF Schema there are multiple relationships between rdf:Property and rdfs:Resource. We have rdfs:domain and rdfs:range. We intuitively understand that, just because a relationship points from something of type Property to something of type Resource, we cannot simply call this relationship rdfs:resource. We give the properties clear names (and URIs) that tell us something about what we mean by them. But I feel we are making that exact mistake here. By defining dcat:bbox (even if you say it is a literal because it can be any technical serialization of a bounding box), we are basically saying (or at the very least implying) that there is only one semantically meaningful relationship between dcat:Resource and any bounding box representation, and we are not very clear about what we mean by it. Alternative names for dcat:bbox (that I think say more about what we want to know, or emphasize what we intend) could be "dcat:occupiesPhysicalSpace" or something like that. This also leaves open the option to represent this information in another way. Requiring it to be a (certain kind of) bbox belongs, in my opinion, to the realm of SHACL (see also the next point).
2.a) Taking a step back: why does DCAT concern itself with bounding boxes (and centroids) in the first place? Does the rather technical object definition of a bbox really add, functionally speaking, to what DCAT is trying to achieve in terms of how we communicate about our data catalogs and their resources? There are whole taxonomies of how to model geometries and shapes, some much more accurate than a bounding box (why would a swept solid represent this information any less accurately?) and some less. Why pin it down to this one if what you really want to know is where the resource is physically located?
2.b) Is this relationship really different from GeoSPARQL hasGeometry, or a similar relationship in the Industry Foundation Classes? If so, why does this vocabulary have such unique requirements that it should define its own property for it? Can't we just conform to the models that domain experts have made for shapes and geometries? If not, why not explicitly use one of those?
Hope this helps! |
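For comparison with point 2.b, here is a minimal sketch of how the same information might look with the GeoSPARQL hasGeometry pattern. The `ex:` names are hypothetical, the WKT coordinates are elided as elsewhere in this thread, and typing the location as a `geosparql:Feature` is an assumption made only for this illustration, not something DCAT states.

```turtle
@prefix dcterms:   <http://purl.org/dc/terms/> .
@prefix geosparql: <http://www.opengis.net/ont/geosparql#> .
@prefix ex:        <http://example.org/> .

# Hypothetical location, additionally typed as a GeoSPARQL feature
# (an assumption for this sketch).
ex:coveredRegion a dcterms:Location , geosparql:Feature ;
  # hasGeometry points to a geometry node rather than to a literal.
  geosparql:hasGeometry ex:coveredRegionGeometry .

# The geometry node carries the serialization as a WKT literal.
ex:coveredRegionGeometry a geosparql:Geometry ;
  geosparql:asWKT "POLYGON((...))"^^geosparql:wktLiteral .
```

Note that nothing in this pattern says the geometry is a bounding box rather than an exact outline; that distinction is what a dedicated property such as dcat:bbox adds.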
I think the GeoSPARQL group are possibly looking at providing options for describing the semantics of geometry properties. These should perhaps just be used via a GeoDCAT profile. Any real-world spatial object has multiple possible geometric representations, and these may vary according to other aspects of its state. A dataset covering the spatial domain of some object would potentially share these. |
About "why does dcat concern itself with bounding boxes (and centroids) in the first place?", you can find the background discussion in #83 , which also links to the relevant use case in the DXWG UCR document. Trying to summarise it: the original version of DCAT did not provide guidance on how to specify the spatial coverage of a dataset by using geometries. Implementation experience showed that this gap was raising interoperability issues, and therefore DCAT2 addresses it by supporting specific properties for the most typical cases, i.e., geometries, bounding boxes, and centroids. The reason why specific properties for bounding boxes and centroids have been defined in the DCAT namespace was that there was no standard way of doing this: commonly used vocabularies, such as W3C Basic Geo and GeoSPARQL, don't have such properties (something that is instead being addressed in the new version of GeoSPARQL under development), with the exception of Schema.org (
About your note at point 1.b:
Any suggestion on how to improve the description of
    a:Resource a dcat:Resource ;
      :occupiesPhysicalSpace [ a dcterms:Location ;
        dcat:bbox "POLYGON((...))"^^geosparql:wktLiteral
      ] .
I am not sure I answered all your points, so please let me know what I missed. Also, it would be very useful if you could complement the issues you are highlighting with specific use cases and examples, so as to better understand the possible weaknesses of the current DCAT approach in addressing your requirements. |
@andrea-perego thank you for the links to the other discussions. About "why it's in here": I see a lot of clarifying arguments, especially in issue 83. I agree with the formal arguments about why you might want a separate object property to describe this particular use case. But I maintain it should be handled by a geospatially oriented vocabulary and not by one aimed at data cataloguing. I would say it makes the cataloguing vocabulary cluttered/bulky/bloated with concepts that are/should be handled in different domains (personal context note: I have been working with the IEC-CM, which is a fantastic but massive reference model for the electric utilities. Its very size stands in the way of adoption through the sheer intimidation newcomers feel when trying to work with it, so keeping reference models small and 'digestible' is one of the things I try to aim for in my work).
Having said that: I get the sense that the decision about this issue has been made. I might disagree with the outcome, but perhaps I should lay that to rest. Upon more detailed inspection of the ontology (admittedly, I had only looked at the TR until now), I realize the domain of these properties is dcterms:Location and not Resource, which is what the text suggests (this makes it even more tempting to hammer on the previous point, but I won't ;-)). Given this, I would propose the following for dcat:bbox: "Represents the physical space a dcterms:Location occupies".
I appreciate the time you guys take to work through this. |
This is a general issue of modularity and reuse. I agree that if there were, like dcterms, a very generic geo vocabulary defining the bounding box of a resource (geo:bbox), then that property could be reused in this usage context. However, there is none (at least to my knowledge). To satisfy the use case of expressing bounding box information about the spatial coverage of a dataset, a new property has to be created with a URI in the DCAT namespace. The issue comes when other people would like to reuse this property outside the scope of DCAT. Then the story becomes difficult, because it then looks as if DCAT has defined the domain-neutral universal property geo:bbox, while DCAT is, as you mentioned, a scoped vocabulary about cataloguing resources. From the DCAT vocabulary perspective one cannot avoid cherry-picking reuse beyond the DCAT scope. But I agree DCAT should not become the upper ontology for the semantic web, just because this one is active.
In the first place, the semantics given to dcat:bbox should support the DCAT use cases. In that case I am happy with the current definition, and do not feel the need to add "physical" to it. It might, though, be improved w.r.t. 'resource'. I admit that by reading the paragraph https://w3c.github.io/dxwg/dcat/#Property:location_bbox on its own one could interpret resource as a catalogued resource and not as an rdfs:Resource, because the domain is stated way up, and only visible after scrolling. This might be (one of) the source(s) for filing the issue. For other properties in the same situation, e.g. https://w3c.github.io/dxwg/dcat/#Property:checksum_algorithm , the domain has been added. |
@bertvannuffelen I can accept not adding "physical", but would push on not using the term 'resource' in the definition and replacing it with (dcterms:)Location, beyond issues about documentation rendering. After all: by setting the domain of the property to be dcterms:Location, we say this is a property of the location, and not of the resource (dcat or rdfs) that happens to find itself there. |
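To make the domain point concrete, here is a minimal sketch of the pattern under discussion, with the bounding box stated about the dcterms:Location and the dataset linked to that location through dcterms:spatial. The `ex:` names, the title, and the coordinates are made up for illustration; the `geosparql:` prefix follows the snippet earlier in this thread.

```turtle
@prefix dcat:      <http://www.w3.org/ns/dcat#> .
@prefix dcterms:   <http://purl.org/dc/terms/> .
@prefix geosparql: <http://www.opengis.net/ont/geosparql#> .
@prefix ex:        <http://example.org/> .

# Hypothetical dataset: a phone book covering some region.
ex:phoneBook a dcat:Dataset ;
  dcterms:title "Phone book (illustrative)"@en ;
  # dcterms:spatial links the dataset to the location it covers.
  dcterms:spatial ex:coveredRegion .

# The bounding box is a statement about the location, not about the dataset.
# Coordinates are illustrative only.
ex:coveredRegion a dcterms:Location ;
  dcat:bbox "POLYGON((4.0 52.0, 5.0 52.0, 5.0 53.0, 4.0 53.0, 4.0 52.0))"^^geosparql:wktLiteral .
```

Read this way, the box describes the region the phone book is about, and says nothing about the physical extent of the phone book itself.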
@bertvannuffelen is this what you are looking for? |
@dr-shorthair yes this seems close to what I meant. a) domain So now the question arises, if one would like to reuse it: is dct:Location a geo:SpatialObject? If that is not the case then reuse creates discussions. Do you know the answer? b) range Although the intent of the properties is very similar, the chosen modelling approaches are probably not trivially compatible with each other. And maybe, after a more in-depth investigation, they will turn out to be properties that cannot be merged. |
@bertvannuffelen I see no problem with a). On b) I think it works if |
@dr-shorthair said:
Not sure about this.
|
Trying to summarise the results of the discussion in this thread, and outlining possible actions:
About point (2), there's of course the issue of having two alternative ways to specify the same information, which does not help interoperability. However, this might be mitigated by defining mappings, as explained by @dr-shorthair . @JoepvanGenuchten , @bertvannuffelen , are you happy with this summary? Is there anything I left out? |
@andrea-perego I am happy with the summary. |
@andrea-perego good summary, I have nothing to add. |
Relevant issue: #1392 Replace "resource" with "spatial thing" (in the SDW-BP sense) in the definition of `locn:geometry`, `dcat:bbox`, and `dcat:centroid`.
@JoepvanGenuchten , @bertvannuffelen , The following point:
has been addressed via PR #1423 (now merged in the ED). If you have concerns on the adopted solution, please open a new issue. Moreover, a new issue (#1425) has been created for further discussion on the following point :
|
The http://www.w3.org/ns/dcat#bbox attribute is somewhat vague (or perhaps lends itself to misinterpretation). DCAT describes resources (datasets), and while I suspect that this attribute is intended to say something about what the resource is about, it could also be interpreted as describing the (physical) nature of the resource itself. A somewhat forced example: an old-fashioned phone book is about a region or a municipality, but at the same time a phone book has a bounding box (describing how thick it is, for example, and where it lies on my shelf).
In general, naming an attribute after its type (or range, in RDFS terms) is not specific enough to be clear about its intent (although that might be a personal preference). While the spec mentions the range is 'intentionally generic', the technical origin of this concept (CAD drawings, BIM etc., where it is a mathematical construct to represent a three-dimensional object) will most likely cause most people (and people programming machines) to ignore the 'intentionally generic' part of the spec.
My suggestion would be either to rename this attribute to make its intent clearer (dcat:subjectBoundingBox, for example), or to elaborate in the description on its intended meaning (phone book vs. the region the phone book covers).