proposal to solve issue #74 #78
Changes from 3 commits
7ff4c5f
8d9c744
d609438
fb3f2e2
c2637c9
36dfc95
f443a4d
@@ -43,7 +43,7 @@ has exactly one value for each of the following two properties:
* a [child map]() (`rml:childMap`),
whose value is an [Expression Map]() (`rml:ExpressionMap`), which
MUST include references that exist in the [Logical Source]()
- of the [Parent Triples Map]() that contains the [Referencing Object Map]()
+ of the [Triples Map]() that contains the [Referencing Object Map]()
or it should have a constant value.

* a [parent map]() (`rml:parentMap`),
|
@@ -96,3 +96,77 @@ then the `rml:child` shortcut could be used.
rml:logicalSource <LS2> ;
rml:subjectMap <#SM2> .
```

## Join types

If the [Logical Source]() of the [Triples Map]() that contains the [Referencing Object Map]()
and the [Logical Source]() of the [Referencing Object Map]()'s [Parent Triples Map]() are not identical,
then the [Referencing Object Map]() MUST have at least one join condition.

A [Logical Source]() is considered identical to another [Logical Source]()
when the sets of objects at the end of the property paths starting with `rml:source` and with `rml:iterator` are identical.

Why property paths?

@pmaria I couldn't think of another way to specify that the descriptions of nested sources should also be equal (so nested source descriptions can have different identifiers, but still have the same values for all nested properties (e.g. and are equal in the example, even if they have different identifiers).

Maybe infinite nesting is not something we want to support in a first go. Couldn't we just extend Pano's suggestion with something like

as discussed today:
this last point means that each source access description needs to list its actionable properties

My last understanding of our discussion was that we were not going to go for fully isomorphic sources, but for explicitly defining which parts of each source type should be isomorphic. This would be my preference.

In the examples below, `<LS1>` and `<LS2>` are identical, but `<LS1>` and `<LS3>` are not identical.

If I'm reading this right, engines must check if

```
<LS1>
  a rml:LogicalSource;
  rml:source <S1>;
  rml:referenceFormulation rml:JSONPath;
  rml:iterator "$.jsonpath.expression".
<S1>
  a rml:Source, void:Dataset;
  void:dataDump <file:///data/dump.nt>.

<LS2>
  a rml:LogicalSource;
  rml:source <S2>;
  rml:referenceFormulation rml:JSONPath;
  rml:iterator "$.jsonpath.expression".
<S2>
  a rml:Source, void:Dataset;
  void:dataDump <file:///data/dump.nt>.

<LS3>
  a rml:LogicalSource;
  rml:source <S1>;
  rml:referenceFormulation rml:JSONPath;
  rml:iterator "$.jsonpath.expression2".
```

```
# <LS1> describes its source inline as a blank node.
<LS1>
  a rml:LogicalSource;
  rml:source [ a rml:Source, csvw:Table;
    csvw:url "/absolute/path/to/data.csv";
  ];
  rml:referenceFormulation rml:CSV.

# <LS2> points to a named source with the same property values.
<LS2>
  a rml:LogicalSource;
  rml:source <S2>;
  rml:referenceFormulation rml:CSV.

<S2>
  a rml:Source, csvw:Table;
  csvw:url "/absolute/path/to/data.csv".

# <LS3> differs in the value of csvw:url.
<LS3>
  a rml:LogicalSource;
  rml:source [ a rml:Source, csvw:Table;
    csvw:url "/relative/path/to/data.csv";
  ];
  rml:referenceFormulation rml:CSV.
```

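The identity rule proposed above can be sketched in Python as follows. This is only one possible reading of the wording, which the review thread contests, and it is not taken from any RML engine. The triple representation (plain string tuples), the node names, and the function names `leaf_objects` and `identical_logical_sources` are all hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object), all as plain strings


def leaf_objects(triples: Iterable[Triple], start: str, first_predicate: str) -> Set[str]:
    """Collect the objects at the end of every property path that leaves
    `start` through `first_predicate` (assumption: a node with no outgoing
    triples counts as the end of a path)."""
    outgoing: Dict[str, List[Tuple[str, str]]] = {}
    for s, p, o in triples:
        outgoing.setdefault(s, []).append((p, o))

    leaves: Set[str] = set()
    seen: Set[str] = set()  # guards against cycles in the description
    stack = [o for p, o in outgoing.get(start, []) if p == first_predicate]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        children = outgoing.get(node)
        if children:
            stack.extend(o for _, o in children)
        else:
            leaves.add(node)
    return leaves


def identical_logical_sources(triples: Iterable[Triple], ls_a: str, ls_b: str) -> bool:
    """Compare two logical sources on the rml:source and rml:iterator paths only."""
    triples = list(triples)
    return all(
        leaf_objects(triples, ls_a, p) == leaf_objects(triples, ls_b, p)
        for p in ("rml:source", "rml:iterator")
    )


# The first example above, written as string triples.
GRAPH: List[Triple] = [
    ("LS1", "rml:source", "S1"),
    ("LS1", "rml:iterator", "$.jsonpath.expression"),
    ("S1", "rdf:type", "rml:Source"),
    ("S1", "rdf:type", "void:Dataset"),
    ("S1", "void:dataDump", "file:///data/dump.nt"),
    ("LS2", "rml:source", "S2"),
    ("LS2", "rml:iterator", "$.jsonpath.expression"),
    ("S2", "rdf:type", "rml:Source"),
    ("S2", "rdf:type", "void:Dataset"),
    ("S2", "void:dataDump", "file:///data/dump.nt"),
    ("LS3", "rml:source", "S1"),
    ("LS3", "rml:iterator", "$.jsonpath.expression2"),
]

print(identical_logical_sources(GRAPH, "LS1", "LS2"))
print(identical_logical_sources(GRAPH, "LS1", "LS3"))
```

Because intermediate nodes such as `S1` and `S2` are never compared, only the values they lead to, two source descriptions with different identifiers can still be treated as identical.
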
If the [Referencing Object Map]() has no join condition
(which is only allowed when the [Logical Source]() of the [Triples Map]() that contains the [Referencing Object Map]()
and the [Logical Source]() of the [Referencing Object Map]()'s [Parent Triples Map]() are identical), a natural join is executed.
In practice this means that the [Logical Source]() is used in its original form when generating the related RDF triples.

If the [Referencing Object Map]() has one or more join conditions, an inner join is executed.
The related RDF triples are generated using the [=n-ary Cartesian product=]
of the logical iterations of the [Logical Source]() of the [Triples Map]() that contains the [Referencing Object Map]()
and the logical iterations of the [Logical Source]() of the [Referencing Object Map]()'s [Parent Triples Map](),
retaining only the combinations of those logical iterations for which the values of the [Child Map]() and [Parent Map]() of each join condition are identical.

If we can, we should refer to a general description of generating triples, so that we don't have to repeat here that we use the n-ary Cartesian product.

@pmaria Can you please make a suggestion?

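A minimal Python sketch of the inner join described above: take the Cartesian product of the child and parent logical iterations and keep only the pairs that satisfy every join condition. It assumes each logical iteration is a dict that contains every referenced key and that each child map and parent map yields a single value. The function name `inner_join` and the sample data are hypothetical.

```python
from itertools import product
from typing import Any, Dict, List, Sequence, Tuple

Iteration = Dict[str, Any]
JoinCondition = Tuple[str, str]  # (child reference, parent reference)


def inner_join(
    child_iterations: Sequence[Iteration],
    parent_iterations: Sequence[Iteration],
    join_conditions: Sequence[JoinCondition],
) -> List[Tuple[Iteration, Iteration]]:
    """Return the (child, parent) pairs for which every join condition holds."""
    return [
        (child, parent)
        for child, parent in product(child_iterations, parent_iterations)
        if all(child[c] == parent[p] for c, p in join_conditions)
    ]


# Hypothetical sample iterations.
children = [
    {"id": 1, "dept": "A"},
    {"id": 2, "dept": "B"},
    {"id": 3, "dept": "C"},  # no matching parent, so it is dropped
]
parents = [
    {"code": "A", "name": "Sales"},
    {"code": "B", "name": "Research"},
]

pairs = inner_join(children, parents, [("dept", "code")])
for child, parent in pairs:
    print(child["id"], parent["name"])
```

A real engine would typically avoid materialising the full product, for example by hashing the parent iterations on the join keys, but the result should be the same set of pairs.
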
**NOTE**
If the [Referencing Object Map]() has no join condition and the [Logical Source]() of the [Triples Map]() that contains the [Referencing Object Map]()
and the [Logical Source]() of the [Referencing Object Map]()'s [Parent Triples Map]() are not identical, the mapping engine MUST report an error.
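
The three cases in this section (inner join, natural join, error) can be summarised in a small Python sketch. It follows the text as proposed in this PR, not the Cartesian-product alternative raised in the review comments. `MappingError` and `select_join` are hypothetical names, not part of any engine's API.

```python
class MappingError(Exception):
    """Raised when a mapping violates the join rules sketched here."""


def select_join(has_join_conditions: bool, sources_identical: bool) -> str:
    """Pick the join type for a referencing object map."""
    if has_join_conditions:
        # One or more join conditions: inner join over both logical sources.
        return "inner"
    if sources_identical:
        # No join condition and identical logical sources: natural join,
        # i.e. the logical source is used in its original form.
        return "natural"
    # No join condition and different logical sources: report an error.
    raise MappingError(
        "a referencing object map without join conditions requires "
        "identical child and parent logical sources"
    )


print(select_join(True, False))
print(select_join(False, True))
```
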

Doesn't this break R2RML support? AFAIK you can join without a condition, which results in joining everything from LS1 with everything of LS2 (Cartesian product).

No, R2RML states:

AFAIK most engines allow this, thus violating the spec?

Yes, I would say that is a violation of the spec. We could of course argue about the usefulness of such behavior.

Hmmm this is of course not enforced in the shapes, it is kinda hard to do that I think.

@pmaria @DylanVanAssche If we decide to move away from the R2RML spec, I wonder why we still need the exception for 'same logical source'. It would be much clearer if no join condition means Cartesian product for any join. Since no such decisions were taken until now, I tried to write a PR in line with the R2RML spec.

I would vote for the Cartesian product: given that most engines implement it as such, it feels like the more intuitive interpretation of 'no join condition', and I'm all for increasing intuitiveness! :) And we can see it as an extension of R2RML: Cartesian allows you to do "more" than when you throw an error.

FYI.
R2RML --> if the queries of two different triples maps use different attributes for the generation of subject maps, then you must have a join condition (in other words, you must do a theta-join). When no join conditions are provided, the rows of the child queries are used to populate both child and parent subject maps. In that sense, the no-join-conditions case simulates a natural join.
This is sufficient, and the quote Pano mentioned is confusing. Testing the equivalence of two queries is "ignored" by the community. It was even the subject of a thread a while ago. Are

This correction makes sense, but the whole sentence doesn't IMO.
I don't think we can say that references exist in a logical source. And I also don't think it's a MUST.
I think the sentence should be something along the lines of:
And then there should be an explanation of what a `rml:ChildMap` is. Whether or not the child map expression resolves is not really a concern for the spec IMO.