-
@idomingu, and you can express the situation where data is pulled from some source periodically and transformed into RDF using a particular transformation... something like the example here.
-
Hi! For integrating data changes, you might have a look at this paper under review: https://www.semantic-web-journal.net/content/incrml-incremental-knowledge-graph-construction-heterogeneous-data-sources. As for executing it periodically, I don't think RML itself needs extensions; that is a job for the implementation, since RML is a schema transformation.
-
There are scenarios that require periodic construction of the knowledge graph from logical sources such as REST APIs or RDBMSs.
So far, RML engines that pull data from sources (i.e., batch data sources) perform this process as a one-time job. To support periodic batch jobs, we have implemented scheduling mechanisms that leverage features provided by the infrastructure. For example, a containerized RML engine like Morph-KGC is deployed on Kubernetes, and the CronJob feature of k8s is used to orchestrate the scheduling of the container. This approach decouples RML from the technology used for knowledge graph construction; however, it has a negative effect on data lineage.
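For concreteness, here is a minimal sketch of what that scheduling can look like as a Kubernetes CronJob wrapping a containerized Morph-KGC run. The image name, schedule, config file, and volume names are placeholders, not an official setup:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: rml-kg-construction
spec:
  # Placeholder schedule: run the construction every day at 02:00.
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: morph-kgc
              image: morph-kgc:latest        # placeholder image tag
              args: ["config.ini"]           # engine config pointing at the mappings and sources
              volumeMounts:
                - name: mappings
                  mountPath: /data
          volumes:
            - name: mappings
              configMap:
                name: rml-mappings           # placeholder ConfigMap holding config.ini and the RML mappings
```

Note that the periodicity lives entirely in the `schedule` field of the CronJob, outside anything the RML mapping itself describes.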
Storing the RML mappings together with the data in the knowledge graph helps us keep track of data lineage and, therefore, improve the data quality of the knowledge graph. In a scenario of periodic data integration, however, the knowledge graph would be missing the periodicity, since it was defined outside RML (in Kubernetes in this case).
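Purely as an illustration of what keeping that information next to the mapping could look like, here is a Turtle sketch; `ex:refreshSchedule` is a made-up property, not part of RML or any proposed extension:

```turtle
@prefix rml: <http://semweb.mmlab.be/ns/rml#> .
@prefix ql:  <http://semweb.mmlab.be/ns/ql#> .
@prefix rr:  <http://www.w3.org/ns/r2rml#> .
@prefix ex:  <http://example.org/ns#> .        # hypothetical vocabulary

<#SensorMapping>
  rml:logicalSource [
    rml:source "sensors.json" ;                # simplified source description
    rml:referenceFormulation ql:JSONPath ;
    rml:iterator "$.items[*]"
  ] ;
  rr:subjectMap [ rr:template "http://example.org/sensor/{id}" ] ;
  # Hypothetical annotation keeping the periodicity with the mapping,
  # so it is not lost when the mapping is stored in the knowledge graph:
  ex:refreshSchedule "0 2 * * *" .
```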
I see two options here:
Many thanks!