This project aims to enable application of Machine Learning (ML), Deep Learning and Artificial Intelligence (AI) technology for cities, based on the creation and demonstration of a ‘Data Lake’, which combines legacy city data with real-time data feeds.
The main technical outputs of this project will be three advanced open-source data integration modules that combine legacy data and real-time data into a usable and highly secure framework. These modules will then be deployed in two use cases, a fully integrated Mobility toolkit and an Environmental toolkit, taking advantage of the advancements and integration work from the first half of the project.
Activities carried out as part of this project include:
- Development, testing and implementation of the ‘Data Lake’, including specific use cases (for example mobility data, water quality, air pollution) in named partner cities, based on CKAN and using various open-source modules;
- Sharing of the Open-Source technology solution framework on GitHub, including tutorials;
- Dissemination of the results of the project at a series of global and European smart city gatherings and major conferences.
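Because the Data Lake is based on CKAN, its catalogue can in principle be searched through CKAN's Action API. The following is a minimal sketch, assuming a standard CKAN deployment: the base URL, the helper names `package_search_url` and `dataset_titles`, and the sample response are hypothetical and only illustrate the general shape of a `package_search` call and its JSON envelope.

```python
from urllib.parse import urlencode

# Hypothetical portal address (assumption); replace with a real CKAN instance.
CKAN_BASE_URL = "https://data.example-city.org"


def package_search_url(query, rows=10, base_url=CKAN_BASE_URL):
    """Build the URL for a CKAN Action API `package_search` request."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/api/3/action/package_search?{params}"


def dataset_titles(response):
    """Extract dataset titles from a decoded `package_search` response.

    Falls back to the dataset name when no title is set.
    """
    if not response.get("success"):
        return []
    return [
        pkg.get("title", pkg.get("name", ""))
        for pkg in response["result"]["results"]
    ]


# Illustrative response fragment (made-up data, not from the project).
sample = {
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"name": "air-quality", "title": "Air quality measurements"},
            {"name": "water-quality"},
        ],
    },
}

print(package_search_url("air quality", rows=5))
print(dataset_titles(sample))
```

The sketch only builds the request URL and parses a response that has already been decoded, so it runs without network access; the actual HTTP call could be made with any HTTP client.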
Methodologies: To implement the objectives, the technical consortium members initially agreed on the architecture and deployment in the Frontrunner City Kiel, which is also the Coordinator of the project. This part took 24 months because of several significant challenges, such as the COVID-19 pandemic, the rejection of one important partner and several technical adjustments. The workplan was adjusted, and a prolongation of the project was agreed with the funding agency. The results of the first two years were then replicated in four Follower cities and adapted to the actual circumstances (deployed digital infrastructure) in each city.
The Innovation Management tasks ran alongside the implementation, gathering feedback and enabling communication and learning.
The assets delivered are a toolset for implementing a fully functional data lake, with several integrations and data services. Follower cities are provided with a fully operational and individually adjusted version of the ODALA assets. Replication in any city is enabled via the GitLab repository and the scripts, guidelines and video explanations, which are available online under open-source or Creative Commons licences.
The results have influenced several other initiatives, such as IDSA Data Spaces distributed iteration, Gaia-X federated topologies, smart city projects in the member states, and the FIWARE and OASC member cities. The project also contributed its results to the ITU MIMs Plus standardization and to the LDES CIM Broker advancements. The results are available as open-source components on GitHub.
ODALA helps cities implement data infrastructure that goes well beyond data portals. Data gathering is at the core of the data economy, and a data lake is the key infrastructure element, enabling access to all kinds of data:
- hosted data (in databases),
- external data integrated via real-time API,
- internal data (so-called “silos”) integrated via real-time API.
In addition, the data lake provides:
- secure access control to data,
- marketplace functionality to trade data and data services,
- semantic interoperability and searchability,
thus enabling advanced data services, such as Artificial Intelligence operations, to access relevant data sources.
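The combination of source kinds and access control described above can be sketched as a small catalogue model. This is an illustration under stated assumptions, not the ODALA implementation: the names `SourceKind`, `DataSource`, `Catalogue`, the role strings and the registered sources are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class SourceKind(Enum):
    HOSTED = "hosted"              # stored in the lake's own databases
    EXTERNAL_API = "external_api"  # external data via real-time API
    INTERNAL_API = "internal_api"  # internal "silo" exposed via real-time API


@dataclass
class DataSource:
    name: str
    kind: SourceKind
    # Roles allowed to read the source; an empty set means publicly readable.
    allowed_roles: set = field(default_factory=set)


class Catalogue:
    """Hypothetical registry of data sources with a simple role check."""

    def __init__(self):
        self._sources = {}

    def register(self, source):
        self._sources[source.name] = source

    def visible_to(self, role):
        """Return the sorted names of all sources the given role may access."""
        return sorted(
            s.name
            for s in self._sources.values()
            if not s.allowed_roles or role in s.allowed_roles
        )


# Made-up example sources, one per kind.
catalogue = Catalogue()
catalogue.register(DataSource("air_quality_archive", SourceKind.HOSTED))
catalogue.register(
    DataSource("traffic_feed", SourceKind.EXTERNAL_API, {"mobility_analyst"})
)
catalogue.register(
    DataSource("utility_meters", SourceKind.INTERNAL_API, {"city_staff"})
)

print(catalogue.visible_to("mobility_analyst"))
print(catalogue.visible_to("guest"))
```

A real deployment would delegate the role check to a dedicated identity and access management component; the in-memory check here only shows where that decision sits relative to the catalogue.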
Digital Twins need a Data Lake to build, for example, their visualization services, and Data Spaces need access to Data Lakes so that consumers and providers of data can operate. Gaia-X adds an interoperability layer to Data Lakes in order to ensure data sovereignty.
Every Data Lake consists of several well-known and established components, among them the Broker, as defined in the CIS ETSI specification. In the current version, the Broker is also used to support access to data in distributed topologies and runs in “Federation mode”, combining access to a variety of data sources via one single API. This also enables control over specific search methodologies and supports the LDES (Linked Data Event Streams) concept.
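The idea behind federation, one API in front of several data sources, can be illustrated with a short in-memory sketch. It is not the Broker's actual code or API: `InMemorySource`, `FederatingBroker`, the entity fields and the sample values are hypothetical, and the rule that the first source wins on duplicate identifiers is an assumption made for the example.

```python
class InMemorySource:
    """Stand-in for one data source behind the broker (hypothetical)."""

    def __init__(self, name, entities):
        self.name = name
        self._entities = entities

    def query(self, entity_type):
        return [e for e in self._entities if e["type"] == entity_type]


class FederatingBroker:
    """Fans a query out to every source and merges the results."""

    def __init__(self, sources):
        self._sources = list(sources)

    def query(self, entity_type):
        merged = {}
        for source in self._sources:
            for entity in source.query(entity_type):
                # Assumption: on duplicate ids, the first source queried wins.
                merged.setdefault(
                    entity["id"], {**entity, "source": source.name}
                )
        return list(merged.values())


# Made-up sample entities for illustration only.
city_db = InMemorySource(
    "city_db",
    [
        {"id": "sensor-1", "type": "AirQuality", "no2": 21},
        {"id": "bus-7", "type": "Vehicle"},
    ],
)
regional_api = InMemorySource(
    "regional_api",
    [
        {"id": "sensor-1", "type": "AirQuality", "no2": 19},
        {"id": "sensor-2", "type": "AirQuality", "no2": 30},
    ],
)

broker = FederatingBroker([city_db, regional_api])
for entity in broker.query("AirQuality"):
    print(entity["id"], entity["source"])
```

The caller sees a single `query` method regardless of how many sources sit behind it, which is the property the federation mode provides to data services.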
No city or community without a data lake! The key functionalities of the data lake are instrumental in keeping cities in the driver's seat, as they operate the core element of the digital city infrastructure themselves. From there, other services can be integrated and operated by several independent digital service providers. The Data Lake is the spider in the middle of the web, integrating and federating data and data services. The implementation is neither complicated nor costly, and it secures sovereignty for the demand side and smooth integration for the supply side: a win-win situation.