Requirements and Deliverables

Michael Kallfelz edited this page Nov 30, 2020 · 4 revisions

This page collects the requirements and deliverables for the MIMIC IV to OMOP conversion.

A previous project converted the MIMIC III dataset to OMOP.

We will build on the experience gained in that project. However, because the data model has changed and the target architecture will be BigQuery on Google Cloud Platform, most of the logic will be built from scratch.

The development system will be maintained by Odysseus while the future production system will be maintained by PhysioNet.

The following requirements are in scope for the first phase:

  • OMOP CDM in BigQuery on GCP
  • ETL of the MIMIC IV table sections Core, Hosp and ICU (the ED and CXR sections are an optional addition or part of a second phase)
  • Processing of waveform metadata and mapping to standard concepts, including association with the waveform raw data

The deliverables that have been defined in the project group are:

  • OMOP CDM (BQ) in PhysioNet environment with MIMIC IV Demo data (waveform part still to be determined)
  • OMOP CDM (BQ) in PhysioNet environment with MIMIC IV full data (access to OHDSI tool ATLAS using PhysioNet Google IAM)
  • GitHub repository with ETL logic, basic data consistency logic and custom mapping master files for reproducing the ETL from MIMIC IV into a BigQuery OMOP CDM instance (ETLs for other databases are left to community effort)
  • Additional documentation for ETL specifications
  • Test Plan for User Acceptance Testing

Scope and Approach are also described in this slide deck.
