GraF-EDA: Graph Features for Exploratory Data Analysis
Authors: Mirela Cazzolato (1,2), Marco Antonio Gutierrez (2), Caetano Traina Jr. (1), Christos Faloutsos (3), Agma J. M. Traina (1)
Affiliations: (1) Institute of Mathematics and Computer Science - University of São Paulo (ICMC-USP); (2) The Heart Institute (InCor) - University of São Paulo (HC-FMUSP); (3) Carnegie Mellon University (CMU)
This work was presented at the 2023 IEEE 36th International Symposium on Computer-Based Medical Systems (CBMS), June 22-24, 2023, L'Aquila, Italy.
GraF-EDA is available to researchers and data scientists under the GNU General Public License. If you publish or publicly use the available data and code, or any resource derived from them, please acknowledge the creators by citing the following paper.
[Cazzolato et al., 2023] CAZZOLATO, M. T.; GUTIERREZ, M. A.; TRAINA-JR., C.; FALOUTSOS, C.; TRAINA, A. J. M. GraF-EDA: Graph Features for Exploratory Data Analysis. 2023 IEEE 36th International Symposium on Computer-Based Medical Systems (CBMS), L'Aquila, Italy, 2023, pp. 117-122. DOI: doi.org/10.1109/CBMS58004.2023.00202.
Bibtex:
@Inproceedings{10178818,
author={Cazzolato, Mirela T. and Gutierrez, Marco Antonio and Traina, Caetano and Faloutsos, Christos and Traina, Agma J.M.},
booktitle={2023 IEEE 36th International Symposium on Computer-Based Medical Systems (CBMS)},
title={Exploratory Data Analysis in Electronic Health Records Graphs: Intuitive Features and Visualization Tools},
year={2023}, volume={}, number={},
pages={117-122},
doi={10.1109/CBMS58004.2023.00202}}
To set up and use a virtual environment, run the following in a terminal:
python -m venv grafeda_venv
source grafeda_venv/bin/activate
Install the requirements:
pip install -r requirements.txt
or
make prep
Run the app:
make demo
Feature extraction:
grafeda_extract_features.mov
Feature loading:
grafeda_data_loading.mov
EDA of extracted features:
grafeda_eda.mov
The COVID-19 Data Sharing Repository provides Electronic Health Record (EHR) data from hospitals in the state of São Paulo, Brazil. The EHRs were collected between 2020 and 2021 [1].
Below, we provide the features extracted and used in our work for the following datasets: Complete, ds-BPSP, ds-Einstein, ds-HC, and ds-HSL. For more details, please refer to the paper.
Download:
- Table: Patient -- Attributes: Patient and Exam: link
- Table: Patient -- Attributes: Treatment and Exam: link
- Table: Outcome -- Attributes: Clinic and Outcome: link
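To give a rough idea of what graph features over a pair of attributes can look like, here is a minimal sketch in Python. It is not the GraF-EDA implementation: the function name `bipartite_degree_features`, the feature names `degree` and `weighted_degree`, and the toy records are all assumptions made for illustration. Each row is treated as an edge between a source attribute (for example, Patient) and a destination attribute (for example, Exam), and simple per-node counts are derived from those edges.

```python
from collections import defaultdict


def bipartite_degree_features(rows, src_key, dst_key):
    """Hypothetical helper: per-source-node counts from (src, dst) pairs.

    Each row is read as one edge of a bipartite graph linking
    row[src_key] to row[dst_key].
    """
    neighbors = defaultdict(set)   # distinct destination nodes per source
    edge_count = defaultdict(int)  # total number of rows (edges) per source
    for row in rows:
        src, dst = row[src_key], row[dst_key]
        neighbors[src].add(dst)
        edge_count[src] += 1
    return {
        src: {
            "degree": len(neighbors[src]),        # distinct neighbors
            "weighted_degree": edge_count[src],   # repeated edges counted
        }
        for src in neighbors
    }


# Toy records invented for this example; they are not taken from the datasets.
records = [
    {"patient": "p1", "exam": "hemogram"},
    {"patient": "p1", "exam": "hemogram"},
    {"patient": "p1", "exam": "d-dimer"},
    {"patient": "p2", "exam": "hemogram"},
]

features = bipartite_degree_features(records, "patient", "exam")
for node, values in sorted(features.items()):
    print(node, values)
```

Swapping the two keys gives the same counts from the point of view of the other attribute (for example, how many distinct patients took each exam). The features actually used in the work are described in the paper.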
[1] FAPESP. FAPESP COVID-19 Data Sharing/BR. Available from https://repositoriodatasharingfapesp.uspdigital.usp.br/. Accessed on March 10, 2023.