Jason Walonoski edited this page Jun 12, 2019 · 7 revisions

Let's Build a Synthetic Dataset

In a classroom setting, the instructor will walk you through the process of using, configuring, and customizing Synthea™, an open-source synthetic patient generator, to create a patient and provider cohort and all the FHIR DSTU2, STU3, or R4 resources you need for your next software development, testing, or integration event.

Prerequisites

Clone and Build the Repository

git clone https://github.com/synthetichealth/synthea.git
cd synthea
./gradlew test

Estimate: 5-10 minutes

Browse the Diseases

Browse the diseases in the Module Builder

The instructor will make a change to the CHF module. Follow along.

Download the JSON, and overwrite your current module with the changes.

cp ~/Downloads/congestive_heart_failure.json ./src/main/resources/modules/congestive_heart_failure.json
./run_synthea -p 10

When finished, undo the changes:

git checkout -- src

Run Synthea

./run_synthea -s 0

Look at the FHIR record in ./output/fhir
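
To get a quick feel for what a generated record contains, the sketch below counts the resource types in one FHIR JSON file. The function name `count_resource_types` is hypothetical, and it relies on plain text matching with standard Unix tools rather than real JSON parsing, so treat the counts as a rough overview. The enclosing Bundle is counted as one resource.

```shell
# Count how many times each resourceType appears in a FHIR JSON file.
# Assumption: text matching is good enough for a quick look; a JSON-aware
# tool would be more robust.
count_resource_types() {
  grep -o '"resourceType"[[:space:]]*:[[:space:]]*"[A-Za-z]*"' "$1" \
    | sed 's/.*"\([A-Za-z]*\)"$/\1/' \
    | sort | uniq -c | sort -rn
}
```

Call it with the path to one of the files in ./output/fhir, for example `count_resource_types "output/fhir/{file}"`.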

Configuring Properties

Examine the configurable properties in the synthea.properties file.
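
The properties file is long, so it can help to filter it. The sketch below prints only the active (non-comment, non-blank) lines, optionally limited to keys that start with a given prefix. The function name `show_properties` is hypothetical, and the path in the usage comment assumes the standard repository layout.

```shell
# Print active lines of a Java-style .properties file.
# $1 = path to the properties file, $2 = optional key prefix (literal text).
show_properties() {
  props_file="$1"
  prefix="${2:-}"
  grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$props_file" \
    | awk -v p="$prefix" 'p == "" || index($0, p) == 1'
}

# Usage (assumed path, run from the synthea directory):
#   show_properties src/main/resources/synthea.properties exporter.
```

Filtering on a prefix such as `exporter.` narrows the listing to the settings that control which output formats are written.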

POST our patient record to a server

Make sure you adjust the path in the command below to match your path and FHIR file name.

curl http://hapi.fhir.org/baseR4 --data-binary "@/Users/Path/synthea/output/fhir/Brigitte394_Jaskolski867_f45fe01f-de94-4587-97f0-4cbd39f775c5.json" -H "Content-Type: application/fhir+json" 
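
If you would rather not type the generated file name by hand, a small helper can pick a patient file for you. This is a sketch under assumptions: the function name `first_patient_bundle` is hypothetical, and it assumes that non-patient files in the output folder start with `hospitalInformation` or `practitionerInformation`, which may differ in your Synthea version.

```shell
# Print the first patient bundle found in a directory; return 1 if none.
# Assumption: files named hospitalInformation* or practitionerInformation*
# are not patient records and are skipped.
first_patient_bundle() {
  for f in "$1"/*.json; do
    case "$(basename "$f")" in
      hospitalInformation*|practitionerInformation*) continue ;;
    esac
    [ -f "$f" ] || continue
    printf '%s\n' "$f"
    return 0
  done
  return 1
}

# Usage, combined with the curl command from this step:
#   file=$(first_patient_bundle output/fhir) && \
#     curl http://hapi.fhir.org/baseR4 --data-binary "@$file" \
#       -H "Content-Type: application/fhir+json"
```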

Generate a lot of data

Choose the State you want. Feel free to also specify a city. States or cities with spaces in the name need to be quoted.

./run_synthea -p 10000 Washington

Estimate: 5-10 minutes and 5-10 GB of storage

POST all the data to a server

Do not run this step during today's tutorial; skip it due to time constraints.

cd output/fhir/
for file in *; do curl --write-out '.' http://hapi.fhir.org/baseR4 --data-binary "@$file" -H "Content-Type: application/fhir+json" > /dev/null; done;

Estimate: a long time... too long.
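
The loop above prints only a dot per file, so failed uploads go unnoticed. A variant that reports the HTTP status for each file might look like the sketch below. The function name `post_all` is hypothetical; the curl options used (`-s`, `-o`, `-w '%{http_code}'`) are standard.

```shell
# POST every .json file in a directory and print "<status> <file>" per upload.
# $1 = directory of FHIR bundles, $2 = FHIR server base URL.
post_all() {
  dir="$1"
  base_url="$2"
  for file in "$dir"/*.json; do
    [ -f "$file" ] || continue
    status=$(curl -s -o /dev/null -w '%{http_code}' \
      -H "Content-Type: application/fhir+json" \
      --data-binary "@$file" "$base_url") || true
    printf '%s %s\n' "$status" "$file"
  done
}

# Usage (still slow for large cohorts):
#   post_all output/fhir http://hapi.fhir.org/baseR4
```

A status outside the 2xx range marks a file worth retrying or inspecting.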

Upload all the data to Microsoft Azure

Zip your ./output/fhir folder into fhir_{p}_{state}_{city}.zip.
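
One way to build the archive name consistently is sketched below. The function name `archive_name` is hypothetical, and replacing spaces with underscores is an assumption made here so the file name stays easy to handle; the naming pattern itself comes from this step. The city in the usage comment is only an example.

```shell
# Build fhir_{p}_{state}_{city}.zip from population size, state, and city.
# Assumption: spaces in names are replaced with underscores.
archive_name() {
  printf 'fhir_%s_%s_%s.zip\n' "$1" "$2" "$3" | tr ' ' '_'
}

# Usage (run from the synthea directory; requires the zip utility):
#   zip -r "output/$(archive_name 10000 Washington Seattle)" output/fhir
```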

Get the azcopy utility to upload your data to Microsoft Azure:

Unpack the download to use the azcopy executable -- no installation should be required.

Copy your file to our Azure storage:

azcopy cp "output/fhir_{p}_{state}_{city}.zip" "https://syntheadevdays2019.blob.core.windows.net/synthea/?sv=2018-03-28&ss=bfqt&srt=sco&sp=rwdlacup&se=2019-06-22T04:32:42Z&st=2019-06-10T20:32:42Z&spr=https&sig=7mG96EHAJ1jIVlShwxxui5g74%2F%2F6enrCXjCx%2BteM0k0%3D" --recursive=true

List all the data files:

azcopy list https://syntheadevdays2019.blob.core.windows.net/synthea

Download a particular data {file}:

azcopy cp https://syntheadevdays2019.blob.core.windows.net/synthea/{file} .

Thank you!

Have a great #FHIRDevDays
