Formatting Data Files

wsy19961129 edited this page May 31, 2020 · 3 revisions

Formatting a data file to be uploaded to ITMAT-Broker.

This page describes how to generate a data file that ITMAT-Broker accepts, e.g. an aggregation of subjects' data. Two file formats are supported: csv and json.

  • Define your data fields

    Before adding subjects' data, you need to define the header, i.e., the definition of each field in the source data. For example, if the source data has a field 'age', you have to define 'age' in the header.

    The header includes a pre-defined field, Eid, which holds the Id of each subject. This field is required and cannot be omitted or changed. (Its position differs between csv and json; see below.)

    For your own fields, please use the following format:

    {fieldId}@{timepoint}.{measurement}:{datatype}. For example, 2@0.1:c represents a field with fieldId '2', timepoint '0', measurement '1' and datatype 'c' (categories). You can also omit the datatype and write {fieldId}@{timepoint}.{measurement}, like 1@0.0; in this case the datatype is set to 'c' (categories) by default. The complete list of datatypes can be found in DataTypes.
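
The header format above can be checked before upload with a small parser. This is a minimal sketch, not ITMAT-Broker's own validation: `parse_field` and `FieldSpec` are hypothetical names, and the pattern assumes that fieldId, timepoint and measurement are non-negative integers and that the datatype is a single lowercase letter (this page only shows 'c', 'd' and 'i').

```python
import re
from typing import NamedTuple

# {fieldId}@{timepoint}.{measurement} with an optional :{datatype} suffix.
# Assumption: numeric IDs and a one-letter datatype, as in the examples on this page.
FIELD_RE = re.compile(r"(\d+)@(\d+)\.(\d+)(?::([a-z]))?")


class FieldSpec(NamedTuple):
    field_id: str
    timepoint: str
    measurement: str
    datatype: str


def parse_field(header: str) -> FieldSpec:
    """Split one header entry into its parts; the datatype defaults to 'c'."""
    match = FIELD_RE.fullmatch(header)
    if match is None:
        raise ValueError(f"not a valid field header: {header!r}")
    field_id, timepoint, measurement, datatype = match.groups()
    return FieldSpec(field_id, timepoint, measurement, datatype or "c")


print(parse_field("2@0.1:c"))  # explicit datatype
print(parse_field("1@0.0"))    # datatype omitted, falls back to 'c'
```

Running such a check over every header entry catches typos such as a missing '@' before the file reaches the broker.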

  • Construct the data files.

    1. If you'd like to use csv format:

      Put the header on the first line, where 'Eid' must be the first field, followed by your own fields. Fields are separated by a tab (\t).

      You can then add your subjects' data starting from the second line, one subject per line. An example is shown below:

      Eid 1@1.1 1@1.2:d 1@2.1:i 2@1.1
      subj0 "1" 2.1 2 "male"
      subj1 "1" 1.1 2 "female"
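
A tab-separated file like the example above could be produced along the following lines. This is a sketch under assumptions: `to_tsv` is a hypothetical helper, and it mirrors the quoting seen in the example (strings in double quotes, numbers bare, Eid unquoted) by serialising each value with `json.dumps`. Whether ITMAT-Broker requires that quoting is not stated on this page.

```python
import json

header = ["Eid", "1@1.1", "1@1.2:d", "1@2.1:i", "2@1.1"]
subjects = {
    "subj0": ["1", 2.1, 2, "male"],
    "subj1": ["1", 1.1, 2, "female"],
}


def to_tsv(header, subjects):
    """Return the file content: header line first, then one line per subject."""
    if not header or header[0] != "Eid":
        raise ValueError("'Eid' must be the first header field")
    lines = ["\t".join(header)]
    for eid, values in subjects.items():
        if len(values) != len(header) - 1:
            raise ValueError(f"{eid}: expected {len(header) - 1} values")
        # json.dumps quotes strings and leaves numbers bare, as in the example.
        cells = [eid] + [json.dumps(value) for value in values]
        lines.append("\t".join(cells))
    return "\n".join(lines) + "\n"


print(to_tsv(header, subjects), end="")
```

The returned string can be written to a file as-is; the length check guards against rows that do not line up with the header.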
    2. If you'd like to use json format:

      2.1 The old version: using JSON Objects:

       First define your own fields (without 'Eid') in an array with the key 'fields', e.g.,
       `"fields": ["1@1.1", "1@1.2:d", "1@2.1:i", "2@1.1"]`.
       The 'Eid' is used as the key for each subject. For example, to add the data of a subject whose Id is 'subj0', simply add the key 'subj0':
       `"subj0": ["1", 2.1, 2, "male"]`. A complete file would look like this:
      
       {
            "fields": ["1@1.1", "1@1.2:d", "1@2.1:i", "2@1.1"],
            "subj0": ["1", 2.1, 2, "male"],
            "subj1": ["1", 1.1, 2, "female"]
       }
      
       Note that the 'fields' key must appear before any subject key, because the system needs the field definitions to check and process your data. In Python, this can be done by adding the 'fields' key to your dictionary before any other key.
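
The ordering requirement can be sketched in Python as follows. `build_payload` is a hypothetical helper, not part of ITMAT-Broker. It relies on two standard behaviours: dictionaries preserve insertion order (guaranteed since Python 3.7), and `json.dumps` writes keys in that order as long as `sort_keys` is left off.

```python
import json

fields = ["1@1.1", "1@1.2:d", "1@2.1:i", "2@1.1"]
subjects = {
    "subj0": ["1", 2.1, 2, "male"],
    "subj1": ["1", 1.1, 2, "female"],
}


def build_payload(fields, subjects):
    """Build the JSON-object form with 'fields' guaranteed to come first."""
    payload = {"fields": list(fields)}  # inserted first, so serialised first
    for eid, values in subjects.items():
        if eid == "fields":
            raise ValueError("'fields' is reserved and cannot be used as an Eid")
        if len(values) != len(fields):
            raise ValueError(f"{eid}: expected {len(fields)} values")
        payload[eid] = list(values)
    return payload


# Do not pass sort_keys=True here: sorting would move 'fields' out of first place
# whenever an Eid sorts before it.
text = json.dumps(build_payload(fields, subjects), indent=4)
print(text)
```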
      

      2.2 The latest version (not finalized yet): using JSON arrays:

      This format is quite similar to the csv files. First define the fields exactly as in csv files and save them in an array such as ["Eid", "1@1.1", "1@1.2:d", "1@2.1:i", "2@1.1"] as the header. Then, for each subject, add an array that contains all data for that subject in the order defined by the header, such as ["subj0", "1", 2.1, 2, "male"]. A complete sample would look like this:

       [
           ["Eid", "1@1.1", "1@1.2:d", "1@2.1:i", "2@1.1"],
           ["subj0", "1", 2.1, 2, "male"],
           ["subj1", "1", 1.1, 2, "female"]
       ]
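
The array form can be generated from the same inputs as a csv file. The sketch below is an illustration only: `build_rows` is a hypothetical helper, and since this page marks the format as not yet settled, the exact layout may change.

```python
import json

header = ["Eid", "1@1.1", "1@1.2:d", "1@2.1:i", "2@1.1"]
subjects = {
    "subj0": ["1", 2.1, 2, "male"],
    "subj1": ["1", 1.1, 2, "female"],
}


def build_rows(header, subjects):
    """Return a list of rows: the header first, then one array per subject."""
    if not header or header[0] != "Eid":
        raise ValueError("'Eid' must be the first header field")
    rows = [list(header)]
    for eid, values in subjects.items():
        if len(values) != len(header) - 1:
            raise ValueError(f"{eid}: expected {len(header) - 1} values")
        rows.append([eid, *values])  # Eid first, then values in header order
    return rows


print(json.dumps(build_rows(header, subjects), indent=4))
```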