Merge pull request #29 from chrisfilo/enh/cleanup

Cleanup
bids-standard · Oct 4, 2018 · d967a33
2 parents bafcab1 + e4d8e71
commit d967a33

Showing 19 changed files with 342 additions and 2,310 deletions.
diff --git a/specification.md b/specification.md
diff --git a/src/01_introduction.md b/src/01_introduction.md
@@ -1,8 +1,6 @@
-Introduction
-============
+# Introduction
 
-Motivation
-----------
+## Motivation
 
 Neuroimaging experiments result in complicated data that can be arranged in many different ways. So far there is no consensus how to organize and share data obtained in neuroimaging experiments. Even two researchers working in the same lab can opt to arrange their data in a different way. Lack of consensus (or a standard) leads to misunderstandings and time wasted on rearranging data or rewriting scripts expecting certain structure. Here we describe a simple and easy-to-adopt way of organising neuroimaging and behavioural data. By using this standard you will benefit in the following ways:
 
@@ -13,8 +11,7 @@ Neuroimaging experiments result in complicated data that can be arranged in many
 
 BIDS is heavily inspired by the format used internally by OpenfMRI.org and has been supported by the International Neuroinformatics Coordinating Facility and the Neuroimaging Data Sharing Task Force. While working on BIDS we consulted many neuroscientists to make sure it covers most common experiments, but at the same time is intuitive and easy to adopt. The specification is intentionally based on simple file formats and folder structures to reflect current lab practices and make it accessible to a wide range of scientists coming from different backgrounds.
 
-Definitions
------------
+## Definitions
 
 The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in
@@ -31,14 +28,12 @@ Throughout this protocol we use a list of terms. To avoid misunderstanding we cl
 7.  Event - a stimulus or subject response recorded during a task. Each event has an onset time and duration. Note that not all tasks will have recorded events (e.g., resting state).
 8.  Run - an uninterrupted repetition of data acquisition that has the same acquisition parameters and task (however events can change from run to run due to different subject response or randomized nature of the stimuli). Run is a synonym of a data acquisition.
 
-Compulsory, optional, and additional data and metadata
-------------------------------------------------------
+## Compulsory, optional, and additional data and metadata
 
 The following standard describes a way of arranging data and writing down metadata for a subset of neuroimaging experiments. Some aspects of the standard are compulsory. For example a particular file name format is required when storing structural scans. Some aspects are regulated but optional. For example a T2 volume does not need to be included, but when it is available it should be saved under a particular file name specified in the standard.
 This standard aspires to describe a majority of datasets, but acknowledges that there will be cases that do not fit. In such cases one can include additional files and subfolders to the existing folder structure following common sense. For example one may want to include eye tracking data in a vendor specific format that is not covered by this standard. The most sensible place to put it is next to the continuous recording file with the same naming scheme but different extensions. The solutions will change from case to case and publicly available datasets will be reviewed to include common data types in the future releases of the BIDS spec.
 
-Source vs. raw vs. derived data
--------------------------------
+## Source vs. raw vs. derived data
 
 BIDS in its current form is designed to harmonize and describe raw (unprocessed or minimally processed due to file format conversion) data. During analysis such data will be transformed and partial as well as final results will be saved. Derivatives of the raw data (other than products of DICOM to NIfTI conversion) MUST be kept separate from the raw data. This way one can protect the raw data from accidental changes by file permissions. In addition it is easy to distinguish partial results from the raw data and share the latter. Similar rules apply to source data which is defined as data before harmonization and/or file format conversion (for example E-Prime event logs or DICOM files).
 
@@ -52,8 +47,7 @@ This specification currently does not go into details of recommending a particul
 3.  We RECOMMEND including the PDF print-out with the actual sequence
     parameters generated by the scanner in the  `sourcedata` folder.
 
-The Inheritance Principle
--------------------------
+## The Inheritance Principle
 
 Any metadata file (`.json`, `.bvec`, `.tsv`, etc.) may be defined at any directory level, but no more than one applicable file may be defined at a given level (Example 1).  The values from the top level are inherited by all lower levels unless they are overridden by a file at the lower level. For example, `sub-*_task-rest_bold.json` may be specified at the participant level, setting TR to a specific value. If one of the runs has a different TR than the one specified in that file, another `sub-*_task-rest_bold.json` file can be placed within that specific series directory specifying the TR for that specific run.
 There is no notion of "unsetting" a key/value pair. For example if there is a JSON file corresponding to particular participant/run defining a key/value and there is a JSON file on the root level of the dataset that does not define this key/value it will not be "unset" for all subjects/runs.
@@ -108,8 +102,7 @@ In the above example, the fields from `task-xyz_acq-test1_bold.json` file will a
 the new value will be applicable for that particular run/task NIfTI
 file/s.
 
-Extensions
-----------
+## Extensions
 
 The BIDS specification can be extended in a backwards compatible way  and will evolve over time. A number of extensions are currently being worked on:
 
@@ -140,8 +133,7 @@ When an extension reaches maturity it is merged into the main body of the specif
 Guide](https://docs.google.com/document/d/1pWmEEY-1-WuwBPNy5tDAxVJYQ9Een4hZJM06tQZg8X4/edit?usp%3Dsharing&sa=D&ust=1537468908724000)
 All of the ideas that are not backwards compatible and thus will have to wait for BIDS 2.0 are listed [here](https://docs.google.com/document/d/1LEgsMiisGDe1Gv-hBp1EcLmoz7AlKj6VYULUgDD3Zdw)
 
-Citing BIDS
------------
+## Citing BIDS
 
 When referring to BIDS in context of academic literature please
 cite:

diff --git a/src/02_common_principles.md b/src/02_common_principles.md
@@ -1,8 +1,6 @@
-Common principles
-=================
+# Common principles
 
-The Inheritance Principle
--------------------------
+## The Inheritance Principle
 
 Any metadata file (`.json`, `.bvec`, `.tsv`, etc.) may be defined at any directory level, but no more than one applicable file may be defined at a given level (Example 1).  The values from the top level are inherited by all lower levels unless they are overridden by a file at the lower level. For example, `sub-*_task-rest_bold.json` may be specified at the participant level, setting TR to a specific value. If one of the runs has a different TR than the one specified in that file, another `sub-*_task-rest_bold.json` file can be placed within that specific series directory specifying the TR for that specific run.
 There is no notion of "unsetting" a key/value pair. For example if there is a JSON file corresponding to particular participant/run defining a key/value and there is a JSON file on the root level of the dataset that does not define this key/value it will not be "unset" for all subjects/runs.
@@ -57,8 +55,7 @@ In the above example, the fields from `task-xyz_acq-test1_bold.json` file will a
 the new value will be applicable for that particular run/task NIfTI
 file/s.
 
-File Formation specification
-----------------------------
+## File Formation specification
 
 ### Imaging files
 
@@ -71,7 +68,7 @@ NIfTI header.
 
 Tabular data MUST be saved as tab delimited values (`.tsv`) files, i.e. csv files where commas are replaced by tabs. Tabs MUST  be true tab characters and MUST NOT be a series of space characters. Each TSV file MUST start with a header line listing the names of all columns (with the exception of physiological and other continuous acquisition data - see below for details). Names MUST be separated with tabs. String values containing tabs MUST be escaped using double quotes. Missing and non-applicable values MUST be coded as `n/a`.
 
-#### Example:
+Example:
 ```
 onset duration  response_time correct stop_trial  go_trial
 200 200 0 n/a n/a n/a
@@ -87,7 +84,7 @@ Tabular files MAY be optionally accompanied by a simple data dictionary in a JSO
 | Units       | Measurement units.  `[<prefix symbol>] <unit symbol>` format following the SI standard is RECOMMENDED (see Appendix V). |
 | TermURL     | URL pointing to a formal definition of this type of data in an ontology available on the web. |
 
-#### Example:
+Example:
 
 ```JSON
 {
@@ -109,13 +106,12 @@ Tabular files MAY be optionally accompanied by a simple data dictionary in a JSO
 }
 ```
 
-Key/value files (dictionaries)
-------------------------------
+## Key/value files (dictionaries)
 
 JavaScript Object Notation (JSON) files MUST be used for storing key/value pairs. Extensive documentation of the format can be found here: [http://json.org/](http://json.org/).  Several editors have built-in support for JSON syntax highlighting that aids manual creation of such files. An online editor for JSON with built-in validation is available at: [http://jsoneditoronline.org](http://jsoneditoronline.org). JSON
 files MUST be in UTF-8 encoding.
 
-### Example:
+Example:
 ```JSON
 {
   "RepetitionTime": 3,
@@ -187,15 +183,3 @@ sub-control01/
 ```
 
 Additional files and folders containing raw data may be added as needed for special cases.  They should be named using all lowercase with a name that reflects the nature of the scan (e.g., `calibration`).  Naming of files within the directory should follow the same scheme as above (e.g., `sub-control01_calibration_Xcalibration.nii.gz`)
-
-### Code
-
-Template:
-`code/*`
-
-Source code of scripts that were used to prepare the dataset (for example if it was anonymized or defaced) MAY be stored here.<sup>1</sup> Extra care should be taken to avoid including original IDs or any identifiable information with the source code. There are no limitations or recommendations on the language and/or code organization of these scripts at the moment.
-
-<sup>1</sup>Storing actual source files with the data
-is preferred over links to external source repositories to maximize long
-term preservation (which would suffer if an external repository would
-not be available anymore).
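
Note on the Inheritance Principle text carried in `src/02_common_principles.md`: the override rule can be read as a dictionary merge from the dataset root down to the most specific level. The following is a minimal sketch under that reading, not a reference implementation. The function name `merge_inherited_metadata`, the `Instruction` key, and all values are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, Iterable


def merge_inherited_metadata(levels: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge sidecar dictionaries ordered from dataset root to deepest level."""
    merged: Dict[str, Any] = {}
    for sidecar in levels:
        # Keys defined at a lower level override inherited ones. Keys that a
        # lower level omits are kept, because a key/value cannot be "unset".
        merged.update(sidecar)
    return merged


# Hypothetical contents of a root-level and a run-level JSON file.
root_level = {"RepetitionTime": 3.0, "Instruction": "Lie still"}
run_level = {"RepetitionTime": 2.0}

effective = merge_inherited_metadata([root_level, run_level])
print(effective)
```

Here the run-level file changes `RepetitionTime` only, so `Instruction` is still inherited from the root level.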
diff --git a/src/03_modality_agnostic_files.md b/src/03_modality_agnostic_files.md
@@ -1,8 +1,6 @@
-Modality-agnostic files
-=======================
+# Modality-agnostic files
 
-Dataset description
--------------------
+## Dataset description
 
 Template: `dataset_description.json` `README` `CHANGES`
 
@@ -65,8 +63,7 @@ Example:
  - Initial release.
 ```
 
-Participants file
------------------
+## Participants file
 
 Template:
 ```
@@ -119,8 +116,7 @@ correspond to individual columns.
 
 In addition to the keys available to describe columns in all tabular files (`LongName`, `Description`, `Levels`, `Units`, and `TermURL`) the `participants.json` file as well as phenotypic files can also include column descriptions with `Derivative` field that, when set to true, indicates that values in the corresponding column is a transformation of values from other columns (for example a summary score based on a subset of items in a questionnaire).
 
-Scans file
-----------
+## Scans file
 
 Template:
 ```
@@ -141,3 +137,15 @@ filename  acq_time
 func/sub-control01_task-nback_bold.nii.gz 1877-06-15T13:45:30
 func/sub-control01_task-motor_bold.nii.gz 1877-06-15T13:55:33
 ```
+
+## Code
+
+Template:
+`code/*`
+
+Source code of scripts that were used to prepare the dataset (for example if it was anonymized or defaced) MAY be stored here.<sup>1</sup> Extra care should be taken to avoid including original IDs or any identifiable information with the source code. There are no limitations or recommendations on the language and/or code organization of these scripts at the moment.
+
+<sup>1</sup>Storing actual source files with the data
+is preferred over links to external source repositories to maximize long
+term preservation (which would suffer if an external repository would
+not be available anymore).
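
Note on the tabular-file rules and the scans file example touched by this diff: a TSV file has a tab-separated header line, uses true tab characters, and codes missing values as `n/a`. The sketch below shows one way a reader might handle those rules with the Python standard library. It is an illustration under assumptions, not part of the specification: the function name `read_scans_tsv` is hypothetical, and the second row's `acq_time` was changed to `n/a` here purely to exercise the missing-value case.

```python
import csv
import io
from datetime import datetime
from typing import Any, Dict, List

# Sample text modeled on the scans file example; the "n/a" is an
# illustrative change, not a value taken from the specification.
SCANS_TSV = (
    "filename\tacq_time\n"
    "func/sub-control01_task-nback_bold.nii.gz\t1877-06-15T13:45:30\n"
    "func/sub-control01_task-motor_bold.nii.gz\tn/a\n"
)


def read_scans_tsv(text: str) -> List[Dict[str, Any]]:
    """Parse scans-file text; 'n/a' acquisition times become None."""
    # The first line is the header; values containing tabs would arrive
    # wrapped in double quotes, which the csv module unquotes by default.
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows: List[Dict[str, Any]] = []
    for record in reader:
        acq_time = record["acq_time"]
        rows.append(
            {
                "filename": record["filename"],
                "acq_time": None
                if acq_time == "n/a"
                else datetime.fromisoformat(acq_time),
            }
        )
    return rows


rows = read_scans_tsv(SCANS_TSV)
for row in rows:
    print(row["filename"], row["acq_time"])
```

Mapping `n/a` to `None` at read time keeps downstream code from mistaking the placeholder string for a real timestamp.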