[MISC] Cleanup #29

Merged
merged 9 commits into from
Oct 4, 2018
Changes from 5 commits
2,067 changes: 0 additions & 2,067 deletions specification.md

This file was deleted.

24 changes: 8 additions & 16 deletions src/01_introduction.md
@@ -1,8 +1,6 @@
Introduction
============
# Introduction

Motivation
----------
## Motivation

Neuroimaging experiments result in complicated data that can be arranged in many different ways. So far there is no consensus on how to organize and share data obtained in neuroimaging experiments. Even two researchers working in the same lab can opt to arrange their data in different ways. The lack of consensus (or a standard) leads to misunderstandings and to time wasted on rearranging data or rewriting scripts that expect a certain structure. Here we describe a simple and easy-to-adopt way of organising neuroimaging and behavioural data. By using this standard you will benefit in the following ways:

@@ -13,8 +11,7 @@ Neuroimaging experiments result in complicated data that can be arranged in many

BIDS is heavily inspired by the format used internally by OpenfMRI.org and has been supported by the International Neuroinformatics Coordinating Facility and the Neuroimaging Data Sharing Task Force. While working on BIDS we consulted many neuroscientists to make sure it covers most common experiments, but at the same time is intuitive and easy to adopt. The specification is intentionally based on simple file formats and folder structures to reflect current lab practices and make it accessible to a wide range of scientists coming from different backgrounds.

Definitions
-----------
## Definitions

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in
@@ -31,14 +28,12 @@ Throughout this protocol we use a list of terms. To avoid misunderstanding we cl
7. Event - a stimulus or subject response recorded during a task. Each event has an onset time and duration. Note that not all tasks will have recorded events (e.g., resting state).
8. Run - an uninterrupted repetition of data acquisition that has the same acquisition parameters and task (however events can change from run to run due to different subject response or randomized nature of the stimuli). Run is a synonym of a data acquisition.

Compulsory, optional, and additional data and metadata
------------------------------------------------------
## Compulsory, optional, and additional data and metadata

The following standard describes a way of arranging data and writing down metadata for a subset of neuroimaging experiments. Some aspects of the standard are compulsory. For example, a particular file name format is required when storing structural scans. Some aspects are regulated but optional. For example, a T2 volume does not need to be included, but when it is available it should be saved under a particular file name specified in the standard.
This standard aspires to describe a majority of datasets, but acknowledges that there will be cases that do not fit. In such cases one can add files and subfolders to the existing folder structure following common sense. For example, one may want to include eye tracking data in a vendor-specific format that is not covered by this standard. The most sensible place to put it is next to the continuous recording file, with the same naming scheme but a different extension. The solutions will change from case to case, and publicly available datasets will be reviewed to include common data types in future releases of the BIDS spec.

Source vs. raw vs. derived data
-------------------------------
## Source vs. raw vs. derived data

BIDS in its current form is designed to harmonize and describe raw (unprocessed or minimally processed due to file format conversion) data. During analysis such data will be transformed and partial as well as final results will be saved. Derivatives of the raw data (other than products of DICOM to NIfTI conversion) MUST be kept separate from the raw data. This way one can protect the raw data from accidental changes by file permissions. In addition it is easy to distinguish partial results from the raw data and share the latter. Similar rules apply to source data which is defined as data before harmonization and/or file format conversion (for example E-Prime event logs or DICOM files).

@@ -52,8 +47,7 @@ This specification currently does not go into details of recommending a particul
3. We RECOMMEND including the PDF print-out with the actual sequence
parameters generated by the scanner in the `sourcedata` folder.

The Inheritance Principle
-------------------------
## The Inheritance Principle

Any metadata file (`.json`, `.bvec`, `.tsv`, etc.) may be defined at any directory level, but no more than one applicable file may be defined at a given level (Example 1). The values from the top level are inherited by all lower levels unless they are overridden by a file at the lower level. For example, `sub-*_task-rest_bold.json` may be specified at the participant level, setting TR to a specific value. If one of the runs has a different TR than the one specified in that file, another `sub-*_task-rest_bold.json` file can be placed within that specific series directory specifying the TR for that specific run.
There is no notion of "unsetting" a key/value pair. For example, if a JSON file corresponding to a particular participant/run defines a key/value pair and a JSON file at the root level of the dataset does not define it, the pair will not be "unset" for all subjects/runs.
@@ -108,8 +102,7 @@ In the above example, the fields from `task-xyz_acq-test1_bold.json` file will a
the new value will be applicable for that particular run/task NIfTI
file/s.

Extensions
----------
## Extensions

The BIDS specification can be extended in a backwards compatible way and will evolve over time. A number of extensions are currently being worked on:

@@ -140,8 +133,7 @@ When an extension reaches maturity it is merged into the main body of the specif
Guide](https://docs.google.com/document/d/1pWmEEY-1-WuwBPNy5tDAxVJYQ9Een4hZJM06tQZg8X4/edit?usp%3Dsharing&sa=D&ust=1537468908724000)
All of the ideas that are not backwards compatible and thus will have to wait for BIDS 2.0 are listed [here](https://docs.google.com/document/d/1LEgsMiisGDe1Gv-hBp1EcLmoz7AlKj6VYULUgDD3Zdw)

Citing BIDS
-----------
## Citing BIDS

When referring to BIDS in the context of academic literature please
cite:
30 changes: 7 additions & 23 deletions src/02_common_principles.md
@@ -1,8 +1,6 @@
Common principles
=================
# Common principles

The Inheritance Principle
-------------------------
## The Inheritance Principle

Any metadata file (`.json`, `.bvec`, `.tsv`, etc.) may be defined at any directory level, but no more than one applicable file may be defined at a given level (Example 1). The values from the top level are inherited by all lower levels unless they are overridden by a file at the lower level. For example, `sub-*_task-rest_bold.json` may be specified at the participant level, setting TR to a specific value. If one of the runs has a different TR than the one specified in that file, another `sub-*_task-rest_bold.json` file can be placed within that specific series directory specifying the TR for that specific run.
There is no notion of "unsetting" a key/value pair. For example, if a JSON file corresponding to a particular participant/run defines a key/value pair and a JSON file at the root level of the dataset does not define it, the pair will not be "unset" for all subjects/runs.
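
The override behaviour described above can be sketched in Python as a dictionary merge applied from the top level down. This is only an illustration under stated assumptions: the function name `resolve_metadata`, the file names, and the `TaskName` key are hypothetical, and the sketch assumes the caller already knows which files apply and in what order.

```python
import json
import tempfile
from pathlib import Path


def resolve_metadata(sidecars):
    """Merge JSON files ordered from the dataset root to the deepest level.

    Deeper files override keys set by shallower ones. A key that a file
    does not mention is left untouched, so nothing is ever "unset".
    """
    merged = {}
    for path in sidecars:
        with open(path, encoding="utf-8") as handle:
            merged.update(json.load(handle))
    return merged


# Hypothetical two-level layout, built in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    top = root / "task-rest_bold.json"
    low = root / "sub-01" / "func" / "sub-01_task-rest_bold.json"
    low.parent.mkdir(parents=True)
    top.write_text(
        json.dumps({"RepetitionTime": 2.0, "TaskName": "rest"}), encoding="utf-8"
    )
    low.write_text(json.dumps({"RepetitionTime": 3.0}), encoding="utf-8")
    metadata = resolve_metadata([top, low])

print(metadata)
```

Here the lower-level file changes `RepetitionTime` while `TaskName` survives from the top level, which mirrors the rule that an omitted key is inherited rather than removed.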
@@ -57,8 +55,7 @@ In the above example, the fields from `task-xyz_acq-test1_bold.json` file will a
the new value will be applicable for that particular run/task NIfTI
file/s.

File Formation specification
----------------------------
## File Formation specification

### Imaging files

@@ -71,7 +68,7 @@ NIfTI header.

Tabular data MUST be saved as tab delimited values (`.tsv`) files, i.e. csv files where commas are replaced by tabs. Tabs MUST be true tab characters and MUST NOT be a series of space characters. Each TSV file MUST start with a header line listing the names of all columns (with the exception of physiological and other continuous acquisition data - see below for details). Names MUST be separated with tabs. String values containing tabs MUST be escaped using double quotes. Missing and non-applicable values MUST be coded as `n/a`.
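
As a rough illustration of these rules, the following Python sketch reads tab-delimited text with the standard `csv` module and maps the `n/a` marker to `None`. The helper name `read_tsv` and the sample values are assumptions made for the example, not part of the specification.

```python
import csv
import io

# Illustrative content only: columns are separated by real tab characters.
TSV_TEXT = "onset\tduration\tresponse_time\n200\t200\tn/a\n250\t200\t0.35\n"


def read_tsv(handle):
    """Return rows as dicts keyed by the header line, mapping `n/a` to None."""
    reader = csv.DictReader(handle, delimiter="\t", quotechar='"')
    return [
        {name: (None if value == "n/a" else value) for name, value in row.items()}
        for row in reader
    ]


rows = read_tsv(io.StringIO(TSV_TEXT))
print(rows[0]["response_time"])
print(rows[1]["response_time"])
```

Values are left as strings on purpose; converting them to numbers is a per-column decision that this sketch does not attempt.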

#### Example:
Example:
```
onset duration response_time correct stop_trial go_trial
200 200 0 n/a n/a n/a
@@ -87,7 +84,7 @@ Tabular files MAY be optionally accompanied by a simple data dictionary in a JSO
| Units | Measurement units. `[<prefix symbol>] <unit symbol>` format following the SI standard is RECOMMENDED (see Appendix V). |
| TermURL | URL pointing to a formal definition of this type of data in an ontology available on the web. |

#### Example:
Example:

```JSON
{
@@ -109,13 +106,12 @@ Tabular files MAY be optionally accompanied by a simple data dictionary in a JSO
}
```

Key/value files (dictionaries)
------------------------------
## Key/value files (dictionaries)

JavaScript Object Notation (JSON) files MUST be used for storing key/value pairs. Extensive documentation of the format can be found here: [http://json.org/](http://json.org/). Several editors have built-in support for JSON syntax highlighting that aids manual creation of such files. An online editor for JSON with built-in validation is available at: [http://jsoneditoronline.org](http://jsoneditoronline.org). JSON
files MUST be in UTF-8 encoding.
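
A minimal Python sketch of checking both requirements (UTF-8 encoding and key/value content) might look like the following. The function name `load_sidecar` and the `Note` key are hypothetical and serve only to show a non-ASCII value surviving the round trip.

```python
import json


def load_sidecar(raw_bytes):
    """Decode bytes as UTF-8 and parse them as a JSON object."""
    text = raw_bytes.decode("utf-8")  # raises UnicodeDecodeError if not UTF-8
    data = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object of key/value pairs")
    return data


# Hypothetical file content, encoded the way it would be stored on disk.
raw = '{"RepetitionTime": 3, "Note": "café"}'.encode("utf-8")
sidecar = load_sidecar(raw)
print(sidecar["RepetitionTime"], sidecar["Note"])
```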

### Example:
Example:
```JSON
{
"RepetitionTime": 3,
@@ -187,15 +183,3 @@ sub-control01/
```

Additional files and folders containing raw data may be added as needed for special cases. They should be named using all lowercase with a name that reflects the nature of the scan (e.g., `calibration`). Naming of files within the directory should follow the same scheme as above (e.g., `sub-control01_calibration_Xcalibration.nii.gz`)

### Code

Template:
`code/*`

Source code of scripts that were used to prepare the dataset (for example if it was anonymized or defaced) MAY be stored here.<sup>1</sup> Extra care should be taken to avoid including original IDs or any identifiable information with the source code. There are no limitations or recommendations on the language and/or code organization of these scripts at the moment.

<sup>1</sup>Storing actual source files with the data
is preferred over links to external source repositories to maximize long
term preservation (which would suffer if an external repository would
not be available anymore).
24 changes: 16 additions & 8 deletions src/03_modality_agnostic_files.md
@@ -1,8 +1,6 @@
Modality-agnostic files
=======================
# Modality-agnostic files

Dataset description
-------------------
## Dataset description

Template: `dataset_description.json` `README` `CHANGES`

@@ -65,8 +63,7 @@ Example:
- Initial release.
```

Participants file
-----------------
## Participants file

Template:
```
@@ -119,8 +116,7 @@ correspond to individual columns.

In addition to the keys available to describe columns in all tabular files (`LongName`, `Description`, `Levels`, `Units`, and `TermURL`), the `participants.json` file as well as phenotypic files can also include column descriptions with a `Derivative` field that, when set to true, indicates that the values in the corresponding column are a transformation of values from other columns (for example a summary score based on a subset of items in a questionnaire).

Scans file
----------
## Scans file

Template:
```
@@ -141,3 +137,15 @@ filename acq_time
func/sub-control01_task-nback_bold.nii.gz 1877-06-15T13:45:30
func/sub-control01_task-motor_bold.nii.gz 1877-06-15T13:55:33
```
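
The two rows quoted above can be read with a short Python sketch such as the one below. It assumes the columns are tab separated and that `acq_time` is an ISO 8601 timestamp without a time zone, as in the sample; the helper name `read_acquisition_times` is hypothetical.

```python
import csv
import io
from datetime import datetime

# The two sample rows, with columns separated by tab characters.
SCANS_TSV = (
    "filename\tacq_time\n"
    "func/sub-control01_task-nback_bold.nii.gz\t1877-06-15T13:45:30\n"
    "func/sub-control01_task-motor_bold.nii.gz\t1877-06-15T13:55:33\n"
)


def read_acquisition_times(handle):
    """Map each filename to its acquisition time as a datetime object."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["filename"]: datetime.fromisoformat(row["acq_time"]) for row in reader
    }


times = read_acquisition_times(io.StringIO(SCANS_TSV))
ordered = sorted(times, key=times.get)  # filenames in acquisition order
gap = times[ordered[1]] - times[ordered[0]]
print(ordered[0], gap.total_seconds())
```

Parsing the timestamps, rather than comparing them as strings, makes it straightforward to order scans or compute the time elapsed between acquisitions.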

## Code

Template:
`code/*`

Source code of scripts that were used to prepare the dataset (for example if it was anonymized or defaced) MAY be stored here.<sup>1</sup> Extra care should be taken to avoid including original IDs or any identifiable information with the source code. There are no limitations or recommendations on the language and/or code organization of these scripts at the moment.

<sup>1</sup>Storing actual source files with the data
is preferred over links to external source repositories to maximize long
term preservation (which would suffer if an external repository would
not be available anymore).