diff --git a/ARC specification.md b/ARC specification.md index 490dcda..79afd49 100644 --- a/ARC specification.md +++ b/ARC specification.md @@ -156,7 +156,7 @@ The `study` file MUST follow the [ISA-XLSX study file specification](ISA-XLSX.md Protocols that are necessary to describe the sample or material creating process can be placed under the protocols directory. -Further explications about data entities defined in the assay MAY be stored in [ISA-XLSX](#isa-xlsx-format) format in a `isa.datamap.xlsx` file, which SHOULD exist for each assay. Further details on `isa.datamap.xlsx` are specified [in the isa-xlsx specification](ISA-XLSX.md#datamap-file). +Further explications about data entities defined in the study MAY be stored in [ISA-XLSX](#isa-xlsx-format) format in a `isa.datamap.xlsx` file, which SHOULD exist for each study. Further details on `isa.datamap.xlsx` are specified [in the isa-xlsx specification](ISA-XLSX.md#datamap-file). ## Assay Data and Metadata diff --git a/ISA-XLSX.md b/ISA-XLSX.md index b7616a9..2e13472 100644 --- a/ISA-XLSX.md +++ b/ISA-XLSX.md @@ -692,29 +692,61 @@ In this example, there is a measurement of two `Samples`, namely `input1` and `i ## Ontology Annotations -Where a value is an `Ontology Annotation` in a table file, `Term Accession Number` and `Term Source REF` fields MUST follow the column cell in which the value is entered. These two columns SHOULD contain further ontological information about the header. In this case, following the static header string, separated by a single space, there MUST be a short ontology term identifier formatted as CURIEs (prefixed identifiers) of the form `<IDSPACE>:<LOCALID>` (specified [here](http://obofoundry.org/id-policy)) inside `()` brackets. +Where a value is an `Ontology Annotation` in an annotation table, `Term Accession Number` and `Term Source REF` columns MUST follow the main column. 
+ +An `Ontology Annotation` MAY be applied to any appropriate `Characteristic`, `Parameter`, `Factor`, `Component` or `Protocol Type`. + +This implements `Ontology Annotation` from the ISA Abstract Model. + +#### Ontology Annotation Headers + +The header of the main column MUST contain the structural column type followed by the `name` of the ontology term in `[]` brackets. +There SHOULD be a `space` between the column type and the `[` bracket. + +The headers of the two annotation columns SHOULD contain further ontological information about the ontology term of the main header. +In this case, following the static header string, separated by a single space, there MUST be a short ontology term identifier formatted as a CURIE (prefixed identifier) of the form `<IDSPACE>:<LOCALID>` (specified [here](http://obofoundry.org/id-policy)) inside `()` brackets. + +In the other case, i.e. when the annotation columns do not contain further ontological information, the static header strings MUST be followed either by a single space and empty `()` brackets or by nothing. + +#### Ontology Annotation Values + +The value in the main column MUST contain the name of the ontology term. + +The value in the `Term Source REF` column MUST either contain a short identifier for the `IDSPACE`, which identifies the ontology containing the term, or be left empty. 
+ +The value in the `Term Accession Number` column MUST either contain a value in one of the following formats or be left empty: + - `LOCALID` of the ontology, which is only applicable if the matching `IDSPACE` is given in the `Term Source REF` column + - short ontology term identifier formatted as a CURIE (prefixed identifier) of the form `<IDSPACE>:<LOCALID>` (specified [here](http://obofoundry.org/id-policy)) + - `URL` pointing to the ontology term + +#### Ontology Annotation Example + For example, a characteristic type `organism` with a value of `Homo sapiens` can be qualified with an `Ontology Annotation` of a term from NCBI Taxonomy as follows: | Characteristic [organism] | Term Source REF (OBI:0100026) | Term Accession Number (OBI:0100026) | |-----------------------------|-------------------|------------------------------------------------------| -| Homo sapiens | NCBITaxon | [http://…/NCBITAXON_9606](http://.../NCBITAXON_9606) | - -An `Ontology Annotation` MAY be applied to any appropriate `Characteristic`, `Parameter`, `Factor`, `Component` or `Protocol Type`. +| Homo sapiens | NCBITaxon | [http://…/NCBITAXON_9606](http://purl.obolibrary.org/obo/NCBITAXON_9606) | -This implements `Ontology Annotation` from the ISA Abstract Model. +> [!NOTE] +> In this example, the value in the `Term Accession Number` column is formatted as a `URL`, but shortened for the purpose of markdown-formatting. ## Unit -Where a value is numeric, a `Unit` MAY be used to qualify the quantity. In this case, following the column in which a `Unit` -is used, a `Unit` heading MUST be present, and MAY be further annotated as an [`Ontology Annotation`](#ontology-annotations). +Where a value is numeric, a `Unit` MAY be used to qualify the quantity. +In this case, the main column MUST be followed by a `Unit` column, which in turn SHOULD be further annotated as an [`Ontology Annotation`](#ontology-annotations), that is, followed by `Term Accession Number` and `Term Source REF` columns. 
+ +- The headers of the annotation columns then refer to the header of the main column. +- The values of the annotation columns then refer to the unit, and not to the numeric value of the main column. -For example, to qualify the value `300` with a `Unit` `Kelvin` qualified as an [`Ontology Annotation`](#ontology-annotations) from the Units Ontology declared -in the Ontology Sources with `UO`: +For example, in the following, the header ontology `temperature` is further qualified with the CURIE `PATO:0000146`. +The value `300` is qualified with a `Unit` `Kelvin`, which is further qualified as an [`Ontology Annotation`](#ontology-annotations) from the Units Ontology declared in the Ontology Sources with `UO`: | Parameter [temperature] | Unit | Term Source REF (PATO:0000146) | Term Accession Number (PATO:0000146) | |--------------------------------|--------|-------------------|------------------------------------------------------| -| 300 | Kelvin | UO | [http://…/obo/UO_0000012](http://.../obo/UO_0000012) | +| 300 | Kelvin | UO | [http://…/obo/UO_0000012](http://purl.obolibrary.org/obo/UO_0000012) | +> [!NOTE] +> In this example, the value in the `Term Accession Number` column is formatted as a `URL`, but shortened for the purpose of markdown-formatting. ## Characteristics @@ -728,6 +760,9 @@ For example, a characteristic type Organism with a value of Homo sapiens can be |-------------------------------|-------------------|-------------------------| | Liver | MeSH | D008099 | +> [!NOTE] +> In this example, the value in the `Term Accession Number` column is formatted as a `LOCALID`. The associated `IDSPACE` to identify the ontology term is given in the `Term Source REF` column. + ## Factors A `Factor` is an independent variable manipulated by an experimentalist with the intention to affect biological systems in a way that can be measured by an assay. 
This field holds the actual data for the `Factor` named between the square brackets (as declared in the `Study Factors` section of a top-level metadata sheet) so MUST match, for example, `Factor [compound]`. The value MUST be free text, numeric, or an [`Ontology Annotation`](#ontology-annotations). @@ -736,23 +771,33 @@ A `Factor` is an independent variable manipulated by an experimentalist with the |------------------------|-------------------|-------------------------| | Male | MeSH | D008297 | +> [!NOTE] +> In this example, the value in the `Term Accession Number` column is formatted as a `LOCALID`. The associated `IDSPACE` to identify the ontology term is given in the `Term Source REF` column. ## Components A `Component` is a consumable or reusable physical entity used in the experimental workflow. It is formatted in the pattern `Component []`. The value MUST be free text, numeric, or an [`Ontology Annotation`](#ontology-annotations). -| Component [Measurement Device] | Term Source REF (NCIT_C81182) | Term Accession Number (NCIT_C81182) | +| Component [Measurement Device] | Term Source REF (NCIT:C81182) | Term Accession Number (NCIT:C81182) | |------------------------|-------------------|-------------------------| | Illumina MiniSeq | OBI | [http://…/obo/OBI_0003114](http://purl.obolibrary.org/obo/OBI_0003114) | +> [!NOTE] +> In this example, the value in the `Term Accession Number` column is formatted as a `URL`, but shortened for the purpose of markdown-formatting. + + ## Parameters A `Parameter` can be used to specify any additional information about the experimental setup, that does not fall under the aforementioned 3 categories. It is formatted in the pattern `Parameter []`. The value MUST be free text, numeric, or an [`Ontology Annotation`](#ontology-annotations). 
-| Parameter [time] | Unit | Term Source REF (PATO_0000165) | Term Accession Number (PATO:0000165) | +| Parameter [temperature] | Unit | Term Source REF (NCRO:0000029) | Term Accession Number (NCRO:0000029) | |--------------------------------|--------|-------------------|------------------------------------------------------| | 300 | Kelvin | UO | [http://…/obo/UO_0000012](http://purl.obolibrary.org/obo/UO_0000012) | +> [!NOTE] +> In this example, the value in the `Term Accession Number` column is formatted as a `URL`, but shortened for the purpose of markdown-formatting. + + ## Comments A `Comment` can be used to provide some additional information. Columns headed with `Comment[]` MAY appear anywhere in the Annotation Table. The comment always refers to the Annotation Table. The value MUST be free text. @@ -824,6 +869,9 @@ Every `Datamap Table sheet` SHOULD contain a `Unit` column. The `Unit` adds a u |------------------------|-------------------|-------------------------| | milligram per milliliter | UO | [http://…/obo/UO_0000176](http://purl.obolibrary.org/obo/UO_0000176) | +> [!NOTE] +> In this example, the value in the `Term Accession Number` column is formatted as a `URL`, but shortened for the purpose of markdown-formatting. + ## Object Type column Every `Datamap Table sheet` SHOULD contain an `Object Type` column. The `Object Type` defines the shape or format in which the data node is represented. The value MUST be free text, or an [`Ontology Annotation`](#ontology-annotations). @@ -832,6 +880,9 @@ Every `Datamap Table sheet` SHOULD contain an `Object Type` column. The `Object |------------------------|-------------------|-------------------------| | Float | NCIT | [http://…/obo/NCIT_C48150](http://purl.obolibrary.org/obo/NCIT_C48150) | +> [!NOTE] +> In this example, the value in the `Term Accession Number` column is formatted as a `URL`, but shortened for the purpose of markdown-formatting. 
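
The notes added in this diff distinguish three formats for `Term Accession Number` values: a bare `LOCALID`, a CURIE, and a `URL`. The following is a minimal Python sketch of how a reader might tell them apart; it is not part of the specification. The function name `classify_accession` and both regular expressions are assumptions made for illustration, and real ontologies may use identifier shapes that these patterns do not cover.

```python
import re
from urllib.parse import urlparse

# Assumed CURIE shape: IDSPACE, a colon, then LOCALID (letters, digits, underscores).
CURIE_RE = re.compile(r"^[A-Za-z][A-Za-z0-9_]*:[A-Za-z0-9_]+$")
# Assumed LOCALID shape: a single token without a colon.
LOCALID_RE = re.compile(r"^[A-Za-z0-9_]+$")


def classify_accession(value: str, term_source_ref: str = "") -> str:
    """Classify a Term Accession Number cell.

    Returns one of 'empty', 'url', 'curie', 'localid', or 'invalid'.
    """
    value = value.strip()
    if not value:
        return "empty"
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    if CURIE_RE.match(value):
        return "curie"
    # A bare LOCALID is only applicable when the matching IDSPACE
    # is given in the Term Source REF column.
    if term_source_ref.strip() and LOCALID_RE.match(value):
        return "localid"
    return "invalid"


if __name__ == "__main__":
    print(classify_accession("http://purl.obolibrary.org/obo/OBI_0003114", "OBI"))
    print(classify_accession("D008099", "MeSH"))
```

Checking for a `URL` first keeps the CURIE pattern from having to exclude `http:` prefixes, and the `LOCALID` branch is reached only when `Term Source REF` is non-empty, mirroring the applicability rule in the value formats.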
+ ## Description column Every `Datamap Table sheet` SHOULD contain a `Description` column. The `Description` gives additional, human-readable context about the data node. The value MUST be free text.
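
As a companion to the header rules introduced under "Ontology Annotation Headers" in this diff, the sketch below parses a main column header and its two annotation column headers. It is a hedged illustration only: the helper names `parse_main_header` and `parse_annotation_header` are hypothetical, and the list of structural column types in the pattern is an assumption restricted to the bracketed types shown in the examples.

```python
import re
from typing import Optional, Tuple

# Main column: structural column type, an optional space (SHOULD be present),
# then the term name in [] brackets. The type list is an assumption.
MAIN_RE = re.compile(r"^(Characteristic|Parameter|Factor|Component)\s?\[(.+)\]$")

# Annotation column: static header string, optionally followed by a single
# space and () brackets that hold a CURIE or are empty.
ANNOT_RE = re.compile(r"^(Term Source REF|Term Accession Number)(?: \(([^()]*)\))?$")


def parse_main_header(header: str) -> Optional[Tuple[str, str]]:
    """Return (column type, term name), or None if the header does not match."""
    m = MAIN_RE.match(header.strip())
    return (m.group(1), m.group(2)) if m else None


def parse_annotation_header(header: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (static header string, CURIE or None), or None if no match."""
    m = ANNOT_RE.match(header.strip())
    if not m:
        return None
    # Empty () brackets and a missing bracket pair both mean "no further information".
    return (m.group(1), m.group(2) or None)


if __name__ == "__main__":
    print(parse_main_header("Parameter [temperature]"))
    print(parse_annotation_header("Term Source REF (PATO:0000146)"))
```

Treating empty `()` brackets and absent brackets identically reflects the rule that both are permitted when the annotation columns carry no further ontological information.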