Skip to content

Commit

Permalink
Tidy up list
Browse files Browse the repository at this point in the history
Signed-off-by: Arthit Suriyawongkul <arthit@gmail.com>
  • Loading branch information
bact committed Aug 9, 2024
1 parent e6e7e52 commit c786c82
Showing 1 changed file with 58 additions and 28 deletions.
86 changes: 58 additions & 28 deletions docs/serializations.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,39 +2,69 @@

## 4.1 Overview <a name="4.1"></a>

This specification defines the data model of the SPDX standard, describing every piece of information about systems with software components. The data model is based on the Resource Description Framework (RDF) extensible knowledge representation data model, which provides a flexible and extensible way to represent and exchange information.

The data may be serialized in a variety of formats for storage and transmission.

## 4.2 RDF Serialization <a name="4.2"></a>

Since the data model is based on RDF, any SPDX data can be serialized in any of the multiple RDF serialization formats, including but not limited to:

- JSON-LD format as defined in [JSON-LD 1.1](https://www.w3.org/TR/json-ld11/);
- Turtle (Terse RDF Triple Language) format as defined in [RDF 1.1 Turtle](https://www.w3.org/TR/turtle/);
- N-Triples format as defined in [RDF 1.1 N-Triples](https://www.w3.org/TR/n-triples/); and
- RDF/XML format as defined in [RDF 1.1 XML Syntax](https://www.w3.org/TR/rdf-syntax-grammar/).

The SPDX specification is accompanied by a [JSON-LD context](https://www.w3.org/TR/json-ld11/#the-context) definition file that can be used to serialize SPDX in a much simpler and more human-readable JSON-LD format.
This specification defines the data model of the SPDX standard, describing
every piece of information about systems with software components. The data
model is based on the Resource Description Framework (RDF) extensible
knowledge representation data model, which provides a flexible and extensible
way to represent and exchange information.

The data may be serialized in a variety of formats for storage and
transmission.

## 4.2 RDF serialization <a name="4.2"></a>

Since the data model is based on RDF, any SPDX data can be serialized in any of
the multiple RDF serialization formats, including but not limited to:

- JSON-LD format as defined in
[JSON-LD 1.1](https://www.w3.org/TR/json-ld11/);
- Turtle (Terse RDF Triple Language) format as defined in
[RDF 1.1 Turtle](https://www.w3.org/TR/turtle/);
- N-Triples format as defined in
[RDF 1.1 N-Triples](https://www.w3.org/TR/n-triples/); and
- RDF/XML format as defined in
[RDF 1.1 XML Syntax](https://www.w3.org/TR/rdf-syntax-grammar/).

The SPDX specification is accompanied by a
[JSON-LD context](https://www.w3.org/TR/json-ld11/#the-context) definition file
that can be used to serialize SPDX in a much simpler and more human-readable
JSON-LD format.

## 4.3 Canonical serialization <a name="4.3"></a>

Canonical serialization is single, consistent, normalized, deterministic, and reproducible form.
Canonical serialization is a single, consistent, normalized, deterministic, and
reproducible form.

Such a canonical form normalizes things like ordering and formatting.

The content of the canonical serialization is exactly the same as the JSON-LD serialization of RDF data (see 4.2), just represented in a consistent way.

Canonical serialization is in JSON format, as defined in RFC 8259 (IETF STD 90), with the following additional characteristics:

- no line breaks
- key names MUST be wrapped in double quotes
- no whitespace outside of strings
- true, false and null: the literal names must be lowercase; no other literal names are allowed
- integers: represented in base 10 using decimal digits. This designates an integer component that may be prefixed with an optional minus sign. Leading zeros are not allowed.
- strings: UTF-8 representation without specific canonicalisation. A string begins and ends with quotation marks (%x22). Any Unicode characters may be placed within the quotation marks, except for the two characters that MUST be escaped by a reverse solidus: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).
- arrays: An array structure is represented as square brackets surrounding zero or more items. Items are separated by commas.
- objects: An object structure is represented as a pair of curly brackets surrounding zero or more name/value pairs (or members). A name is a string containing only ASCII characters (0x21-0x7F). The names within an object must be unique. A single colon comes after each name, separating the name from the value. A single comma separates a value from a following name. The name/value pairs are ordered by name.
The content of the canonical serialization is exactly the same as the JSON-LD
serialization of RDF data (see 4.2), just represented in a consistent way.

Canonical serialization is in JSON format, as defined in
[RFC 8259 (IETF STD 90)](https://www.rfc-editor.org/info/rfc8259),
with the following additional characteristics:

- No line breaks
- Key names MUST be wrapped in double quotes
- No whitespace outside of strings
- `true`, `false` and `null`: the literal names must be lowercase; no other
literal names are allowed
- Integers: represented in base 10 using decimal digits. This designates an
integer component that may be prefixed with an optional minus sign.
Leading zeros are not allowed.
- Strings: UTF-8 representation without specific canonicalisation. A string
begins and ends with quotation marks (%x22). Any Unicode characters may be
placed within the quotation marks, except for the two characters that MUST be
escaped by a reverse solidus: quotation mark, reverse solidus, and the
control characters (U+0000 through U+001F).
- Arrays: An array structure is represented as square brackets surrounding zero
or more items. Items are separated by commas.
- Objects: An object structure is represented as a pair of curly brackets
surrounding zero or more name/value pairs (or members). A name is a string
containing only ASCII characters (0x21-0x7F). The names within an object must
be unique. A single colon comes after each name, separating the name from the
value. A single comma separates a value from a following name. The name/value
pairs are ordered by name.

## 4.4 Serialization information <a name="4.4"></a>

Expand All @@ -51,7 +81,7 @@ more serialized forms of itself.

When serializing a physical SpdxDocument, any property of the logical element
that can be natively represented within the chosen serialization format
(e.g., @context prefixes in JSON-LD instead of the namespaceMap) may utilize
(e.g., `@context` prefixes in JSON-LD instead of the namespaceMap) may utilize
these native mechanisms. All remaining properties shall be serialized within
the SpdxDocument element itself.

Expand Down

0 comments on commit c786c82

Please sign in to comment.