WARC 1.1: Introduce record-id BNF grammar rule for consistency with examples #24

Closed
13 changes: 7 additions & 6 deletions specifications/warc-format/warc-1.1/index.md
@@ -339,7 +339,8 @@ LWS = [CRLF] 1*( SP | HT ) ; semantics same as
quoted-string = ( <"> *(qdtext | quoted-pair ) <"> )
qdtext = <any TEXT except <">>
quoted-pair = "\" CHAR ; single-character quoting
-uri = "<" <'URI' per RFC3986> ">"
+record-id = "<" uri ">"
+uri = <'URI' per RFC3986>
~~~

Although UTF-8 characters are allowed, the 'encoded-word' mechanism of
@@ -432,7 +433,7 @@ conforms (e.g., via a URI scheme prefix such as "http:" or "urn:").
Care should be taken to ensure that this value is written with no
internal whitespace.

-WARC-Record-ID = "WARC-Record-ID" ":" uri
+WARC-Record-ID = "WARC-Record-ID" ":" record-id

All records shall have a WARC-Record-ID field.

@@ -519,7 +520,7 @@ automatically gathered by a retrieval against a single target-URI; for
example, it might be represented by a 'response' or 'revisit' record
plus its associated 'request' record.

-WARC-Concurrent-To = "WARC-Concurrent-To" ":" uri
+WARC-Concurrent-To = "WARC-Concurrent-To" ":" record-id

This field may be used to associate records of types 'request',
'response', 'resource', 'metadata', and 'revisit' with one another when
@@ -601,7 +602,7 @@ WARC-Refers-To
The WARC-Record-ID of a single record for which the present record holds
additional content.

-WARC-Refers-To = "WARC-Refers-To" ":" uri
+WARC-Refers-To = "WARC-Refers-To" ":" record-id

The WARC-Refers-To field may be used to associate a 'metadata' record to
another record it describes. The WARC-Refers-To field may also be used
@@ -667,7 +668,7 @@ such as after distributing single records into separate WARC files. WARC
writing applications (such web crawlers) may choose to always record
this parameter.

-WARC-Warcinfo-ID = "WARC-Warcinfo-ID" ":" uri
+WARC-Warcinfo-ID = "WARC-Warcinfo-ID" ":" record-id

The WARC-Warcinfo-ID field value overrides any association with a
previously occurring (in the WARC) 'warcinfo' record, thus providing a
@@ -743,7 +744,7 @@ Identifies the starting record in a series of segmented records whose
content blocks are reassembled to obtain a logically complete content
block.

-WARC-Segment-Origin-ID = "WARC-Segment-Origin-ID" ":" uri
+WARC-Segment-Origin-ID = "WARC-Segment-Origin-ID" ":" record-id

This field is mandatory on all 'continuation' records, and shall not be
used in other records. See the section below, Record segmentation, for
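
For illustration only, and not part of this change: a minimal Python sketch of how a reader or writer might handle a field value under the `record-id = "<" uri ">"` rule. The helper names `parse_record_id` and `format_record_id` are hypothetical, and the URI check is deliberately loose (it only requires a scheme prefix and no internal whitespace), so it should not be taken as full RFC 3986 validation.

```python
import re
from urllib.parse import urlsplit

# Hypothetical helpers for illustration; not defined by the WARC specification.
# A record-id is a URI wrapped in angle brackets, with no internal whitespace.
_RECORD_ID_RE = re.compile(r"<([^<>\s]+)>")


def parse_record_id(value: str) -> str:
    """Return the bare URI from a field value of the form "<" uri ">"."""
    match = _RECORD_ID_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not a record-id: {value!r}")
    uri = match.group(1)
    # Loose check (an assumption of this sketch): require a scheme such as
    # "urn:" or "http:" rather than validating the full RFC 3986 grammar.
    if not urlsplit(uri).scheme:
        raise ValueError(f"record-id has no URI scheme: {value!r}")
    return uri


def format_record_id(uri: str) -> str:
    """Wrap a bare URI in angle brackets to form a record-id."""
    return f"<{uri}>"
```

With this split, fields such as WARC-Record-ID and WARC-Refers-To would go through `parse_record_id`, while a bare `uri` value would be read without bracket handling.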