Home
My reading of Appendix B of RFC9277 is that the tag content is always bstr-wrapped.
In particular, the examples in Appendix B.1 illustrate CBOR types wrapped in bstrs.
The rationale is:
"[...] a byte string is used as the type of the tag content because a media type representation in general can be any byte string."
It is unclear whether that is a requirement: the appendix contains no RFC2119 language, but all the clues (existing prose and examples) point in that direction.
Therefore, the current draft version (-08) defines the encoding of TN-derived CBOR tag content under this assumption.
By contrast, usual tags (i.e., those registered using the normal procedure) behave as defined by their spec. In most (all?) cases it seems reasonable to assume that CBOR content will be carried as-is, whilst non-CBOR content will be bstr-wrapped.
The following table summarises the current encoding rules:
Tag type | CBOR value encoding | !CBOR value encoding |
---|---|---|
TN | bstr(val) | bstr(bstr(val)) |
usual | val | bstr(val) |
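The table above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the draft: the helper names (`head`, `bstr`, `tag`, `tag_content_current`) are hypothetical, `val` is taken to be an already-encoded CBOR item when `is_cbor` is true and raw media bytes otherwise, and the tag number in the usage note is used purely as an example.

```python
def head(major: int, n: int) -> bytes:
    """Encode a CBOR head: major type plus argument (RFC 8949, Section 3)."""
    if n < 24:
        return bytes([(major << 5) | n])
    for ai, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([(major << 5) | ai]) + n.to_bytes(size, "big")
    raise ValueError("argument too large")


def bstr(data: bytes) -> bytes:
    """Wrap data in a definite-length CBOR byte string (major type 2)."""
    return head(2, len(data)) + data


def tag(number: int, content: bytes) -> bytes:
    """Prefix already-encoded content with a CBOR tag (major type 6)."""
    return head(6, number) + content


def tag_content_current(is_tn: bool, is_cbor: bool, val: bytes) -> bytes:
    """Tag content under the current rules summarised in the table."""
    # Non-CBOR values always get one bstr wrap; CBOR items are taken as-is.
    content = val if is_cbor else bstr(val)
    # TN-derived tags add one more wrap on top of that.
    return bstr(content) if is_tn else content
```

For instance, `tag(0x63740101, tag_content_current(True, False, b"hi"))` would carry the two payload bytes behind two byte-string heads, which is the "wasteful but regular" case in the TN row.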
The encoding for TN tags is wasteful, but at least it's regular.
In an email to the RATS mailing list, Carl proposes an optimisation that removes the extra bstr-wrap for non-CBOR values wrapped in TN-derived tags:
cbor-tagged-cbor<tn, fmt> = #6.<tn>(bytes .cbor fmt)
cbor-tagged-data<tn> = #6.<tn>(bytes)
which would lead to this state:
Tag type | CBOR value encoding | !CBOR value encoding |
---|---|---|
TN | bstr(val) | bstr(val) |
usual | val | bstr(val) |
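To get a feel for how much the optimisation saves, the following sketch computes the encoded size of a non-CBOR value with one and with two bstr wraps. It relies only on the standard CBOR head sizes; the function names are hypothetical and the payload sizes are arbitrary examples.

```python
def bstr_head_len(n: int) -> int:
    """Bytes needed for the head of a definite-length bstr holding n bytes."""
    if n < 24:
        return 1
    if n < 1 << 8:
        return 2
    if n < 1 << 16:
        return 3
    if n < 1 << 32:
        return 5
    return 9


def wrapped_len(payload_len: int, wraps: int) -> int:
    """Total encoded length after applying `wraps` nested bstr wraps."""
    total = payload_len
    for _ in range(wraps):
        total += bstr_head_len(total)
    return total


if __name__ == "__main__":
    for size in (16, 200, 4096):
        double = wrapped_len(size, 2)  # current TN rule: bstr(bstr(val))
        single = wrapped_len(size, 1)  # proposed TN rule: bstr(val)
        print(f"{size:5d} bytes: {double} vs {single}, saving {double - single}")
```

The saving is just the outer head, so it stays between one and a few bytes for typical payload sizes; the trade-off is mostly about regularity rather than size.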
The encoding for TN tags is slightly less wasteful, but the overall encoding becomes a bit irregular.
Dropped due to this email from Carsten
If we could instead assume that Appendix B of RFC9277 is just illustrative rather than normative and that TN tags, when used in the CMW context, behave as the CMW spec says they do, we could get the best along both dimensions (efficiency and regularity):
Tag type | CBOR value encoding | !CBOR value encoding |
---|---|---|
TN | val | bstr(val) |
usual | val | bstr(val) |
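Under this variant a receiver would need a single rule regardless of tag type. A minimal sketch of that rule follows, assuming the receiver already knows from the tag number whether the associated media type is CBOR (the `is_cbor` flag and both function names are hypothetical), and handling definite-length encodings only.

```python
def read_head(buf: bytes) -> tuple[int, int, int]:
    """Return (major type, argument, head length) for the item at buf[0]."""
    if not buf:
        raise ValueError("empty input")
    major, ai = buf[0] >> 5, buf[0] & 0x1F
    if ai < 24:
        return major, ai, 1
    if ai > 27:
        raise ValueError("indefinite or reserved encoding not handled here")
    size = 1 << (ai - 24)  # 1, 2, 4 or 8 argument bytes
    if len(buf) < 1 + size:
        raise ValueError("truncated head")
    return major, int.from_bytes(buf[1:1 + size], "big"), 1 + size


def extract_value(tag_content: bytes, is_cbor: bool) -> bytes:
    """Recover the value from tag content: CBOR as-is, non-CBOR unwrapped."""
    if is_cbor:
        return tag_content  # the encoded CBOR item itself
    major, length, hlen = read_head(tag_content)
    if major != 2 or hlen + length != len(tag_content):
        raise ValueError("expected exactly one definite-length bstr")
    return tag_content[hlen:]
```

The same function serves TN-derived and usual tags, which is the regularity this option is after.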
In (the second part of) another email to the RATS mailing list, Laurence argues that everything in the tag should be bstr-wrapped, in all cases:
Tag type | CBOR value encoding | !CBOR value encoding |
---|---|---|
TN | bstr(val) | bstr(val) |
usual | bstr(val) | bstr(val) |
This has the advantage of being regular, and equivalent to the array form of CMW, but how would that work with "usual" tags associated with CBOR content (e.g., UCCS), whose specs carry the content as-is? Should we disallow "usual" tags that are not bstr-wrapped?