Thomas Fossati edited this page Oct 14, 2024 · 8 revisions

CBOR tag content encoding

My reading of Appendix B of RFC 9277 is that the tag content is always bstr-wrapped.

In particular, the examples in Appendix B.1 illustrate CBOR types wrapped in bstrs.

The rationale is:

"[...] a byte string is used as the type of the tag content because a media type representation in general can be any byte string."

It is unclear whether that is a requirement: the appendix contains no RFC 2119 language, but all the clues (existing prose and examples) point in that direction.

Therefore, the current draft version (-08) defines the encoding of TN-derived CBOR tag content under that assumption.

By contrast, usual tags (i.e., those registered using the normal procedure) behave as defined by their specification. In most (all?) cases, it seems reasonable to assume that CBOR content will be carried as-is, whilst non-CBOR content will be bstr-wrapped.

The following table summarises the current encoding rules:

| Tag type | CBOR value encoding | !CBOR value encoding |
|----------|---------------------|----------------------|
| TN       | bstr(val)           | bstr(bstr(val))      |
| usual    | val                 | bstr(val)            |
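
As a rough illustration of the table above, the following Python sketch builds the tag content for each of the four cases. The helper names (`bstr`, `current_tag_content`) are made up for this page and do not come from the draft. `val` is assumed to be either already-encoded CBOR or raw non-CBOR bytes.

```python
def bstr(data: bytes) -> bytes:
    """Wrap data in a CBOR byte string (major type 2)."""
    n = len(data)
    if n < 24:
        return bytes([0x40 | n]) + data
    for ai, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([0x40 | ai]) + n.to_bytes(size, "big") + data
    raise ValueError("payload too large for a CBOR byte string")


def current_tag_content(tag_type: str, is_cbor: bool, val: bytes) -> bytes:
    """Tag content under the current (-08) rules, as read from the table.

    val is encoded CBOR when is_cbor is True, raw bytes otherwise.
    """
    if tag_type == "TN":
        # Always bstr-wrapped; non-CBOR values end up wrapped twice.
        return bstr(val) if is_cbor else bstr(bstr(val))
    if tag_type == "usual":
        # CBOR goes in as-is; non-CBOR is wrapped once.
        return val if is_cbor else bstr(val)
    raise ValueError(f"unknown tag type: {tag_type}")


empty_map = b"\xa0"  # CBOR encoding of {}
raw = b"hi"          # stand-in for a non-CBOR payload

print(current_tag_content("TN", True, empty_map).hex())     # 41a0
print(current_tag_content("TN", False, raw).hex())          # 43426869
print(current_tag_content("usual", True, empty_map).hex())  # a0
print(current_tag_content("usual", False, raw).hex())       # 426869
```

The second line shows the waste: the two-byte payload picks up two byte-string heads instead of one.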

The encoding for TN tags is wasteful, but at least it's regular.

Carl's proposal

In an email to the RATS mailing list, Carl proposes an optimisation, which removes the extra bstr-wrap for non-CBOR values that are wrapped in TN-derived tags:

cbor-tagged-cbor<tn, fmt> = #6.<tn>(bytes .cbor fmt)
cbor-tagged-data<tn> = #6.<tn>(bytes)

which would lead to this state:

| Tag type | CBOR value encoding | !CBOR value encoding |
|----------|---------------------|----------------------|
| TN       | bstr(val)           | bstr(val)            |
| usual    | val                 | bstr(val)            |

The encoding for TN tags is slightly less wasteful, but the overall encoding becomes a bit irregular.
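
To make the saving concrete, here is a minimal Python sketch of the two CDDL rules in Carl's proposal, compared against the double wrap of the current rules. The function names mirror the CDDL rule names but are otherwise hypothetical, and the tag number is an arbitrary placeholder that is not derived from any content format. The sketch does not check that the bytes passed to `cbor_tagged_cbor` are valid CBOR matching `fmt`, which is what `.cbor` expresses in CDDL.

```python
def head(major: int, arg: int) -> bytes:
    """Encode a CBOR head: major type plus unsigned argument."""
    if arg < 24:
        return bytes([major << 5 | arg])
    for ai, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < 1 << (8 * size):
            return bytes([major << 5 | ai]) + arg.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def bstr(data: bytes) -> bytes:
    """CBOR byte string (major type 2)."""
    return head(2, len(data)) + data


def tag(number: int, content: bytes) -> bytes:
    """CBOR tag (major type 6) around already-encoded content."""
    return head(6, number) + content


def cbor_tagged_cbor(tn: int, encoded_cbor: bytes) -> bytes:
    """#6.<tn>(bytes .cbor fmt): encoded CBOR, wrapped once."""
    return tag(tn, bstr(encoded_cbor))


def cbor_tagged_data(tn: int, data: bytes) -> bytes:
    """#6.<tn>(bytes): non-CBOR data, also wrapped once."""
    return tag(tn, bstr(data))


PLACEHOLDER_TN = 1234567  # arbitrary, for illustration only

proposed = cbor_tagged_data(PLACEHOLDER_TN, b"hi")
current = tag(PLACEHOLDER_TN, bstr(bstr(b"hi")))  # double wrap in -08
print(proposed.hex())                # da0012d687426869
print(len(current) - len(proposed))  # 1
```

For payloads of 24 bytes or more the saved byte-string head is larger than one byte, so the difference grows slightly with payload size.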

Ideal

Dropped due to this email from Carsten

If we could instead assume that Appendix B of RFC 9277 is illustrative rather than normative, and that TN tags, when used in the CMW context, behave as the CMW spec says they do, we could get the best along both dimensions (efficiency and regularity):

| Tag type | CBOR value encoding | !CBOR value encoding |
|----------|---------------------|----------------------|
| TN       | val                 | bstr(val)            |
| usual    | val                 | bstr(val)            |

Laurence's take

In (the second part of) another email to the RATS mailing list, Laurence argues that everything in the tag should be bstr-wrapped, in all cases:

| Tag type | CBOR value encoding | !CBOR value encoding |
|----------|---------------------|----------------------|
| TN       | bstr(val)           | bstr(val)            |
| usual    | bstr(val)           | bstr(val)            |

This has the advantage of being regular, and equivalent to the array form of CMW, but how would that work with "usual" tags associated with CBOR content (e.g., UCCS)? Should we disallow "usual" tags that are not bstr-wrapped?
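
For a side-by-side view, the four tables on this page can be reduced to a number of bstr wraps per cell. The Python sketch below does that and reports the byte overhead for a given payload. The variant labels and helper names are invented for this comparison. It measures size only and takes no position on which option is preferable.

```python
def bstr(data: bytes) -> bytes:
    """Wrap data in a CBOR byte string (major type 2)."""
    n = len(data)
    if n < 24:
        return bytes([0x40 | n]) + data
    for ai, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([0x40 | ai]) + n.to_bytes(size, "big") + data
    raise ValueError("payload too large for a CBOR byte string")


# Number of bstr wraps per (tag type, value is CBOR), one entry per table.
VARIANTS = {
    "current (-08)": {("TN", True): 1, ("TN", False): 2,
                      ("usual", True): 0, ("usual", False): 1},
    "Carl":          {("TN", True): 1, ("TN", False): 1,
                      ("usual", True): 0, ("usual", False): 1},
    "ideal":         {("TN", True): 0, ("TN", False): 1,
                      ("usual", True): 0, ("usual", False): 1},
    "Laurence":      {("TN", True): 1, ("TN", False): 1,
                      ("usual", True): 1, ("usual", False): 1},
}


def wrap(val: bytes, depth: int) -> bytes:
    """Apply bstr() to val depth times."""
    for _ in range(depth):
        val = bstr(val)
    return val


def overhead(variant: str, tag_type: str, is_cbor: bool, val: bytes) -> int:
    """Bytes added to val by the wrapping rules of the given variant."""
    depth = VARIANTS[variant][(tag_type, is_cbor)]
    return len(wrap(val, depth)) - len(val)


payload = b"x" * 100  # stand-in for a 100-byte non-CBOR value
for name in VARIANTS:
    print(name, overhead(name, "TN", False, payload))
```

With a 100-byte payload, each wrap costs two bytes, so the current rules add four bytes for a non-CBOR value under a TN tag and the other three variants add two.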
