Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document and improve abstract reader/writer interface #2

Closed

Conversation

sjlongland
Copy link

tinycbor fork branch thiagomacieira/dev adds support for abstract readers and writers, but the implementation had some limitations in cases where multiple CborValue cursors were iterating over the same CBOR document, causing state contamination. The interface also was not documented.

This documents the new interface and further builds upon it by slightly re-arranging CborParser members and opening the token in CborValue to use by the reader interface for any required purpose.

TSonono and others added 7 commits November 20, 2019 12:33
This method allows for writing raw data directly to the encoding buffer. This can be useful if you have something stored as CBOR encoded data.

Fixes intel#162.

Signed-off-by: Tofik Sonono <tofiksonono@msn.com>
Signed-off-by: phirsov <41143811+phirsov@users.noreply.github.com>
printf s.b. used for indentation because no newline at end is needed

Signed-off-by: phirsov <41143811+phirsov@users.noreply.github.com>
bytestring got escaped hex string literal output format to not to confuse e.g. "\x11" bytestring with 11 integer

Signed-off-by: phirsov <41143811+phirsov@users.noreply.github.com>
string got plain old C output format to avoid confusion e.g. "true" string with CBOR true value

Signed-off-by: phirsov <41143811+phirsov@users.noreply.github.com>
Signed-off-by: Mahavir Jain <mahavir@espressif.com>
@sjlongland sjlongland force-pushed the feature/WSHUB-455-chunked-codec branch from 5d24326 to 64299b9 Compare July 7, 2021 08:35
mcr and others added 22 commits September 3, 2021 11:27
…mes were unclear. containerEncoder could be the containing Encoder
Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
…nt data as single float

Motivation: half-precision floating point format is used to minimize storage
and traffic mostly.  Application level manipulates with single and double
precision usually. So, two routines added to public API to encode/decode given
single precision value in the half precision format

Signed-off-by: S.Phirsov
Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
The listing was outdated.

Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
The last version of Qt to support MSVC 2015 is no longer maintained, so
I'm skipping that. I'm therefore rededicating MSVC 2017 for 32-bit.

There's also no more no-tests build.
We don't need to compare the map lengths directly. The memcmp is
sufficient, since the source data is big endian.

Of course, verifying for sorting requires the map has known length.

Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
Because of all the inline functions, #include'ing <cbor.h> without
linking in all .c files may result in undefined symbol linker errors.

Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
QCOMPARE macro has a return in case of failure. If a test fails inside
the encodeOne function, we log that error, but were proceeding to
perform more tests (which could fail again and produce more errors).

Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
As noted in the comment, we need to be sure we don't allow a length too
big from the stream to overflow and become smaller than the number of
bytes we're looking for.

This is also the first step in creating an API that reads from something
other than a linear buffer.

Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
This commit does not change all the readers yet, this is just the first
step.

As an interesting side-effect, we ended up reading the half-float into
it->extra and need not re-read it. The cbor_value_get_half_float()
function can be inlined in a later commit.

Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
We don't need to skimp on bits in the CborValue::flags, so save the fact
that the preparser found a 64-bit number in there. This saves us from
having to re-read the descriptor byte again in
_cbor_value_decode_int64_internal().

Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
It was originally non-inline because I had thought of doing conversions
from half-float to float and onwards to double, but I never actually
made that in the API. Instead, even the get_float() and get_double(), we
only memcpy anyway and leave it up to the upper layer to convert, as
needed.

This change triggered an use-when-uninitialised false positive warning
that I needed to work around in the validator.

Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
The extract_length() function is only used in string context, so rename
accordingly.

Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
The extract_length() was the only case that called
extract_number_and_advance() without verifying we had a proper number. This
commit reworks the implementation so extract_number_and_advance() reuses
the number previously read by preparse_value(), and extract_length() gets
inlined to the only place that uses it.

Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
thiagomacieira and others added 23 commits September 3, 2021 13:08
Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
Instead of consuming it at the end of the last element of a map or array
of unknown length. This allows us to obtain the pointer to or offset of
the Break byte.

Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
We need to re-parse if the input buffer was too short to read the
current element's information. When that happens, the current element
will be CborInvalidType, so we can't easily resume.

Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
Instead of just one function (_cbor_value_get_string_chunk), we now have
_cbor_value_begin_string_iteration, _cbor_value_finish_string_iteration,
_cbor_value_get_string_chunk_size, and _cbor_value_get_string_chunk.

The "begin" function positions the pointer at the first chunk. That's
what makes "get_size" possible, since it doesn't need to check for any
state. The "finish" funcntion allows the caller to distinguish an error
parsing the string from an error parsing the next value.

Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
Signed-off-by: Dmitry Shachnev <mitya57@gmail.com>
Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
Instead of just one function (_cbor_value_get_string_chunk), we now have
_cbor_value_begin_string_iteration, _cbor_value_finish_string_iteration,
_cbor_value_get_string_chunk_size, and _cbor_value_get_string_chunk.

The "begin" function positions the pointer at the first chunk. That's
what makes "get_size" possible, since it doesn't need to check for any
state. The "finish" funcntion allows the caller to distinguish an error
parsing the string from an error parsing the next value.

Signed-off-by: Thiago Macieira <thiago.macieira@intel.com>
`cbor_parser_init` and `cbor_parser_init_reader` are substantially
similar, however the latter misses clearing out `it->flags`, leaving it
uninitialised so possibly unsafe.

Rather than copying & pasting that from `cbor_parser_init`, lets just
use one routine that does the "common" part, then each routine can focus
on the specifics needed.
Describe the input parameters for the function and how they are used as
best we understand from on-paper analysis of the C code.
The `token` parameter is not sufficient since it is effectively shared
by all `CborValue` instances.  Since `tinycbor` often uses a temporary
`CborValue` context to perform some operation, we need to store our
context inside that `CborValue` so that we don't pollute the global
state of the reader.
In its place, put an arbitrary `void *` pointer for reader context.  The
reader needs to store some context information which is specific to the
`CborParser` instance it is serving.  Right now, `CborValue::source::token`
serves this purpose, but the problem is that we also need a
per-`CborValue` context and have nowhere to put it.

Better to spend an extra pointer (4 bytes on 32-bit platforms) in the
`CborParser` (which there'll be just one of), then to do it in the
`CborValue` (which there may be several of) or to use a `CborReader`
object that itself carries two pointers (`ops` and the context, thus
we'd need an extra 3 pointers).
We simplify this reader in two ways:
1. we remove the `consumed` member of `struct Input`, and instead use
   the `CborValue`'s `source.token` member, which we treat as an
   unsigned integer offset into our `QByteArray`.
2. we replace the reader-specific `struct Input` with the `QByteArray`
   it was wrapping, since that's the only thing now contained in our
   `struct Input`.

If a `CborValue` gets cloned, the pointer referred to by `source.token`
similarly gets cloned, thus when we advance the pointer on the clone, it
leaves the original alone, so computing the length of unknown-length
entities in the CBOR document can be done safely.
What is not known, is what the significance is of
`CborEncoderAppendType`.  It basically tells the writer the nature of
the data being written, but the default implementation ignores this and
just blindly appends it no matter what.

That raises the question of why it's important enough that the writer
function needs to know about it.
This reads a CBOR file piece-wise, seeking backward and forward through
the file if needed.  Some seeking can be avoided by tuning the block
size used in reads so that the read window shifts by smaller amounts.
@sjlongland
Copy link
Author

This pull request is superseded by intel#208

@sjlongland sjlongland closed this Sep 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants