Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix sequence compression API in Explicit Delimiter mode #3023

Merged
merged 2 commits into from
Jan 26, 2022

Conversation

Cyan4973
Copy link
Contributor

@Cyan4973 Cyan4973 commented Jan 22, 2022

In Explicit Delimiter mode, the Sequence compression API was only working when
the frame has only a single block
or when all blocks are full (except the last one).

This fix makes is possible to select block boundaries wherever the caller wants them.
It still requires that data conforms to zstd specification, aka block sizes still have a maximum.

The fuzzer capability has been extended to test the explicit delimiter mode too.

@Cyan4973 Cyan4973 force-pushed the fix_seqCompress_withDelimiter branch 2 times, most recently from 0f2816a to 55f5946 Compare January 24, 2022 17:50
@terrelln
Copy link
Contributor

Looks like theres an assert in the tests/fuzz/sequence_compression_api.c fuzzer that needs to be fixed.

Copy link
Contributor

@terrelln terrelln left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once the tests pass

@@ -27,7 +27,7 @@ const char* ERR_getErrorString(ERR_enum code)
case PREFIX(version_unsupported): return "Version not supported";
case PREFIX(frameParameter_unsupported): return "Unsupported frame parameter";
case PREFIX(frameParameter_windowTooLarge): return "Frame requires too much memory for decoding";
case PREFIX(corruption_detected): return "Corrupted block detected";
case PREFIX(corruption_detected): return "Input corruption detected";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Maybe "Data corruption detected"? It feels more natural than "Input corruption"

@Cyan4973 Cyan4973 force-pushed the fix_seqCompress_withDelimiter branch from 55f5946 to 87fb8a5 Compare January 25, 2022 21:34
cctx->blockSize, remaining,
inSeqs, inSeqsSize, seqPos);
U32 const lastBlock = (blockSize == remaining);
assert(blockSize <= remaining);
Copy link
Contributor

@terrelln terrelln Jan 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this assert is failing.

Maybe it just needs to be moved below the FORWARD_IF_ERROR() check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's exactly the issue,
blockSize can be an error code, so this assert() is only valid after the FORWARD_IF_ERROR().

@Cyan4973 Cyan4973 force-pushed the fix_seqCompress_withDelimiter branch 2 times, most recently from c89e7c8 to 6096d33 Compare January 26, 2022 07:33
add explicit delimiter mode to libfuzzer test
@Cyan4973 Cyan4973 force-pushed the fix_seqCompress_withDelimiter branch from 6096d33 to fc2ea97 Compare January 26, 2022 08:29
@Cyan4973
Copy link
Contributor Author

The misplaced assert() was the only issue affecting the library.
The bulk of the work (and changes) was in the fuzz tester itself, in order to make it compatible with the Explicit Delimiter mode. As a nice side-effect, it allowed observing that the library was correctly rejecting all sort of bad conditions, from incorrect sequences, to invalid blocks, to incomplete frames, etc.
So I think this is finally good to go.

@Cyan4973 Cyan4973 merged commit a0acf9a into dev Jan 26, 2022
@Cyan4973 Cyan4973 deleted the fix_seqCompress_withDelimiter branch January 13, 2023 04:28
@Cyan4973 Cyan4973 mentioned this pull request Feb 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants