Merge pull request #4079 from elasota/truncated-huff-state-error
Throw error if Huffman weight initial states are truncated
Cyan4973 authored Jun 30, 2024
2 parents 4fe0ba0 + 0938308 commit 3de0541
Showing 4 changed files with 26 additions and 0 deletions.
20 changes: 20 additions & 0 deletions doc/decompressor_permissive.md
@@ -18,6 +18,26 @@ This document lists a few known cases where invalid data was formerly accepted
by the decoder, and what has changed since.


Truncated Huffman states
------------------------

**Last affected version**: v1.5.6

**Produced by the reference compressor**: No

**Example Frame**: `28b5 2ffd 0000 5500 0072 8001 0420 7e1f 02aa 00`

When using FSE-compressed Huffman weights, the compressed weight bitstream
could contain fewer bits than necessary to decode the initial states.

The reference decompressor up to v1.5.6 will decode truncated or missing
initial states as zero, which can result in a valid Huffman tree if only
the second state is truncated.

In newer versions, truncated initial states are reported as a corruption
error by the decoder.
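
To see where the weight bitstream sits in the example frame above, the headers can be walked by hand. The following is a minimal sketch, not code from the zstd library: `FrameWalk` and `walk_frame` are hypothetical names, and the function only handles the layout this particular frame uses (a `Frame_Header_Descriptor` of 0, one compressed block, and a single-stream compressed literals section with a 3-byte header).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields read from the example frame. Illustrative struct, not a zstd type. */
typedef struct {
    int      last_block;     /* Last_Block flag */
    unsigned block_type;     /* 2 = Compressed_Block */
    uint32_t block_size;     /* Block_Size in bytes */
    unsigned literals_type;  /* 2 = Compressed_Literals_Block */
    unsigned regen_size;     /* Regenerated_Size */
    unsigned comp_size;      /* Compressed_Size, tree description included */
    unsigned weights_header; /* < 128: that many bytes of FSE-compressed weights */
} FrameWalk;

static const uint8_t kExampleFrame[] = {
    0x28, 0xb5, 0x2f, 0xfd, 0x00, 0x00, 0x55, 0x00, 0x00, 0x72,
    0x80, 0x01, 0x04, 0x20, 0x7e, 0x1f, 0x02, 0xaa, 0x00
};

/* Returns 0 on success, -1 if the input does not match the assumed layout. */
static int walk_frame(const uint8_t *f, size_t n, FrameWalk *out)
{
    const uint8_t *p;
    uint32_t bh, lh;

    assert(out != NULL);
    if (n < 13) return -1;
    /* Magic number 0xFD2FB528, stored little-endian. */
    if (f[0] != 0x28 || f[1] != 0xb5 || f[2] != 0x2f || f[3] != 0xfd) return -1;
    /* Assumption: descriptor 0, so only a Window_Descriptor byte follows. */
    if (f[4] != 0x00) return -1;

    /* 3-byte little-endian Block_Header. */
    p = f + 6;
    bh = (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
    out->last_block = (int)(bh & 1);
    out->block_type = (bh >> 1) & 3;
    out->block_size = bh >> 3;
    p += 3;
    if (out->block_type != 2 || out->block_size < 4) return -1;
    if ((size_t)(f + n - p) < out->block_size) return -1;

    /* 3-byte Literals_Section_Header, valid for Size_Format 0 only. */
    lh = (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
    out->literals_type = lh & 3;
    if (out->literals_type != 2 || ((lh >> 2) & 3) != 0) return -1;
    out->regen_size = (lh >> 4) & 0x3FF;
    out->comp_size  = (lh >> 14) & 0x3FF;

    /* First byte of the Huffman tree description. */
    out->weights_header = p[3];
    return 0;
}
```

Calling `walk_frame(kExampleFrame, sizeof kExampleFrame, &w)` yields a last, compressed block of 10 bytes whose literals section regenerates 7 bytes from 6 compressed bytes. The tree description header is `0x04`, so the four bytes `20 7e 1f 02` hold the FSE table description followed by the weight bitstream that this entry is about.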


Offset == 0
-----------

4 changes: 4 additions & 0 deletions doc/zstd_compression_format.md
@@ -1362,6 +1362,10 @@ symbols for each of the final states are decoded and the process is complete.
If this process would produce more weights than the maximum number of decoded
weights (255), then the data is considered corrupted.

If either of the 2 initial states are absent or truncated, then the data is
considered corrupted. Consequently, it is not possible to encode fewer than
2 weights using this mode.
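
The rule above comes down to a bit count: the backward bitstream must hold at least `2 * Accuracy_Log` bits below its end marker. The sketch below models that count in isolation. It assumes the bitstream bytes have already been separated from the FSE table description, and `high_bit`, `payload_bits` and `initial_states_truncated` are hypothetical helpers, not functions from the reference decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Index (0..7) of the highest set bit of a non-zero byte. */
static unsigned high_bit(uint8_t v)
{
    unsigned n = 0;
    assert(v != 0);
    while (v >>= 1) n++;
    return n;
}

/* Number of payload bits in a backward bitstream: every bit below the
 * end marker, which is the highest set bit of the last byte.
 * Returns -1 for an empty stream or a last byte of zero (no marker). */
static long payload_bits(const uint8_t *src, size_t size)
{
    if (size == 0 || src[size - 1] == 0) return -1;
    return (long)(8 * (size - 1)) + (long)high_bit(src[size - 1]);
}

/* Returns 1 if two initial states of accuracy_log bits each cannot
 * both be read in full, 0 otherwise. */
static int initial_states_truncated(const uint8_t *src, size_t size,
                                    unsigned accuracy_log)
{
    long avail = payload_bits(src, size);
    if (avail < 0) return 1;
    return avail < 2L * (long)accuracy_log;
}
```

With an accuracy log of 5, the two-byte stream `ff 03` carries 9 payload bits, so the first state is complete and the second is one bit short. This is the case that older decoders filled with zero bits and that is now reported as corruption.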

#### Conversion from weights to Huffman prefix codes

All present symbols shall now have a `Weight` value.
2 changes: 2 additions & 0 deletions lib/common/fse_decompress.c
@@ -190,6 +190,8 @@ FORCE_INLINE_TEMPLATE size_t FSE_decompress_usingDTable_generic(
FSE_initDState(&state1, &bitD, dt);
FSE_initDState(&state2, &bitD, dt);

RETURN_ERROR_IF(BIT_reloadDStream(&bitD)==BIT_DStream_overflow, corruption_detected, "");

#define FSE_GETSYMBOL(statePtr) fast ? FSE_decodeSymbolFast(statePtr, &bitD) : FSE_decodeSymbol(statePtr, &bitD)

/* 4 symbols per loop */
Binary file not shown.
