Describe the bug
The current code lacks sanity checks on metadata, which makes it vulnerable to corrupted files. A few examples:
- There are a number of `i32 as u32` casts that do not check whether the `i32` is negative. Unfortunately, these `u32` values may be used to guide buffer allocations (e.g., here) when reading data.
- `read_records` does not validate the `levels_read` value returned from `read_rep_levels`. A corrupted file may cause `read_rep_levels` to return 0, which can lead to an infinite loop.
To Reproduce
Use the `read_parquet` example to read the two bad files from here. Reading `bad-dict-page-header.parquet` may give an EOF error, but internally the library has already called `Vec::reserve(N)`, where N comes from a negative `i32`. Reading `bad-levels.parquet` simply gets stuck in an infinite loop.
Expected behavior
Examine the metadata and return proper errors
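To illustrate the kind of checks meant here, below is a minimal standalone sketch. It is not the library's code: `MetadataError`, `checked_len`, and `read_all_levels` are hypothetical names made up for this example, and the closure stands in for whatever actually reads a batch of levels. The first helper rejects a negative length instead of casting it, and the second returns an error when a read makes no progress instead of looping forever.

```rust
use std::convert::TryFrom;

/// Hypothetical error type for this sketch; not the library's own error type.
#[derive(Debug, PartialEq)]
enum MetadataError {
    /// A length field in the metadata was negative.
    NegativeLength(i32),
    /// A read returned zero items before the expected total was reached.
    NoProgress { levels_read: usize },
}

/// Convert a length field taken from file metadata into a `usize`.
/// Unlike `raw as u32`, a negative value is reported as an error
/// rather than wrapping around to a huge allocation size.
fn checked_len(raw: i32) -> Result<usize, MetadataError> {
    usize::try_from(raw).map_err(|_| MetadataError::NegativeLength(raw))
}

/// Read `total` levels through `read_batch`, a stand-in for the real reader
/// that takes the number of levels still wanted and returns how many it read.
/// If it returns 0 before `total` is reached, fail instead of spinning.
fn read_all_levels<F>(total: usize, mut read_batch: F) -> Result<usize, MetadataError>
where
    F: FnMut(usize) -> usize,
{
    let mut levels_read = 0;
    while levels_read < total {
        let remaining = total - levels_read;
        let n = read_batch(remaining);
        if n == 0 {
            return Err(MetadataError::NoProgress { levels_read });
        }
        // Clamp in case the reader claims more than was asked for.
        levels_read += n.min(remaining);
    }
    Ok(levels_read)
}
```

With these helpers, `checked_len(-1)` returns `Err(MetadataError::NegativeLength(-1))`, whereas `-1i32 as u32` silently becomes `u32::MAX`. Likewise, `read_all_levels(10, |_| 0)` returns a `NoProgress` error right away instead of hanging.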
Additional context