Describe the bug
concat, used by concat_batches, does not appear to allocate sufficient capacities when constructing the MutableArrayData. Concatenating records that contain lists of structs results in the following panic:
assertion failed: total_len <= bit_len
thread 'concat_test' panicked at 'assertion failed: total_len <= bit_len', /Users/x/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-buffer-40.0.0/src/buffer/boolean.rs:55:9
stack backtrace:
0: rust_begin_unwind
at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:579:5
1: core::panicking::panic_fmt
at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/panicking.rs:64:14
2: core::panicking::panic
at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/panicking.rs:114:5
3: arrow_buffer::buffer::boolean::BooleanBuffer::new
at /Users/x/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-buffer-40.0.0/src/buffer/boolean.rs:55:9
4: arrow_data::transform::_MutableArrayData::freeze::{{closure}}
at /Users/x/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-data-40.0.0/src/transform/mod.rs:81:25
5: core::bool::<impl bool>::then
at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/bool.rs:71:24
6: arrow_data::transform::_MutableArrayData::freeze
at /Users/x/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-data-40.0.0/src/transform/mod.rs:80:21
7: arrow_data::transform::MutableArrayData::freeze
at /Users/x/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-data-40.0.0/src/transform/mod.rs:656:18
8: arrow_data::transform::_MutableArrayData::freeze
at /Users/x/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-data-40.0.0/src/transform/mod.rs:74:37
9: arrow_data::transform::MutableArrayData::freeze
at /Users/x/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-data-40.0.0/src/transform/mod.rs:656:18
10: arrow_data::transform::_MutableArrayData::freeze
at /Users/x/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-data-40.0.0/src/transform/mod.rs:74:37
11: arrow_data::transform::MutableArrayData::freeze
at /Users/x/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-data-40.0.0/src/transform/mod.rs:656:18
12: arrow_data::transform::_MutableArrayData::freeze
at /Users/x/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-data-40.0.0/src/transform/mod.rs:74:37
13: arrow_data::transform::MutableArrayData::freeze
at /Users/x/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-data-40.0.0/src/transform/mod.rs:656:18
To Reproduce
Call concat_batches with RecordBatchs that contain lists of structs (on the order of 20–50 structs in the list per RecordBatch). If I modify the capacity calculation in concat to add a constant amount of extra capacity for lists, the error does not occur:
let capacity = match d {
    DataType::Utf8 => binary_capacity::<Utf8Type>(arrays),
    DataType::LargeUtf8 => binary_capacity::<LargeUtf8Type>(arrays),
    DataType::Binary => binary_capacity::<BinaryType>(arrays),
    DataType::LargeBinary => binary_capacity::<LargeBinaryType>(arrays),
    DataType::List(_) => {
        Capacities::Array(arrays.iter().map(|a| a.len()).sum::<usize>() + 500) // <- 500 added here
    }
    _ => Capacities::Array(arrays.iter().map(|a| a.len()).sum()),
};
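For readers unfamiliar with the list layout, here is a minimal, std-only sketch of why a capacity based on a.len() can be much smaller than the number of child elements in a list column. ListColumn, outer_capacity and child_capacity are hypothetical names for illustration, not arrow-rs APIs, and this does not establish that the mismatch is the cause of the panic.

```rust
/// Hypothetical, simplified list column: `offsets` delimits each list's
/// slice of a flat child array, in the style of Arrow's list layout.
/// This is an illustration only, not an arrow-rs type.
pub struct ListColumn {
    pub offsets: Vec<usize>,
}

impl ListColumn {
    /// Number of lists (rows) in the column; this is what `a.len()` counts.
    pub fn len(&self) -> usize {
        self.offsets.len().saturating_sub(1)
    }

    /// Number of child elements (e.g. structs) referenced by the column.
    pub fn child_len(&self) -> usize {
        match (self.offsets.first(), self.offsets.last()) {
            (Some(first), Some(last)) => last - first,
            _ => 0,
        }
    }
}

/// Capacity computed the way the snippet above does it: sum of row counts.
pub fn outer_capacity(columns: &[ListColumn]) -> usize {
    columns.iter().map(|c| c.len()).sum()
}

/// Capacity the flat child array would need: sum of child element counts.
pub fn child_capacity(columns: &[ListColumn]) -> usize {
    columns.iter().map(|c| c.child_len()).sum()
}
```

With two columns whose offsets are [0, 20, 45] and [0, 30], outer_capacity counts 3 rows while child_capacity counts 75 child elements, which may be why a flat +500 happens to hide the problem at this data size.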
Expected behavior
No panics when concatenating RecordBatchs with lists.
Additional context
Reproduced with Arrow versions 37–40.
joshg-ec changed the title from "concat_batches fails with total_len <= bit_len assertion for records with lists" to "concat_batches panics with total_len <= bit_len assertion for records with lists" on May 31, 2023
Possibly related to #1230. The error suggests that the validity buffer is not the correct length. I'll take a look in a bit; MutableArrayData is overdue for some TLC in this regard (#1225).
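To make the "validity buffer is not the correct length" reading concrete, here is a small std-only model of the kind of check that the assertion total_len <= bit_len expresses. The names total_len and bit_len come from the panic message; how they are derived (offset plus length in bits, versus buffer bytes times eight) is an assumption, and bitmap_fits and bytes_for_bits are hypothetical helpers, not the arrow-buffer implementation.

```rust
/// Simplified stand-in for a validity-bitmap length check. NOT the
/// arrow-buffer code: the derivation of `total_len` and `bit_len`
/// below is an assumption based on the panic message.
pub fn bitmap_fits(buffer_len_bytes: usize, offset: usize, len: usize) -> bool {
    // Bits the array claims to cover, starting at `offset`.
    let total_len = offset.saturating_add(len);
    // Bits the underlying byte buffer can actually hold.
    let bit_len = buffer_len_bytes.saturating_mul(8);
    total_len <= bit_len
}

/// Smallest number of bytes that can hold `bits` validity bits.
pub fn bytes_for_bits(bits: usize) -> usize {
    bits / 8 + usize::from(bits % 8 != 0)
}
```

Under this model, a bitmap sized for 20 entries (3 bytes, 24 bits) cannot back an array that reports 50 entries, so bitmap_fits(bytes_for_bits(20), 0, 50) is false, while a bitmap sized for 50 entries (7 bytes) passes.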