Handling of short Decimal types from Arrow to Velox causing potential data correctness issue? #9412
-
ContextI am working on converting decimal types from Apache Arrow's Decimal128 (which uses 128 bits for all decimals) to Velox's representation, where Decimals are categorized into short (int64_t) and long (int128_t) decimals. I encountered a piece of code that extends the short decimal from 64 bits to 128 bits when transferring data from Arrow's memory format to Velox. // https://github.com/facebookincubator/velox/blob/main/velox/vector/arrow/Bridge.cpp#L428
if (type.isShortDecimal()) {
rows.apply([&](vector_size_t i) {
int128_t value = buf.as<int64_t>()[i];
memcpy(dst + (j++) * sizeof(int128_t), &value, sizeof(int128_t));
}); This code expands 64-bit decimals to 128-bit and stores them in Arrow's memory format. However, for reading the data back from ArrowArray to a Velox's vector, the code seems to use: // https://github.com/facebookincubator/velox/blob/main/velox/vector/arrow/Bridge.cpp#L1769
auto values = wrapInBufferView(
arrowArray.buffers[1], arrowArray.length * type->cppSizeInBytes()); Where cppSizeInBytes() refers to the Arrow type's byte size, which should be 64 bits for short decimals. Questions
|
Beta Was this translation helpful? Give feedback.
Replies: 2 comments
-
@zdx19981006 I see that ArrowArray to VeloxVector for short decimals is handled here https://github.com/facebookincubator/velox/blob/main/velox/vector/arrow/Bridge.cpp#L1738
The maximum precision of a short decimal is 18. The max decimal value with precision 18 |
Beta Was this translation helpful? Give feedback.
-
Hi @majetideepak, I just wanted to express my gratitude for addressing the issue I encountered. I've seen the pull request #8957 and am thrilled to learn the bug has been fixed in the recent updates. Thanks again for your swift response and the fantastic work! Best regards |
Beta Was this translation helpful? Give feedback.
@zdx19981006 I see that ArrowArray to VeloxVector for short decimals is handled here https://github.com/facebookincubator/velox/blob/main/velox/vector/arrow/Bridge.cpp#L1738
createShortDecimalVector
converts a 128-bit value to a 64-bit value.The maximum precision of a short decimal is 18. The max decimal value with precision 18
999.. 99
will fit in a 64 bit value.If the ArrowArray has pre…