Fix Operator outputBatchRows may overflow #10868

jinchengchenghh · 2024-08-28T01:33:45Z

The computation in outputBatchRows() may overflow; this PR fixes it. It also changes the relevant output batch size configs from uint32_t to vector_size_t (int32_t), because the RowVector numRows type is vector_size_t.
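To illustrate the kind of truncation described above, here is a minimal standalone sketch. It is not the Velox code: `narrowRows` and `clampedRows` are hypothetical helpers, and `vector_size_t` is aliased to `int32_t` as stated in the description. The first helper mirrors the shape of the `std::max<uint32_t>(...)` expression this PR removes from velox/exec/Operator.cpp, where a 64-bit quotient is narrowed to 32 bits; the second saturates at the `vector_size_t` maximum instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Assumption: matches the alias named in the PR description.
using vector_size_t = int32_t;

// Hypothetical helper: narrows the 64-bit quotient to 32 bits, so the
// high bits are dropped when the quotient does not fit.
// Assumes rowSize != 0.
inline uint32_t narrowRows(uint64_t batchBytes, uint64_t rowSize) {
  const uint32_t truncated = static_cast<uint32_t>(batchBytes / rowSize);
  return std::max<uint32_t>(truncated, 1u);
}

// Hypothetical helper: keeps the quotient in 64 bits and saturates at the
// largest vector_size_t before converting. Assumes rowSize != 0.
inline vector_size_t clampedRows(uint64_t batchBytes, uint64_t rowSize) {
  const uint64_t rows = batchBytes / rowSize;
  constexpr uint64_t kMaxRows =
      static_cast<uint64_t>(std::numeric_limits<vector_size_t>::max());
  if (rows > kMaxRows) {
    return std::numeric_limits<vector_size_t>::max();
  }
  return std::max<vector_size_t>(static_cast<vector_size_t>(rows), 1);
}
```

With a 4 GiB byte budget and 1-byte rows, the narrowing helper wraps to 0 and then returns 1, while the clamped helper returns the `vector_size_t` maximum.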

netlify · 2024-08-28T01:34:01Z

✅ Deploy Preview for meta-velox canceled.

Name	Link
🔨 Latest commit	`fafcd0c`
🔍 Latest deploy log	https://app.netlify.com/sites/meta-velox/deploys/66d14ffc3d9777000886eafd

jinchengchenghh · 2024-08-28T05:27:24Z

@mbasmanova Can you help review this PR? Thanks!

xiaoxmeng

@jinchengchenghh thanks for the change.

xiaoxmeng · 2024-08-28T19:57:04Z

velox/core/QueryConfig.h

-  uint32_t preferredOutputBatchRows() const {
-    return get<uint32_t>(kPreferredOutputBatchRows, 1024);
+  int32_t preferredOutputBatchRows() const {
+    uint32_t batchRows = get<uint32_t>(kPreferredOutputBatchRows, 1024);


xiaoxmeng · 2024-08-28T19:57:09Z

velox/core/QueryConfig.h

-  uint32_t maxOutputBatchRows() const {
-    return get<uint32_t>(kMaxOutputBatchRows, 10'000);
+  int32_t maxOutputBatchRows() const {
+    uint32_t maxBatchRows = get<uint32_t>(kMaxOutputBatchRows, 10'000);


xiaoxmeng · 2024-08-28T19:59:34Z

velox/dwio/common/SortingWriter.cpp

  if (sortBuffer_->estimateOutputRowSize().has_value() &&
      sortBuffer_->estimateOutputRowSize().value() != 0) {
-    estimatedMaxOutputRows =
+    uint64_t maxOutputRows =


xiaoxmeng · 2024-08-28T20:01:52Z

velox/exec/Operator.cpp

+  if (UNLIKELY(batchSize > std::numeric_limits<vector_size_t>::max())) {
+    return std::numeric_limits<vector_size_t>::max();
+  }
+  return std::max<vector_size_t>(


const uint64_t batchSize = return std::max<vector_size_t>(batchSize, 1);

xiaoxmeng · 2024-08-28T20:05:47Z

velox/exec/Operator.cpp

  if (!averageRowSize.has_value()) {
    return queryConfig.preferredOutputBatchRows();
  }

  const uint64_t rowSize = averageRowSize.value();
-
-  if (rowSize * queryConfig.maxOutputBatchRows() <
+  uint64_t batchBytes;


if (queryConfig.preferredOutputBatchBytes() / rowSize >
    queryConfig.maxOutputBatchRows()) {
  return queryConfig.maxOutputBatchRows();
}
return outputBatchRowsByBytes(queryConfig, rowSize);
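The suggestion above compares a quotient against the row limit rather than a product against the byte limit, so no intermediate value can exceed the byte budget. A minimal sketch of that pattern follows. It makes several assumptions: `BatchConfig` is a hypothetical stand-in for the query config, `outputBatchRowsSketch` is not the merged function, the row limits are assumed positive, and returning the preferred row count for a zero row size is an assumed policy, not something this thread confirms.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>

// Assumption: matches the alias named in the PR description.
using vector_size_t = int32_t;

// Hypothetical stand-in for the three config values discussed in the review.
struct BatchConfig {
  uint64_t preferredOutputBatchBytes;
  vector_size_t preferredOutputBatchRows;
  vector_size_t maxOutputBatchRows; // assumed to be positive
};

inline vector_size_t outputBatchRowsSketch(
    const BatchConfig& config,
    std::optional<uint64_t> averageRowSize) {
  // Assumed policy: fall back to the preferred row count when the row size
  // is unknown or zero, which also avoids dividing by zero below.
  if (!averageRowSize.has_value() || averageRowSize.value() == 0) {
    return config.preferredOutputBatchRows;
  }
  const uint64_t rowSize = averageRowSize.value();
  // Divide instead of multiplying: the quotient never exceeds the byte budget.
  const uint64_t rowsByBytes = config.preferredOutputBatchBytes / rowSize;
  if (rowsByBytes > static_cast<uint64_t>(config.maxOutputBatchRows)) {
    return config.maxOutputBatchRows;
  }
  // rowsByBytes <= maxOutputBatchRows here, so the conversion cannot overflow.
  return std::max<vector_size_t>(static_cast<vector_size_t>(rowsByBytes), 1);
}
```

For a budget of 1000 bytes, 10 preferred rows, and a 100-row limit, 1-byte rows hit the row limit, 50-byte rows give 20 rows, and rows larger than the budget still produce a batch of one row.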

xiaoxmeng · 2024-08-29T07:02:47Z

velox/exec/Operator.cpp

    return queryConfig.maxOutputBatchRows();
  }
-  return std::max<uint32_t>(
-      queryConfig.preferredOutputBatchBytes() / rowSize, 1);
+  return std::max<uint64_t>(batchSize, 1);


return std::max<vector_size_t>(batchSize, 1);

xiaoxmeng

@jinchengchenghh LGTM % minors. Thanks!

xiaoxmeng · 2024-08-29T07:05:17Z

velox/exec/tests/OperatorUtilsTest.cpp

+TEST_F(OperatorUtilsTest, outputBatchRows) {
+  RowTypePtr rowType = ROW({"c0"}, {INTEGER()});
+  {
+    setBatchConfig(10, 20, 234);


s/setBatchConfig/setTaskOutputBatchConfig/

xiaoxmeng

@jinchengchenghh LGTM. Thanks!

mbasmanova

@jinchengchenghh Thank you for the fix.

@xiaoxmeng Thank you for reviewing.

velox/core/QueryConfig.h

xiaoxmeng

@jinchengchenghh there is a test failure in CI. Thanks!

facebook-github-bot · 2024-08-30T04:57:04Z

@xiaoxmeng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2024-08-30T17:00:10Z

@xiaoxmeng merged this pull request in 4499332.

conbench-facebook · 2024-08-30T17:43:03Z

Conbench analyzed the 1 benchmark run on commit 4499332b.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

Summary:
The computation of function outputBatchRows() may overflow, fix it. And refactor the relevant output batch size config from uint32_t to vector_size_t(int32_t) because the RowVector numRows type is vector_size_t.

Pull Request resolved: facebookincubator#10868

Reviewed By: gggrace14

Differential Revision: D62013297

Pulled By: xiaoxmeng

fbshipit-source-id: 087b603967ff3666624e8d4c8b1a23c6130846f9

facebook-github-bot added the CLA Signed label Aug 28, 2024

mbasmanova requested review from xiaoxmeng and Yuhta August 28, 2024 07:17

xiaoxmeng reviewed Aug 28, 2024

xiaoxmeng mentioned this pull request Aug 28, 2024

Fix SortBuffer batchSize computation overflow #10848

Closed

xiaoxmeng reviewed Aug 29, 2024

xiaoxmeng approved these changes Aug 29, 2024

xiaoxmeng requested a review from mbasmanova August 29, 2024 16:44

mbasmanova approved these changes Aug 29, 2024

xiaoxmeng reviewed Aug 30, 2024

jinchengchenghh force-pushed the rowoverflow branch from aca6bc5 to fafcd0c Compare August 30, 2024 04:52

jinchengchenghh added 6 commits August 30, 2024 12:08

Fix Operator outputBatchRows may overflow

5e848d8

address comments

b52f19b

address comments

a1314cb

address comments

1bd5b98

address comments

5a6c4e8

fix integer divide by zero

fafcd0c

facebook-github-bot closed this in 4499332 Aug 30, 2024

facebook-github-bot added the Merged label Aug 30, 2024
