GH-32570: [C++] Fix the issue of ExecBatchBuilder when appending consecutive tail rows with the same id may exceed buffer boundary #39234

Merged: 3 commits merged into apache:main on Dec 21, 2023

zanmato1984
Contributor

@zanmato1984 zanmato1984 commented Dec 14, 2023

Rationale for this change

Addressed in #32570 (comment)

What changes are included in this PR?

  1. When calculating the number of tail rows to skip while appending to ExecBatchBuilder, skip consecutive rows with the same id together.
  2. Fix a bug where the column offset was ignored when calculating the rows to skip.
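
The two changes above can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `NumTailRowsToSkip`, its parameters, and the use of `std::vector` are hypothetical names chosen for the example. It assumes a variable-length column described by an `int32_t` offsets array, a non-decreasing list of row ids, and a fast copy path that may read up to `num_tail_bytes_to_skip` bytes past the end of a row.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not Arrow's API): returns how many trailing entries of
// `row_ids` should be left to a slow, bounds-safe copy path so that at least
// `num_tail_bytes_to_skip` bytes of distinct rows follow the remaining ones.
int NumTailRowsToSkip(const std::vector<int32_t>& offsets, int column_offset,
                      const std::vector<uint16_t>& row_ids,
                      int num_tail_bytes_to_skip) {
  // Assumption: ids are given in non-decreasing order.
  assert(std::is_sorted(row_ids.begin(), row_ids.end()));

  // Honor the column offset: logical row i spans
  // [row_offsets[i], row_offsets[i + 1]).
  const int32_t* row_offsets = offsets.data() + column_offset;

  int num_rows_left = static_cast<int>(row_ids.size());
  int num_bytes_skipped = 0;
  while (num_rows_left > 0 && num_bytes_skipped < num_tail_bytes_to_skip) {
    const int last_row_id = row_ids[num_rows_left - 1];
    // Count the bytes of this source row once...
    num_bytes_skipped += row_offsets[last_row_id + 1] - row_offsets[last_row_id];
    // ...and drop it together with every consecutive duplicate of the same id.
    do {
      --num_rows_left;
    } while (num_rows_left > 0 && row_ids[num_rows_left - 1] == last_row_id);
  }
  return static_cast<int>(row_ids.size()) - num_rows_left;
}
```

For example, with three 4-byte rows (`offsets = {0, 4, 8, 12}`), `row_ids = {0, 2, 2, 2}` and 8 tail bytes to skip, the sketch skips all four entries. A variant that counted the bytes of row 2 once per occurrence would stop after two entries and leave a copy of row 2, the last row in the buffer, on the fast path.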

Are these changes tested?

Yes. A new unit test is included, and the change is also covered by the existing test case mentioned in the issue.

Are there any user-facing changes?

No.

This PR contains a "Critical Fix".

#32570 is labeled critical and causes a crash even when the API contract is upheld.

…ows with the same id may exceed buffer boundary

⚠️ GitHub issue #32570 has been automatically assigned in GitHub to PR creator.

@zanmato1984
Contributor Author

@github-actions crossbow submit test-ubuntu-20.04-cpp

Revision: 8edc1d1

Submitted crossbow builds: ursacomputing/crossbow @ actions-d19754bb45

Task Status
test-ubuntu-20.04-cpp GitHub Actions

@zanmato1984
Contributor Author

@github-actions crossbow submit test-macos-12-cpp

Unable to match any tasks for `test-macos-12-cpp`
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/7225815149

@zanmato1984
Contributor Author

@github-actions crossbow submit test-macos-cpp

Unable to match any tasks for `test-macos-cpp`
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/7225892739

@zanmato1984
Contributor Author

@github-actions crossbow submit cpp

Unable to match any tasks for `cpp`
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/7226222452

@zanmato1984
Contributor Author

@github-actions crossbow submit test

Unable to match any tasks for `test`
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/7226270964

@zanmato1984
Contributor Author

@github-actions crossbow submit -g cpp

Revision: 8edc1d1

Submitted crossbow builds: ursacomputing/crossbow @ actions-ac27092881

Task Status
test-alpine-linux-cpp GitHub Actions
test-build-cpp-fuzz GitHub Actions
test-conda-cpp GitHub Actions
test-conda-cpp-valgrind Azure
test-cuda-cpp GitHub Actions
test-debian-11-cpp-amd64 GitHub Actions
test-debian-11-cpp-i386 GitHub Actions
test-fedora-35-cpp GitHub Actions
test-ubuntu-20.04-cpp GitHub Actions
test-ubuntu-20.04-cpp-20 GitHub Actions
test-ubuntu-20.04-cpp-bundled GitHub Actions
test-ubuntu-20.04-cpp-minimal-with-formats GitHub Actions
test-ubuntu-20.04-cpp-thread-sanitizer GitHub Actions
test-ubuntu-22.04-cpp GitHub Actions

@assignUser assignUser requested a review from pitrou December 15, 2023 22:30
@assignUser assignUser added the "Critical Fix" label (Bugfixes for security vulnerabilities, crashes, or invalid data.) Dec 15, 2023
@zanmato1984
Contributor Author

I wanted to retry the CI because the failures don't seem to be related to my change. I didn't know how, so I tried some commands, but obviously they didn't do what I wanted :(

@pitrou
Member

pitrou commented Dec 20, 2023

cc @bkietz

@github-actions github-actions bot added the "awaiting committer review" label and removed the "awaiting review" label Dec 21, 2023
@pitrou pitrou merged commit 2abb3fb into apache:main Dec 21, 2023
37 checks passed
@pitrou pitrou removed the "awaiting committer review" label Dec 21, 2023

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 2abb3fb.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 2 possible false positives for unstable benchmarks that are known to sometimes produce them.

pitrou pushed a commit that referenced this pull request Jan 17, 2024
…ecutive tail rows with the same id may exceed buffer boundary (for fixed size types) (#39585)

### Rationale for this change

#39583 is a follow-up issue to #32570 (fixed by #39234). The previous fix only covered variable-length types. It turns out that fixed-size types have the same issue.

### What changes are included in this PR?

Apply the same fix as #39234 to fixed-size types.
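
As a rough illustration of what the fixed-size counterpart could look like, here is a minimal sketch under stated assumptions, not the code in #39585: `NumTailRowsToSkipFixed` and its parameters are hypothetical, every row is assumed to occupy `fixed_length_bytes` bytes (bit-packed types are left out), and row ids are assumed to be non-decreasing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not Arrow's API): same idea as for variable-length
// columns, except that every distinct row contributes a constant byte count.
int NumTailRowsToSkipFixed(int fixed_length_bytes,
                           const std::vector<uint16_t>& row_ids,
                           int num_tail_bytes_to_skip) {
  // Assumptions: byte-sized values and non-decreasing ids.
  assert(fixed_length_bytes > 0);
  assert(std::is_sorted(row_ids.begin(), row_ids.end()));

  int num_rows_left = static_cast<int>(row_ids.size());
  int num_bytes_skipped = 0;
  while (num_rows_left > 0 && num_bytes_skipped < num_tail_bytes_to_skip) {
    const int last_row_id = row_ids[num_rows_left - 1];
    // One distinct source row accounts for fixed_length_bytes, however many
    // times its id is repeated at the tail.
    num_bytes_skipped += fixed_length_bytes;
    do {
      --num_rows_left;
    } while (num_rows_left > 0 && row_ids[num_rows_left - 1] == last_row_id);
  }
  return static_cast<int>(row_ids.size()) - num_rows_left;
}
```

With 4-byte values, `row_ids = {5, 5, 5}` and 4 tail bytes to skip, the sketch skips all three entries, because they all refer to the same source row.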

### Are these changes tested?

Yes. A unit test is included.

### Are there any user-facing changes?

* Closes: #39583

Authored-by: zanmato1984 <zanmato1984@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
idailylife pushed a commit to idailylife/arrow that referenced this pull request Jan 18, 2024
…g consecutive tail rows with the same id may exceed buffer boundary (for fixed size types) (apache#39585)

clayburn pushed a commit to clayburn/arrow that referenced this pull request Jan 23, 2024
…ing consecutive tail rows with the same id may exceed buffer boundary (apache#39234)

### Rationale for this change

Addressed in apache#32570 (comment)

### What changes are included in this PR?

1. When calculating the number of tail rows to skip while appending to `ExecBatchBuilder`, skip consecutive rows with the same id together.
2. Fix a bug where the column offset was ignored when calculating the rows to skip.

### Are these changes tested?

Yes. A new unit test is included, and the change is also covered by the existing test case mentioned in the issue.

### Are there any user-facing changes?

No.

**This PR contains a "Critical Fix".**

apache#32570 is labeled critical and causes a crash even when the API contract is upheld.

* Closes: apache#32570

Authored-by: zanmato <zanmato1984@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
clayburn pushed a commit to clayburn/arrow that referenced this pull request Jan 23, 2024
…g consecutive tail rows with the same id may exceed buffer boundary (for fixed size types) (apache#39585)

dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…ing consecutive tail rows with the same id may exceed buffer boundary (apache#39234)

dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…g consecutive tail rows with the same id may exceed buffer boundary (for fixed size types) (apache#39585)

raulcd pushed a commit that referenced this pull request Feb 20, 2024
…ecutive tail rows with the same id may exceed buffer boundary (for fixed size types) (#39585)

zanmato1984 added a commit to zanmato1984/arrow that referenced this pull request Feb 28, 2024
…g consecutive tail rows with the same id may exceed buffer boundary (for fixed size types) (apache#39585)

thisisnic pushed a commit to thisisnic/arrow that referenced this pull request Mar 8, 2024
…g consecutive tail rows with the same id may exceed buffer boundary (for fixed size types) (apache#39585)

Labels
Component: C++ Critical Fix Bugfixes for security vulnerabilities, crashes, or invalid data.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[C++] Segmentation fault on arrow-compute-hash-join-node-test on macos nightlies
3 participants