Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set numRows for the ColumnBatch created in GpuBringBackToHost #11365

Merged
merged 5 commits into from
Aug 20, 2024

Conversation

jihoonson
Copy link
Collaborator

Fixes #11364.

Signed-off-by: Jihoon Son <ghoonson@gmail.com>
@sameerz sameerz added the bug Something isn't working label Aug 19, 2024
@jihoonson
Copy link
Collaborator Author

build

Copy link
Member

@jlowe jlowe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, but a quick grep of the source shows there's some other cases where this is done incorrectly, e.g.:

  • TestingV2Source
  • CurrentBatchIterator
  • GpuBroadcastNestedLoopJoinExecBase (empty batch, probably OK)
  • GpuPartitionerExpr
  • GpuBloomFilterMightContain (empty batch, probably OK)
  • ImplicitTestsSuite
  • RapidsShuffleTestHelper

If not handling here, we should track these for followup.

@jihoonson
Copy link
Collaborator Author

jihoonson commented Aug 20, 2024

@jlowe I did have checked all other places except for those in the unit test assuming they are fine. AFAIT, those places in production that use the constructor without the row count have two patterns.

  1. They create an empty ColumnBatch. The row count is defaulted to 0, which is legit in this case.
  2. They set the row count after the ColumnBatch is created, such as in CurrentBatchIterator. I think it's better to use a "safer" constructor that explicitly takes the row count, but it is not a bug.

I think the real issue here is that the ColumnBatch constructor API is not safe as it is easy for human to make mistakes. An approach to solve that is forbidding the use of those unsafe APIs and adding safer wrappers around them as I proposed in here.

@jihoonson
Copy link
Collaborator Author

build

@jihoonson
Copy link
Collaborator Author

I think the real issue here is that the ColumnBatch constructor API is not safe as it is easy for human to make mistakes. An approach to solve that is forbidding the use of those unsafe APIs and adding safer wrappers around them as I proposed in here.

Filed the suggestion in #11369.

@jihoonson jihoonson merged commit 9fcd8c7 into NVIDIA:branch-24.10 Aug 20, 2024
43 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Missing numRows in the ColumnarBatch created in GpuBringBackToHost
3 participants