Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reenable huge pages for arrow host copying #17097

Merged

Conversation

vyasr
Copy link
Contributor

@vyasr vyasr commented Oct 15, 2024

Description

It is unclear whether the performance gains here are entirely from huge pages themselves or whether invoking madvise with huge pages is primarily serving to trigger an eager population of the pages (huge or not). We attempted to provide alternate flags to madvise like MADV_WILLNEED and that was not sufficient to recover performance, so either huge pages themselves are doing something special or specifying huge pages is causing madvise to trigger a page migration that no other flag does. In any case, this change returns us to the performance before the switch to the C data interface, and this code is lifted straight out of our old implementation so I am comfortable making use of it and knowing that it is not problematic. We should explore further optimizations in this direction, though.

Resolves #17075.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@vyasr vyasr added bug Something isn't working libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change labels Oct 15, 2024
@vyasr vyasr self-assigned this Oct 15, 2024
@vyasr vyasr requested a review from a team as a code owner October 15, 2024 23:07
@vyasr vyasr requested review from harrism and bdice October 15, 2024 23:07
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks fine (as it matches a previous code path). Can we add this interop to our benchmark suite?

Copy link
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@vyasr
Copy link
Contributor Author

vyasr commented Oct 16, 2024

Code looks fine (as it matches a previous code path). Can we add this interop to our benchmark suite?

Yes, I'll open an issue.

@vyasr
Copy link
Contributor Author

vyasr commented Oct 16, 2024

/merge

@rapids-bot rapids-bot bot merged commit f1cbbcc into rapidsai:branch-24.12 Oct 16, 2024
102 checks passed
@vyasr vyasr deleted the fix/arrow_interop_perf_clean branch October 16, 2024 19:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Large performance regression for cuDF / PyArrow conversions (to_arrow in particular)
3 participants