Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refine the code structure for percentile rank #13

Open
wants to merge 2 commits into
base: percentile-rank-kernel
Choose a base branch
from

Conversation

zanmato1984
Copy link

@zanmato1984 zanmato1984 commented Jan 17, 2025

This refinement is based on facts that:

  1. "Sort and mark duplicate" ONLY concerns:
    1. The shape of the input (array vs chunked array);
    2. The input data type and physical type;
    3. If duplicates are needed (can be simply deduced from the function options).
  2. "Create rankings" ONLY concerns:
    1. The NullPartitionResult generated by "sort and mark duplicate";
    2. The ranking type (ordinal vs percentile).

Therefore we can decouple these two operations from the current Ranker, and only combine them in the concrete RankXXXMetaFunctions, which themselves can be CRTP-ed.

This refinement:

  1. Maintains the functionality and performance;
  2. Totally avoids virtual functions;
  3. Doesn't introduce more code explosion than before;
  4. Doesn't increase the complexity (IMHO it might actually reduce the complexity a bit).

Copy link

Thanks for opening a pull request!

If this is not a minor PR. Could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

See also:


Result<NullPartitionResult> Run() {
RETURN_NOT_OK(physical_type_->Accept(this));
return std::move(sorted_);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this std::move is needed because of the implicit construction of Result.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant