Skip to content

Commit

Permalink
GH-43719: [C++] Clarify the way SIMD-enabled agg kernels come from th…
Browse files Browse the repository at this point in the history
…e same code in different compilation units (#43720)

### Rationale for this change

More than once I've been confused about how the `SimdLevel` template parameters on these kernel classes affect dispatching of kernels based on SIMD support detection at runtime [1] given that nothing in the code changes based on the parameters.

What matters is the compilation unit in which the templates are instantiated. Different compilation units get different compilation parameters. The SimdLevel parameters don't really affect the code that gets generated (!), they only serve as a way to avoid duplication of symbols in the compiled objects.

This PR organizes the code to make this more explicit.

[1] #7871 (comment)

### What changes are included in this PR?

 - Introduction of aggregate_basic-inl.h
 - Moving of the impls in `aggregate_basic-inl.h` to an anonymous namespace
 - Grouping of code based on the function they implement (`Sum`, `Mean`, and `MinMax`)

### Are these changes tested?

By the compilation process, existing tests, and benchmarks.

* GitHub Issue: #43719

Lead-authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>
  • Loading branch information
felipecrv and pitrou authored Sep 3, 2024
1 parent 57cc0b9 commit 6ce2af7
Show file tree
Hide file tree
Showing 5 changed files with 1,125 additions and 1,036 deletions.
37 changes: 21 additions & 16 deletions cpp/src/arrow/compute/kernels/aggregate_basic.cc
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,9 @@
#include "arrow/util/cpu_info.h"
#include "arrow/util/hashing.h"

#include <memory>
// Include templated definitions for aggregate kernels that must compiled here
// with the SIMD level configured for this compilation unit in the build.
#include "arrow/compute/kernels/aggregate_basic.inc.cc" // NOLINT(build/include)

namespace arrow {
namespace compute {
Expand Down Expand Up @@ -276,11 +278,6 @@ struct SumImplDefault : public SumImpl<ArrowType, SimdLevel::NONE> {
using SumImpl<ArrowType, SimdLevel::NONE>::SumImpl;
};

template <typename ArrowType>
struct MeanImplDefault : public MeanImpl<ArrowType, SimdLevel::NONE> {
using MeanImpl<ArrowType, SimdLevel::NONE>::MeanImpl;
};

Result<std::unique_ptr<KernelState>> SumInit(KernelContext* ctx,
const KernelInitArgs& args) {
SumLikeInit<SumImplDefault> visitor(
Expand All @@ -289,6 +286,14 @@ Result<std::unique_ptr<KernelState>> SumInit(KernelContext* ctx,
return visitor.Create();
}

// ----------------------------------------------------------------------
// Mean implementation

template <typename ArrowType>
struct MeanImplDefault : public MeanImpl<ArrowType, SimdLevel::NONE> {
using MeanImpl<ArrowType, SimdLevel::NONE>::MeanImpl;
};

Result<std::unique_ptr<KernelState>> MeanInit(KernelContext* ctx,
const KernelInitArgs& args) {
MeanKernelInit<MeanImplDefault> visitor(
Expand Down Expand Up @@ -482,8 +487,8 @@ void AddFirstOrLastAggKernel(ScalarAggregateFunction* func,
// ----------------------------------------------------------------------
// MinMax implementation

Result<std::unique_ptr<KernelState>> MinMaxInit(KernelContext* ctx,
const KernelInitArgs& args) {
Result<std::unique_ptr<KernelState>> MinMaxInitDefault(KernelContext* ctx,
const KernelInitArgs& args) {
ARROW_ASSIGN_OR_RAISE(TypeHolder out_type,
args.kernel->signature->out_type().Resolve(ctx, args.inputs));
MinMaxInitState<SimdLevel::NONE> visitor(
Expand Down Expand Up @@ -1114,14 +1119,14 @@ void RegisterScalarAggregateBasic(FunctionRegistry* registry) {
// Add min max function
func = std::make_shared<ScalarAggregateFunction>("min_max", Arity::Unary(), min_max_doc,
&default_scalar_aggregate_options);
AddMinMaxKernels(MinMaxInit, {null(), boolean()}, func.get());
AddMinMaxKernels(MinMaxInit, NumericTypes(), func.get());
AddMinMaxKernels(MinMaxInit, TemporalTypes(), func.get());
AddMinMaxKernels(MinMaxInit, BaseBinaryTypes(), func.get());
AddMinMaxKernel(MinMaxInit, Type::FIXED_SIZE_BINARY, func.get());
AddMinMaxKernel(MinMaxInit, Type::INTERVAL_MONTHS, func.get());
AddMinMaxKernel(MinMaxInit, Type::DECIMAL128, func.get());
AddMinMaxKernel(MinMaxInit, Type::DECIMAL256, func.get());
AddMinMaxKernels(MinMaxInitDefault, {null(), boolean()}, func.get());
AddMinMaxKernels(MinMaxInitDefault, NumericTypes(), func.get());
AddMinMaxKernels(MinMaxInitDefault, TemporalTypes(), func.get());
AddMinMaxKernels(MinMaxInitDefault, BaseBinaryTypes(), func.get());
AddMinMaxKernel(MinMaxInitDefault, Type::FIXED_SIZE_BINARY, func.get());
AddMinMaxKernel(MinMaxInitDefault, Type::INTERVAL_MONTHS, func.get());
AddMinMaxKernel(MinMaxInitDefault, Type::DECIMAL128, func.get());
AddMinMaxKernel(MinMaxInitDefault, Type::DECIMAL256, func.get());
// Add the SIMD variants for min max
#if defined(ARROW_HAVE_RUNTIME_AVX2)
if (cpu_info->IsSupported(arrow::internal::CpuInfo::AVX2)) {
Expand Down
Loading

0 comments on commit 6ce2af7

Please sign in to comment.