-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add companionFunction to function metadata #9250
Conversation
✅ Deploy Preview for meta-velox canceled.
|
bfea9ca
to
3137659
Compare
@pramodsatya : How do we take care that we get only Prestissimo functions from the registry ? If there is a conflict between a Prestissimo and Spark function with the same name, then how do we disambiguate ? |
Thanks for the feedback @aditi-pandit, @czentgr. We are not checking that only Prestissimo functions are retrieved from the registry. Instead we rely on the Prestissimo worker to have registered only the presto functions, such as in this function, so there are no Spark functions in the registry. Please let me know if this is fine or whether we should have an additional way to distinguish between Presto and Spark functions in the registry. |
c2011b1
to
8a28ec9
Compare
8a28ec9
to
460e5db
Compare
460e5db
to
3eea7ab
Compare
Hi @aditi-pandit, @czentgr, could you please take another look at this PR? |
3eea7ab
to
29290d4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback @aditi-pandit, addressed the comments. Could you please take another look?
29290d4
to
8b822aa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pramodsatya
8b822aa
to
4f11b42
Compare
4f11b42
to
3013349
Compare
0b51694
to
0dccd1c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, looks good now.
@@ -413,10 +423,11 @@ bool CompanionFunctionsRegistrar::registerMergeExtractFunction( | |||
|
|||
auto mergeExtractFunctionName = | |||
CompanionSignatures::mergeExtractFunctionName(name); | |||
return registerAggregateFunction( | |||
return registerMergeExtractFunctionImpl( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rename to registerMergeExtractFunctionInternal
.
@pramodsatya. @czentgr : This is a reasonable solution as well, if we are okay exposing companion function concept to the services using Velox. With the other API to return all functions with a boolean parameter to say 'skipInternalFunctions' a single function could be used to skip all companion or any other internal functions added by the Velox framework. @mbasmanova : wdyt ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pramodsatya Pramod, would you update PR description to explain this change?
0dccd1c
to
64e6cc8
Compare
Apologies @mbasmanova, since this PR went through many iterations I had left it empty until the related discussion was concluded. I have updated the description now, could you please take another look? |
Can a window function be a companion function? I assume not. If so, let's remove isCompanion function from window function metadata. |
velox/expression/FunctionMetadata.h
Outdated
@@ -59,6 +62,11 @@ class VectorFunctionMetadataBuilder { | |||
return *this; | |||
} | |||
|
|||
VectorFunctionMetadataBuilder& isCompanionFunction(bool isCompanionFunction) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming: drop 'is';
builder.companionFunction(true)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming: drop 'is';
velox/exec/Aggregate.cpp
Outdated
if (auto func = getAggregateFunctionEntry(sanitizedName)) { | ||
return func->metadata; | ||
} else { | ||
VELOX_USER_FAIL("Metadata not found for aggregate function: {}", name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue here is that aggregate function doesn't exist, not that it is missing metadata. Let's clarify.
VELOX_USER_FAIL("Aggregate function not found: {}", name);
velox/exec/Aggregate.cpp
Outdated
const auto sanitizedName = sanitizeName(name); | ||
if (auto func = getAggregateFunctionEntry(sanitizedName)) { | ||
return func->metadata; | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: drop else after return
velox/exec/WindowFunction.h
Outdated
@@ -43,14 +43,15 @@ class WindowFunction { | |||
kRows, | |||
}; | |||
|
|||
/// Indicates whether this is an aggregate window function and its process | |||
/// unit. | |||
/// Indicates whether this is an aggregate window function, whether it is a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
...whether the corresponding aggregate function is a companion function....
velox/exec/WindowFunction.h
Outdated
struct Metadata { | ||
ProcessMode processMode; | ||
bool isAggregate; | ||
bool isCompanionFunction; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps, clarify that this can be true iff isAggregate is true.
velox/expression/FunctionMetadata.h
Outdated
@@ -40,6 +40,9 @@ struct VectorFunctionMetadata { | |||
/// In this case, 'rows' in VectorFunction::apply will point only to positions | |||
/// for which all arguments are not null. | |||
bool defaultNullBehavior{true}; | |||
|
|||
/// Indicates if this is a companion function. | |||
bool isCompanionFunction{false}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consistency: drop 'is' to math other booleans here
velox/exec/Aggregate.h
Outdated
@@ -476,6 +476,9 @@ struct AggregateFunctionMetadata { | |||
/// True if results of the aggregation depend on the order of inputs. For | |||
/// example, array_agg is order sensitive while count is not. | |||
bool orderSensitive{true}; | |||
|
|||
/// Indicates if this is a companion function. | |||
bool isCompanionFunction{false}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consistency: drop 'is' to math the other boolean
@@ -75,7 +75,7 @@ void registerRowNumber(const std::string& name, TypeKind resultTypeKind) { | |||
exec::registerWindowFunction( | |||
name, | |||
std::move(signatures), | |||
{exec::WindowFunction::ProcessMode::kRows, false}, | |||
{exec::WindowFunction::ProcessMode::kRows, false, false}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't seem right to allow specifying isCompanion for a window function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted, thanks.
@@ -142,7 +142,7 @@ void registerAverageAggregate( | |||
} | |||
} | |||
}, | |||
{false /*orderSensitive*/}, | |||
{false /*orderSensitive*/, false /*isCompanionFunction*/}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is strange to have an API that has both isCompanion and withCompanionFunctions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it seems a bit strange, but isCompanionFunction
is a part of the function metadata which indicates if the function being registered currently is a companion function, whereas withCompanionFunctions
indicates if the aggregate should be registered with companion functions. So both these fields would be needed.
Could you please share how this could be made more readable?
@@ -356,6 +357,23 @@ TEST_F(FunctionRegistryTest, isDeterministic) { | |||
ASSERT_FALSE(isDeterministic("not_found_function").has_value()); | |||
} | |||
|
|||
TEST_F(FunctionRegistryTest, isCompanionFunction) { | |||
functions::prestosql::registerAllScalarFunctions(); | |||
// extract aggregate companion functions are registered as vector functions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typos
64e6cc8
to
6b91785
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback @mbasmanova. The companion function metadata added for window functions is now reverted and a check is added in registerAggregateFunction
to ensure aggregate companion functions are no longer registered as window functions.
Could you please take another look?
@@ -75,7 +75,7 @@ void registerRowNumber(const std::string& name, TypeKind resultTypeKind) { | |||
exec::registerWindowFunction( | |||
name, | |||
std::move(signatures), | |||
{exec::WindowFunction::ProcessMode::kRows, false}, | |||
{exec::WindowFunction::ProcessMode::kRows, false, false}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted, thanks.
@@ -142,7 +142,7 @@ void registerAverageAggregate( | |||
} | |||
} | |||
}, | |||
{false /*orderSensitive*/}, | |||
{false /*orderSensitive*/, false /*isCompanionFunction*/}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it seems a bit strange, but isCompanionFunction
is a part of the function metadata which indicates if the function being registered currently is a companion function, whereas withCompanionFunctions
indicates if the aggregate should be registered with companion functions. So both these fields would be needed.
Could you please share how this could be made more readable?
@pedroerp : Pinged you about this review as well. Would you be able to help us to close this ? Appreciate it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
Adding ready-to-merge label post Masha's review. |
6b91785
to
031c009
Compare
@bikramSingh91 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@@ -357,6 +358,23 @@ TEST_F(FunctionRegistryTest, isDeterministic) { | |||
ASSERT_FALSE(isDeterministic("not_found_function").has_value()); | |||
} | |||
|
|||
TEST_F(FunctionRegistryTest, companionFunction) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is failing internally with the following crash stacktrace:
Will dig into the issue and report back.
*** Aborted at 1738204790 (Unix time, try 'date -d @1738204790') ***
*** Signal 6 (SIGABRT) (0x560a0007ff20) received by PID 524064 (pthread TID 0x7fd3b6f5bd40) (linux TID 524064) (maybe from PID 524064, UID 22026) (code: -6), stack trace: ***
@ 000000000000f607 folly::symbolizer::(anonymous namespace)::innerSignalHandler(int, siginfo_t*, void*)
./fbcode/folly/debugging/symbolizer/SignalHandler.cpp:453
@ 000000000000dd81 folly::symbolizer::(anonymous namespace)::signalHandler(int, siginfo_t*, void*)
./fbcode/folly/debugging/symbolizer/SignalHandler.cpp:474
@ 000000000004455f (unknown)
@ 000000000009c993 pthread_kill
@ 00000000000444ac raise
@ 000000000002c432 abort
@ 000000000044e7bb std::__replacement_assert(char const*, int, char const*, char const*)
fbcode/third-party-buck/platform010/build/libgcc/include/c++/trunk/x86_64-facebook-linux/bits/c++config.h:535
@ 00000000007271df std::_Optional_base_impl<facebook::velox::exec::VectorFunctionMetadata, std::_Optional_base<facebook::velox::exec::VectorFunctionMetadata, true, true> >::_M_get()
fbcode/third-party-buck/platform010/build/libgcc/include/c++/trunk/optional:440
@ 0000000000727144 std::optional<facebook::velox::exec::VectorFunctionMetadata>::operator->()
fbcode/third-party-buck/platform010/build/libgcc/include/c++/trunk/optional:919
@ 000000000042155f facebook::velox::FunctionRegistryTest_companionFunction_Test::TestBody()
./fbcode/velox/functions/tests/FunctionRegistryTest.cpp:371
@ 00000000001211be void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)
fbsource/src/gtest.cc:2675
@ 0000000000120a44 testing::Test::Run()
fbsource/src/gtest.cc:2692
@ 000000000012668f testing::TestInfo::Run()
fbsource/src/gtest.cc:2841
@ 000000000012e646 testing::TestSuite::Run()
fbsource/src/gtest.cc:3020
@ 0000000000169fab testing::internal::UnitTestImpl::RunAllTests()
fbsource/src/gtest.cc:5925
@ 000000000016900b bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*)
fbsource/src/gtest.cc:2675
@ 0000000000168549 testing::UnitTest::Run()
fbsource/src/gtest.cc:5489
@ 0000000000002980 RUN_ALL_TESTS()
fbsource/gtest/gtest.h:2317
-> ./fbcode/common/gtest/LightMain.cpp
@ 000000000000260f main
./fbcode/common/gtest/LightMain.cpp:20
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, thanks @bikramSingh91.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like for the functions you used "array_frequency", "bitwise_left_shift", "ceil"
are simple functions and are registered in another registry so exec::getVectorFunctionMetadata(function) returns a nullopt in this case which then crashes with a segfault when the test tries to deref it to access companionFunction
.
I am not sure why its does not fail in the pre-commit tests, I also dont see velox_function_registry_test being run in the CI job. @kgpai any idea why that might be happening?
Can you check if this test runs on your local? wanna make sure that it fails
to fix this you can either use some vector functions instead or use the simple function registry to fetch their metadata, see (
velox/velox/functions/FunctionRegistry.cpp
Lines 80 to 82 in dcafd32
const auto simpleFunctions = | |
exec::simpleFunctions().getFunctionSignaturesAndMetadata(functionName); | |
const auto metadata = exec::getVectorFunctionMetadata(functionName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing it out, updated accordingly. Interestingly, the test runs locally both with exec::getVectorFunctionMetadata
, and with the simple function registry lookup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pramodsatya Thanks for the update, main branch seems to be broken and the fix (#12223) is in works, do you mind waiting for it to merge and then rebase this change so we can get a clean CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pramodsatya the build has been fixed, do you mind rebasing the PR so I can import it again? Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update @bikramSingh91, rebased the PR.
031c009
to
e060f27
Compare
e060f27
to
01de22d
Compare
@bikramSingh91 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@bikramSingh91 merged this pull request in 2bb7aab. |
Adds a boolean field,
isCompanionFunction
, toVectorFunctionMetadata
,AggregateFunctionMetadata
, andWindowFunction::Metadata
, to indicatewhether the respective scalar, aggregate, and window functions are companion
functions in Velox.
This field would be used to check for and exclude companion functions from
the function metadata returned by the
v1/functions
endpoint in the PrestoC++ sidecar. Currently, this is being done by searching for specific suffixes in
the registered companion functions' names.
Related discussion: #11011 .