Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add companionFunction to function metadata #9250

Closed

Conversation

pramodsatya
Copy link
Collaborator

@pramodsatya pramodsatya commented Mar 26, 2024

Adds a boolean field, isCompanionFunction, to VectorFunctionMetadata,
AggregateFunctionMetadata, and WindowFunction::Metadata, to indicate
whether the respective scalar, aggregate, and window functions are companion
functions in Velox.
This field would be used to check for and exclude companion functions from
the function metadata returned by the v1/functions endpoint in the Presto
C++ sidecar. Currently, this is being done by searching for specific suffixes in
the registered companion functions' names.
Related discussion: #11011 .

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Mar 26, 2024
Copy link

netlify bot commented Mar 26, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 01de22d
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/679e57fa477ce10008b5d5f7

velox/functions/CoverageUtil.h Outdated Show resolved Hide resolved
velox/functions/FunctionRegistry.h Outdated Show resolved Hide resolved
velox/functions/FunctionRegistry.cpp Outdated Show resolved Hide resolved
velox/functions/CoverageUtil.h Outdated Show resolved Hide resolved
velox/functions/CoverageUtil.h Outdated Show resolved Hide resolved
velox/functions/FunctionRegistry.cpp Outdated Show resolved Hide resolved
@aditi-pandit aditi-pandit changed the title [WIP] Support retrieving function metadata [WIP] Add FunctionRegistry APIs to retrieve function metadata. Mar 26, 2024
@aditi-pandit
Copy link
Collaborator

aditi-pandit commented Mar 26, 2024

@pramodsatya : How do we take care that we get only Prestissimo functions from the registry ? If there is a conflict between a Prestissimo and Spark function with the same name, then how do we disambiguate ?

@pramodsatya
Copy link
Collaborator Author

pramodsatya commented Mar 27, 2024

Thanks for the feedback @aditi-pandit, @czentgr. We are not checking that only Prestissimo functions are retrieved from the registry. Instead we rely on the Prestissimo worker to have registered only the presto functions, such as in this function, so there are no Spark functions in the registry. Please let me know if this is fine or whether we should have an additional way to distinguish between Presto and Spark functions in the registry.

@pramodsatya pramodsatya force-pushed the get_fn_metadata branch 3 times, most recently from c2011b1 to 8a28ec9 Compare March 29, 2024 02:49
@pramodsatya pramodsatya changed the title [WIP] Add FunctionRegistry APIs to retrieve function metadata. Add FunctionRegistry APIs to retrieve function metadata. Jun 7, 2024
@pramodsatya
Copy link
Collaborator Author

Hi @aditi-pandit, @czentgr, could you please take another look at this PR?

velox/functions/CMakeLists.txt Outdated Show resolved Hide resolved
velox/functions/FunctionRegistry.cpp Outdated Show resolved Hide resolved
velox/functions/FunctionRegistry.cpp Outdated Show resolved Hide resolved
Copy link
Collaborator Author

@pramodsatya pramodsatya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback @aditi-pandit, addressed the comments. Could you please take another look?

velox/functions/CMakeLists.txt Outdated Show resolved Hide resolved
velox/functions/FunctionRegistry.cpp Outdated Show resolved Hide resolved
@pramodsatya pramodsatya marked this pull request as ready for review August 2, 2024 00:31
Copy link
Collaborator

@aditi-pandit aditi-pandit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @pramodsatya

velox/functions/FunctionRegistry.cpp Outdated Show resolved Hide resolved
velox/functions/FunctionRegistry.cpp Outdated Show resolved Hide resolved
velox/functions/FunctionRegistry.cpp Outdated Show resolved Hide resolved
velox/functions/FunctionRegistry.h Outdated Show resolved Hide resolved
velox/functions/FunctionRegistry.h Outdated Show resolved Hide resolved
velox/functions/tests/FunctionRegistryTest.cpp Outdated Show resolved Hide resolved
@pramodsatya pramodsatya changed the title Add FunctionRegistry APIs to retrieve function metadata. Expose helper functions to retrieve registered functions from FunctionRegistry Aug 3, 2024
Copy link
Collaborator

@czentgr czentgr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks good now.

@@ -413,10 +423,11 @@ bool CompanionFunctionsRegistrar::registerMergeExtractFunction(

auto mergeExtractFunctionName =
CompanionSignatures::mergeExtractFunctionName(name);
return registerAggregateFunction(
return registerMergeExtractFunctionImpl(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename to registerMergeExtractFunctionInternal.

@aditi-pandit
Copy link
Collaborator

@pramodsatya. @czentgr : This is a reasonable solution as well, if we are okay exposing companion function concept to the services using Velox.

With the other API to return all functions with a boolean parameter to say 'skipInternalFunctions' a single function could be used to skip all companion or any other internal functions added by the Velox framework.

@mbasmanova : wdyt ?

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pramodsatya Pramod, would you update PR description to explain this change?

@pramodsatya
Copy link
Collaborator Author

@pramodsatya Pramod, would you update PR description to explain this change?

Apologies @mbasmanova, since this PR went through many iterations I had left it empty until the related discussion was concluded. I have updated the description now, could you please take another look?

@mbasmanova
Copy link
Contributor

@pramodsatya

whether the respective scalar, aggregate, and window functions are companion functions

Can a window function be a companion function? I assume not. If so, let's remove isCompanion function from window function metadata.

@@ -59,6 +62,11 @@ class VectorFunctionMetadataBuilder {
return *this;
}

VectorFunctionMetadataBuilder& isCompanionFunction(bool isCompanionFunction) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naming: drop 'is';

builder.companionFunction(true)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naming: drop 'is';

if (auto func = getAggregateFunctionEntry(sanitizedName)) {
return func->metadata;
} else {
VELOX_USER_FAIL("Metadata not found for aggregate function: {}", name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue here is that aggregate function doesn't exist, not that it is missing metadata. Let's clarify.

VELOX_USER_FAIL("Aggregate function not found: {}", name);

const auto sanitizedName = sanitizeName(name);
if (auto func = getAggregateFunctionEntry(sanitizedName)) {
return func->metadata;
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: drop else after return

@@ -43,14 +43,15 @@ class WindowFunction {
kRows,
};

/// Indicates whether this is an aggregate window function and its process
/// unit.
/// Indicates whether this is an aggregate window function, whether it is a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

...whether the corresponding aggregate function is a companion function....

struct Metadata {
ProcessMode processMode;
bool isAggregate;
bool isCompanionFunction;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, clarify that this can be true iff isAggregate is true.

@@ -40,6 +40,9 @@ struct VectorFunctionMetadata {
/// In this case, 'rows' in VectorFunction::apply will point only to positions
/// for which all arguments are not null.
bool defaultNullBehavior{true};

/// Indicates if this is a companion function.
bool isCompanionFunction{false};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consistency: drop 'is' to math other booleans here

@@ -476,6 +476,9 @@ struct AggregateFunctionMetadata {
/// True if results of the aggregation depend on the order of inputs. For
/// example, array_agg is order sensitive while count is not.
bool orderSensitive{true};

/// Indicates if this is a companion function.
bool isCompanionFunction{false};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consistency: drop 'is' to math the other boolean

@@ -75,7 +75,7 @@ void registerRowNumber(const std::string& name, TypeKind resultTypeKind) {
exec::registerWindowFunction(
name,
std::move(signatures),
{exec::WindowFunction::ProcessMode::kRows, false},
{exec::WindowFunction::ProcessMode::kRows, false, false},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem right to allow specifying isCompanion for a window function.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted, thanks.

@@ -142,7 +142,7 @@ void registerAverageAggregate(
}
}
},
{false /*orderSensitive*/},
{false /*orderSensitive*/, false /*isCompanionFunction*/},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is strange to have an API that has both isCompanion and withCompanionFunctions

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it seems a bit strange, but isCompanionFunction is a part of the function metadata which indicates if the function being registered currently is a companion function, whereas withCompanionFunctions indicates if the aggregate should be registered with companion functions. So both these fields would be needed.
Could you please share how this could be made more readable?

@@ -356,6 +357,23 @@ TEST_F(FunctionRegistryTest, isDeterministic) {
ASSERT_FALSE(isDeterministic("not_found_function").has_value());
}

TEST_F(FunctionRegistryTest, isCompanionFunction) {
functions::prestosql::registerAllScalarFunctions();
// extract aggregate companion functions are registered as vector functions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typos

Copy link
Collaborator Author

@pramodsatya pramodsatya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback @mbasmanova. The companion function metadata added for window functions is now reverted and a check is added in registerAggregateFunction to ensure aggregate companion functions are no longer registered as window functions.
Could you please take another look?

@@ -75,7 +75,7 @@ void registerRowNumber(const std::string& name, TypeKind resultTypeKind) {
exec::registerWindowFunction(
name,
std::move(signatures),
{exec::WindowFunction::ProcessMode::kRows, false},
{exec::WindowFunction::ProcessMode::kRows, false, false},
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted, thanks.

@@ -142,7 +142,7 @@ void registerAverageAggregate(
}
}
},
{false /*orderSensitive*/},
{false /*orderSensitive*/, false /*isCompanionFunction*/},
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it seems a bit strange, but isCompanionFunction is a part of the function metadata which indicates if the function being registered currently is a companion function, whereas withCompanionFunctions indicates if the aggregate should be registered with companion functions. So both these fields would be needed.
Could you please share how this could be made more readable?

@pramodsatya pramodsatya changed the title Add isCompanionFunction to function metadata feat: Add companionFunction to function metadata Dec 10, 2024
@aditi-pandit aditi-pandit requested a review from pedroerp January 15, 2025 18:21
@aditi-pandit
Copy link
Collaborator

@pedroerp : Pinged you about this review as well. Would you be able to help us to close this ? Appreciate it.

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@aditi-pandit aditi-pandit added the ready-to-merge PR that have been reviewed and are ready for merging. PRs with this tag notify the Velox Meta oncall label Jan 16, 2025
@aditi-pandit
Copy link
Collaborator

Adding ready-to-merge label post Masha's review.

@facebook-github-bot
Copy link
Contributor

@bikramSingh91 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@@ -357,6 +358,23 @@ TEST_F(FunctionRegistryTest, isDeterministic) {
ASSERT_FALSE(isDeterministic("not_found_function").has_value());
}

TEST_F(FunctionRegistryTest, companionFunction) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is failing internally with the following crash stacktrace:
Will dig into the issue and report back.

*** Aborted at 1738204790 (Unix time, try 'date -d @1738204790') ***
*** Signal 6 (SIGABRT) (0x560a0007ff20) received by PID 524064 (pthread TID 0x7fd3b6f5bd40) (linux TID 524064) (maybe from PID 524064, UID 22026) (code: -6), stack trace: ***
    @ 000000000000f607 folly::symbolizer::(anonymous namespace)::innerSignalHandler(int, siginfo_t*, void*)
                       ./fbcode/folly/debugging/symbolizer/SignalHandler.cpp:453
    @ 000000000000dd81 folly::symbolizer::(anonymous namespace)::signalHandler(int, siginfo_t*, void*)
                       ./fbcode/folly/debugging/symbolizer/SignalHandler.cpp:474
    @ 000000000004455f (unknown)
    @ 000000000009c993 pthread_kill
    @ 00000000000444ac raise
    @ 000000000002c432 abort
    @ 000000000044e7bb std::__replacement_assert(char const*, int, char const*, char const*)
                       fbcode/third-party-buck/platform010/build/libgcc/include/c++/trunk/x86_64-facebook-linux/bits/c++config.h:535
    @ 00000000007271df std::_Optional_base_impl<facebook::velox::exec::VectorFunctionMetadata, std::_Optional_base<facebook::velox::exec::VectorFunctionMetadata, true, true> >::_M_get()
                       fbcode/third-party-buck/platform010/build/libgcc/include/c++/trunk/optional:440
    @ 0000000000727144 std::optional<facebook::velox::exec::VectorFunctionMetadata>::operator->()
                       fbcode/third-party-buck/platform010/build/libgcc/include/c++/trunk/optional:919
    @ 000000000042155f facebook::velox::FunctionRegistryTest_companionFunction_Test::TestBody()
                       ./fbcode/velox/functions/tests/FunctionRegistryTest.cpp:371
    @ 00000000001211be void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)
                       fbsource/src/gtest.cc:2675
    @ 0000000000120a44 testing::Test::Run()
                       fbsource/src/gtest.cc:2692
    @ 000000000012668f testing::TestInfo::Run()
                       fbsource/src/gtest.cc:2841
    @ 000000000012e646 testing::TestSuite::Run()
                       fbsource/src/gtest.cc:3020
    @ 0000000000169fab testing::internal::UnitTestImpl::RunAllTests()
                       fbsource/src/gtest.cc:5925
    @ 000000000016900b bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*)
                       fbsource/src/gtest.cc:2675
    @ 0000000000168549 testing::UnitTest::Run()
                       fbsource/src/gtest.cc:5489
    @ 0000000000002980 RUN_ALL_TESTS()
                       fbsource/gtest/gtest.h:2317
                       -> ./fbcode/common/gtest/LightMain.cpp
    @ 000000000000260f main
                       ./fbcode/common/gtest/LightMain.cpp:20

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, thanks @bikramSingh91.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like for the functions you used "array_frequency", "bitwise_left_shift", "ceil" are simple functions and are registered in another registry so exec::getVectorFunctionMetadata(function) returns a nullopt in this case which then crashes with a segfault when the test tries to deref it to access companionFunction.

I am not sure why its does not fail in the pre-commit tests, I also dont see velox_function_registry_test being run in the CI job. @kgpai any idea why that might be happening?

Can you check if this test runs on your local? wanna make sure that it fails

to fix this you can either use some vector functions instead or use the simple function registry to fetch their metadata, see (

const auto simpleFunctions =
exec::simpleFunctions().getFunctionSignaturesAndMetadata(functionName);
const auto metadata = exec::getVectorFunctionMetadata(functionName);
) for reference.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing it out, updated accordingly. Interestingly, the test runs locally both with exec::getVectorFunctionMetadata, and with the simple function registry lookup.

Copy link
Contributor

@bikramSingh91 bikramSingh91 Jan 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pramodsatya Thanks for the update, main branch seems to be broken and the fix (#12223) is in works, do you mind waiting for it to merge and then rebase this change so we can get a clean CI.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pramodsatya the build has been fixed, do you mind rebasing the PR so I can import it again? Thanks!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update @bikramSingh91, rebased the PR.

@facebook-github-bot
Copy link
Contributor

@bikramSingh91 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@bikramSingh91 merged this pull request in 2bb7aab.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged ready-to-merge PR that have been reviewed and are ready for merging. PRs with this tag notify the Velox Meta oncall
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants