ARROW-17475: [Go] Function interface and Registry impl #13924

Merged: 4 commits, Aug 22, 2022

zeroshade
Member

No description provided.

@zeroshade
Member Author

I'm not sure who else would be a good reviewer for this. I'm pulling small pieces out of #13909 so that each one is small enough to review reasonably. Feel free to add anyone else who might be relevant or able to offer good advice or ideas. Thanks!

@zeroshade zeroshade requested a review from emkornfield August 19, 2022 15:21
@lidavidm
Member

Sorry if I missed it, but is there an overall design doc or something?

@zeroshade
Member Author

zeroshade commented Aug 19, 2022

@lidavidm I'm pretty much following the design of the existing C++ compute library rather than trying to engineer something new on my own, so I didn't write a separate design doc. If you think one would still be a good idea, I'll pause this and write it.

Given that it's pretty much just me still building this, I figured that following the existing design would make life easier for everyone, haha.

@lidavidm
Member

No worries, if it's meant to be based on the C++ library then I'll read through it with that in mind. Thanks!

Comment on lines +17 to +18
// Package compute is a native-go implementation of an Acero-like
// arrow compute engine.
Member

Ah, so not just arrow::compute, but Acero itself is the goal.

Member Author

That's the goal, yup, but it's a lofty one. The first milestone is just simple scalar function execution...

Kind() FuncKind
Arity() Arity
Doc() FunctionDoc
NumKernels() int
Member

Is it worth exposing this without also having a public concept of Kernels?

Member Author

I have a concept for Kernels set up; I just haven't placed the interface in here yet (to keep the size of this PR down). They will almost certainly make it into the 10.0.0 release, so I'm okay with leaving this.
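For readers following along, here is a minimal sketch of how an interface with the four methods quoted above might be satisfied. Only `Kind`, `Arity`, `Doc` and `NumKernels` come from the diff; `Name`, the fields of `Arity` and `FunctionDoc`, the `baseFunction` type and the string-based kernel list are all assumptions made for illustration, not the PR's actual definitions.

```go
package main

import "fmt"

// FuncKind, Arity and FunctionDoc are simplified stand-ins (assumptions),
// not the definitions from the Arrow Go compute package.
type FuncKind int

const (
	FuncScalar FuncKind = iota
	FuncVector
)

type Arity struct {
	NArgs     int
	IsVarArgs bool
}

type FunctionDoc struct {
	Summary string
}

// Function mirrors the four methods quoted in the review; Name is assumed.
type Function interface {
	Name() string
	Kind() FuncKind
	Arity() Arity
	Doc() FunctionDoc
	NumKernels() int
}

// baseFunction is a hypothetical implementation. Kernels are represented
// by placeholder strings because no public Kernel type exists in this sketch.
type baseFunction struct {
	name    string
	kind    FuncKind
	arity   Arity
	doc     FunctionDoc
	kernels []string
}

func (b *baseFunction) Name() string     { return b.name }
func (b *baseFunction) Kind() FuncKind   { return b.kind }
func (b *baseFunction) Arity() Arity     { return b.arity }
func (b *baseFunction) Doc() FunctionDoc { return b.doc }
func (b *baseFunction) NumKernels() int  { return len(b.kernels) }

// newAddFunction builds a hypothetical binary scalar function with two kernels.
func newAddFunction() Function {
	return &baseFunction{
		name:    "add",
		kind:    FuncScalar,
		arity:   Arity{NArgs: 2},
		doc:     FunctionDoc{Summary: "Add two values element-wise"},
		kernels: []string{"int64", "float64"},
	}
}

func main() {
	fn := newAddFunction()
	fmt.Printf("%s: %d args, %d kernels\n", fn.Name(), fn.Arity().NArgs, fn.NumKernels())
}
```

In this shape `NumKernels` is still meaningful without a public Kernel type: callers can tell whether a function has any implementations registered, even if they cannot inspect them yet.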

"golang.org/x/exp/slices"
)

type FunctionOptionsType interface {
Member

FWIW I think we only did this for C++ metaprogramming reasons. Is there a more Go-like way to do this? For instance Copy() could be a regular method and Compare() could become Equal()?

Member Author

Hmm, OptionsTypes get registered in the registry, but looking at things, I don't actually see a direct tie from Functions to their options types other than by name (unless I'm missing something). So what is the actual use of registering the Options types in the Function Registry in the C++ Compute lib? You're right that I could make Copy and Compare part of the FunctionOptions interface (as Clone and Equals) rather than needing a separate FunctionOptionsType interface. I think I'll do that.

Member

It was to allow metaprogramming/serialization, so you may still need an OptionsType or something like it (or maybe not, depending on how those get implemented). But yeah, I would be surprised if you needed the exact same interface in Go as in C++.
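The direction discussed above, folding Copy and Compare into the options interface as Clone and Equals, could look roughly like the sketch below. The method names Clone and Equals come from the thread; `TypeName` and the `RoundOptions` type with its `NDigits` field are hypothetical and exist only to make the example complete.

```go
package main

import "fmt"

// FunctionOptions carries Clone and Equals directly, so no separate
// FunctionOptionsType interface is needed. TypeName is an assumption.
type FunctionOptions interface {
	TypeName() string
	Clone() FunctionOptions
	Equals(other FunctionOptions) bool
}

// RoundOptions is a hypothetical options struct used for illustration.
type RoundOptions struct {
	NDigits int
}

func (o *RoundOptions) TypeName() string { return "RoundOptions" }

// Clone returns an independent copy of the receiver.
func (o *RoundOptions) Clone() FunctionOptions {
	c := *o
	return &c
}

// Equals reports whether other is a *RoundOptions with the same field values.
func (o *RoundOptions) Equals(other FunctionOptions) bool {
	rhs, ok := other.(*RoundOptions)
	if !ok {
		return false
	}
	if o == nil || rhs == nil {
		return o == rhs
	}
	return *o == *rhs
}

func main() {
	a := &RoundOptions{NDigits: 2}
	b := a.Clone()
	fmt.Println(a.Equals(b)) // the clone compares equal to the original

	b.(*RoundOptions).NDigits = 3
	fmt.Println(a.Equals(b)) // mutating the clone does not affect the original
}
```

If serialization is needed later, a registry keyed by something like `TypeName` could still be layered on top without changing this interface.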

Contributor

@cyb70289 cyb70289 left a comment

LGTM.

Only some nit comments about concurrency.

if reg.parent != nil {
n = reg.parent.NumFunctions()
}
return n + len(reg.nameToFunction)
Contributor

Is RLock required here?
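The thread does not record an answer to this question. As general Go background, reading a map (including calling `len` on it) while another goroutine may be writing to it is a data race, so one possible shape for the method is sketched below. The `registry` type, its `mx` field name, the `add` helper and `newRegistry` are assumptions for illustration; only `parent`, `nameToFunction` and `NumFunctions` appear in the diff.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a simplified stand-in for the PR's function registry.
type registry struct {
	parent         *registry // set once at construction, never reassigned
	mx             sync.RWMutex
	nameToFunction map[string]struct{}
}

func newRegistry(parent *registry) *registry {
	return &registry{parent: parent, nameToFunction: make(map[string]struct{})}
}

// add registers a name under the write lock (hypothetical helper).
func (reg *registry) add(name string) {
	reg.mx.Lock()
	defer reg.mx.Unlock()
	reg.nameToFunction[name] = struct{}{}
}

// NumFunctions counts functions in this registry and all of its ancestors.
// Each level takes its own read lock before touching its own map.
func (reg *registry) NumFunctions() (n int) {
	if reg.parent != nil {
		n = reg.parent.NumFunctions()
	}
	reg.mx.RLock()
	defer reg.mx.RUnlock()
	return n + len(reg.nameToFunction)
}

func main() {
	parent := newRegistry(nil)
	parent.add("add")
	child := newRegistry(parent)

	// Concurrent writers and readers; the locks keep this free of data races.
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			child.add(fmt.Sprintf("custom_%d", i))
			_ = child.NumFunctions()
		}(i)
	}
	wg.Wait()
	fmt.Println(child.NumFunctions())
}
```

Running a program like this with `go run -race` is one way to check whether dropping the read lock would actually be reported as a race.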

return false
}
if !allowOverwrite {
_, ok := reg.nameToFunction[name]
Contributor

reg.parent.lock does not appear to be locked here; will that be a problem?

Member Author

Good catch! I'll add a lock of reg.parent.mx here.
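A rough sketch of the fix described here: take the parent's read lock before consulting the parent's map during the overwrite check. This is not the merged code. The `funcRegistry` type, `canAddFuncName`, `AddFunction`, `newFuncRegistry` and the string-valued map are illustrative assumptions; only `reg.parent`, `reg.parent.mx`, `nameToFunction` and `allowOverwrite` are taken from the thread.

```go
package main

import (
	"fmt"
	"sync"
)

// funcRegistry is a simplified stand-in for the PR's registry type.
type funcRegistry struct {
	parent         *funcRegistry
	mx             sync.RWMutex
	nameToFunction map[string]string // name -> placeholder implementation
}

func newFuncRegistry(parent *funcRegistry) *funcRegistry {
	return &funcRegistry{parent: parent, nameToFunction: make(map[string]string)}
}

// canAddFuncName expects the caller to already hold a lock on reg.mx.
// It read-locks the parent before the parent's map is inspected.
func (reg *funcRegistry) canAddFuncName(name string, allowOverwrite bool) bool {
	if reg.parent != nil {
		reg.parent.mx.RLock()
		defer reg.parent.mx.RUnlock()
		if !reg.parent.canAddFuncName(name, allowOverwrite) {
			return false
		}
	}
	if !allowOverwrite {
		_, exists := reg.nameToFunction[name]
		return !exists
	}
	return true
}

// AddFunction registers impl under name. Locks are always taken child
// first, then parent, so the lock order is consistent.
func (reg *funcRegistry) AddFunction(name, impl string, allowOverwrite bool) bool {
	reg.mx.Lock()
	defer reg.mx.Unlock()
	if !reg.canAddFuncName(name, allowOverwrite) {
		return false
	}
	reg.nameToFunction[name] = impl
	return true
}

func main() {
	parent := newFuncRegistry(nil)
	fmt.Println(parent.AddFunction("add", "builtin", false)) // true

	child := newFuncRegistry(parent)
	fmt.Println(child.AddFunction("add", "shadow", false)) // false: name exists in parent
	fmt.Println(child.AddFunction("add", "shadow", true))  // true: overwrite allowed
}
```

The sketch relies on a parent never locking one of its children; if that assumption did not hold, the child-then-parent ordering could deadlock.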

zeroshade added a commit that referenced this pull request Aug 22, 2022
As part of building Compute functionality for Arrow in Go, this is the implementation of ArraySpan / ExecValue / ExecResult, etc.

It could be separated from the function interface definitions, so I was able to make this PR while #13924 is still being reviewed.

Authored-by: Matt Topol <zotthewizard@gmail.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
@zeroshade zeroshade merged commit 5f84335 into apache:master Aug 22, 2022
@zeroshade zeroshade deleted the arrow-17454-function-registry branch August 22, 2022 19:39
@ursabot

ursabot commented Aug 22, 2022

Benchmark runs are scheduled for baseline = 5f66708 and contender = 5f84335. 5f84335 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.27% ⬆️0.0%] test-mac-arm
[Failed ⬇️0.82% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.11% ⬆️0.11%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 5f84335f ec2-t3-xlarge-us-east-2
[Finished] 5f84335f test-mac-arm
[Failed] 5f84335f ursa-i9-9960x
[Finished] 5f84335f ursa-thinkcentre-m75q
[Finished] 5f667089 ec2-t3-xlarge-us-east-2
[Finished] 5f667089 test-mac-arm
[Failed] 5f667089 ursa-i9-9960x
[Finished] 5f667089 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@ursabot

ursabot commented Aug 22, 2022

['Python', 'R'] benchmarks have high level of regressions.
ursa-i9-9960x

4 participants