-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-17475: [Go] Function interface and Registry impl #13924
ARROW-17475: [Go] Function interface and Registry impl #13924
Conversation
I don't know who else would be good to review this, I'm pulling small pieces out of #13909 in order to have them be small enough for reasonable reviewing. Feel free to add anyone else that might be relevant / be able to give good advice or ideas. Thanks! |
Sorry if I missed it, but is there an overall design doc or something? |
@lidavidm I'm pretty much following the design of the existing C++ compute library rather than trying to engineer something new on my own. So as a result I didn't make a separate design doc, unless you think it would still be a good idea to do that in which case I'll pause this to go do that Given that it's pretty much just me still building this, I figured that following the existing design would make life easier for everyone, haha. |
No worries, if it's meant to be based on the C++ library then I'll read through it with that in mind. Thanks! |
// Package compute is a native-go implementation of an Acero-like | ||
// arrow compute engine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, so not just arrow::compute, but Acero itself is the goal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the goal yup. But it's a lofty goal. The first milestone is just simple scalar function execution...
Kind() FuncKind | ||
Arity() Arity | ||
Doc() FunctionDoc | ||
NumKernels() int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth exposing this without also having a public concept of Kernels?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a setup concept for Kernels, just haven't placed the interface in here yet (for the sake of keeping the size of this PR down) but they will almost certainly make into the 10.0.0 release so I'm okay with leaving this.
go/arrow/compute/registry.go
Outdated
"golang.org/x/exp/slices" | ||
) | ||
|
||
type FunctionOptionsType interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I think we only did this for C++ metaprogramming reasons. Is there a more Go-like way to do this? For instance Copy() could be a regular method and Compare() could become Equal()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, OptionsTypes get registered in the registry, but looking at things, I don't actually see a direct tie of Function
s -> their options types other than by the name (unless i'm missing something). So what is the actual use for registering the Options types in the Function Registry in the C++ Compute lib. You're right that I could place Copy
and Compare
as part of the FunctionOptions
interface (as Clone
and Equals
) rather than needing a separate FunctionOptionsType
interface. I think i'll do that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was to allow metaprogramming/serialization so you may still need an OptionsType or something like that (or maybe not depending on how those get implemented) but yeah, I would be surprised if you needed the exact same interface in Golang as C++.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Only some nit comments about concurrency.
if reg.parent != nil { | ||
n = reg.parent.NumFunctions() | ||
} | ||
return n + len(reg.nameToFunction) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is RLock
required here?
return false | ||
} | ||
if !allowOverwrite { | ||
_, ok := reg.nameToFunction[name] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reg.parent.lock
looks not locked, will it be a problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch! i'll add a lock of reg.parent.mx
here.
Relating to the building of the functionality for Compute in Go with Arrow, this is the implementation of ArraySpan / ExecValue / ExecResult etc. It was able to be separated out from the function interface definitions, so I was able to make this PR while #13924 is still being reviewed Authored-by: Matt Topol <zotthewizard@gmail.com> Signed-off-by: Matt Topol <zotthewizard@gmail.com>
Benchmark runs are scheduled for baseline = 5f66708 and contender = 5f84335. 5f84335 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
['Python', 'R'] benchmarks have high level of regressions. |
Relating to the building of the functionality for Compute in Go with Arrow, this is the implementation of ArraySpan / ExecValue / ExecResult etc. It was able to be separated out from the function interface definitions, so I was able to make this PR while apache#13924 is still being reviewed Authored-by: Matt Topol <zotthewizard@gmail.com> Signed-off-by: Matt Topol <zotthewizard@gmail.com>
Authored-by: Matt Topol <zotthewizard@gmail.com> Signed-off-by: Matt Topol <zotthewizard@gmail.com>
No description provided.