Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Call overload #8008

Closed
wants to merge 13 commits into from
Closed

WIP: Call overload #8008

wants to merge 13 commits into from

Conversation

vtjnash
Copy link
Member

@vtjnash vtjnash commented Aug 14, 2014

@stevengj Turning this into a PR so that I can comment on it easier.

@vtjnash
Copy link
Member Author

vtjnash commented Aug 14, 2014

I felt it might be useful to just put up the entire log of my debugging session, as a demo:
https://gist.github.com/vtjnash/4420c11ede5eb71554fc

This looks fun. Somewhere between trying to create an array, and eval'ing some Expr, this code jumped in and tried to call Base.eval via a call to call. start got involved to handle the tuple unpacking.

@JeffBezanson
Copy link
Member

Working on this now.

@toivoh
Copy link
Contributor

toivoh commented Sep 1, 2014

Yay!

@JeffBezanson
Copy link
Member

Somewhat amazingly, the build passed.

However now we have to figure out how to scope call. When we use it for all constructors, we won't be able to use a single call for both bootstrap stages. I think ideally, it would behave as if call sites f(x) were written as

if isa(f,Function)
    f(x)
else
    global call
    call(f, x)
end

This is easy to handle in codegen since it knows what module the current function is in. However there is no way to do this from jl_f_apply. One approach is to pass in the needed call, e.g. apply(call, f, ...). This is viable for auto-generated calls to apply, and for kwcall, but not for manual calls to apply. One solution is to only support manual use of apply for actual functions.

@JeffBezanson
Copy link
Member

A quick survey reveals apply() is used directly very rarely. Maybe it should be deprecated.

@stevengj
Copy link
Member

stevengj commented Sep 2, 2014

My thinking was that apply is a low-level thing and would only work for Functions. If you want to use apply with a constructor, you have to do apply(call, ...). Similarly for jl_f_apply in the C API (since the C API is especially entitled to be low-level).

@stevengj
Copy link
Member

stevengj commented Sep 2, 2014

apply seems to be used in the following packages:

DataFrames
FunctionalUtils
LazySequences
Murmur3
PTools
ReverseDiffSource
SDE
TermWin
TopicModels

Of these, only TermWin.jl by @tonyhffong seems to use apply with a type constructor.

@johnmyleswhite
Copy link
Member

I'm pretty sure we can remove any uses of apply from DataFrames.

@stevengj
Copy link
Member

stevengj commented Sep 2, 2014

It's nice to have apply if it is much more efficient than splatting, but of course there seems to be no reason that splatting couldn't be made equally efficient. The C API needs some low-level apply-like mechanism, but restricting that to Function seems fine.

@JeffBezanson
Copy link
Member

Splatting is lowered to call apply, so there's no performance difference. There doesn't seem to be any reason to expose apply to julia code since it maps one-to-one to splatting.

@tonyhffong
Copy link

Although I'm not following the discussion, I can work around that as well if this should be deprecated.

@jiahao jiahao force-pushed the master branch 2 times, most recently from 2ef98c5 to 0388647 Compare October 5, 2014 00:57
@JeffBezanson
Copy link
Member

I found one kind of situation where calling apply is useful: apply(f, lists...), which is basically equivalent to f(append(lists...)...), except (1) is probably more efficient since it can avoid the call to append, and (2) append does not really exist.

However this is not super compelling, since it doesn't generalize --- it only goes one level deeper into the argument. To get another level you'd need f(append(append(lists...)...)...). So I think a better approach is to add append, or perhaps a multi-argument collect. Note apply already uses a function called Base.append_any internally, which we could just expose.

- look up call in the right module instead of using jl_call_func
- some inference support for call overloading
- pass `call` to _apply and kwcall so they use the containing module's definition
- deprecate explicit apply(), rename the builtin to _apply
@JeffBezanson
Copy link
Member

Sadly this branch now seems to have a 3-4x longer startup time than master. No idea why yet.

@jakebolewski
Copy link
Member

It seems like most of the slowdown is running inference on the initial Base.__init__() call.

@StefanKarpinski
Copy link
Member

Since the inner/outer constructor business is one of the biggest causes of confusion in the current system, I think that some generalization and change there may well not be a bad thing at all. Excited to see how this pans out. I still wonder if the "everything is one big generic apply function" model isn't good.

@JeffBezanson
Copy link
Member

There is a subtle issue with the "everything is one generic apply function" model: it becomes impossible, in some sense, to actually pass functionality to somebody else. If I pass you f, it will ultimately get fed to your call generic function which could in principle be different from mine. I can't really pass you an f where I entirely control what f does. This may be no more than a feeling, but it seems like something is lost.

@StefanKarpinski
Copy link
Member

Interesting. I'll have to think about that.

@StefanKarpinski
Copy link
Member

What if there is only one call generic function? Then the control comes from what f you pass.

@JeffBezanson
Copy link
Member

Well, of course that is what allows the idea to work and might mean the "problem" is only psychological.

I'm thinking from an ABI perspective. You would have to work to enforce the "only one call" rule: all julia code would have to be handled by our linker, which would need to merge and reconcile all the call definitions, effectively whenever something is dlopened.

Another way to look at it is that every "function" is inherently in two places that need to be kept in sync: first you make a "token" for a function, then you must insert it in a global table, then I can pass it to you. As opposed to simply having a function object that I pass a pointer to.

Still another way to look at it is that our C API can give you a C-callable function pointer to some julia code. So there is in fact some underlying "callable object", but it's not a first-class object in the language.

Again, none of this actually breaks anything; it may only be a matter of compiler and linker efficiency.

@staticfloat
Copy link
Member

@JeffBezanson what you're describing sounds a lot like a vtable for all of Julia, but dereferenced at compile-time, not run-time. Am I thinking about it right? ;)

@JeffBezanson
Copy link
Member

Kind of, but julia doesn't do anything at compile time except as an optimization, so you have to think about running pieces of code passing stuff around.

One can imagine enforcing "only one call", but then we have to think about whether it is actually desirable. For example, imagine compiling an entire julia system image into a single .so for the purpose of exporting a single C-callable function double f(double). After doing that, we could change around all our julia code and repeat the process, getting a second .so whose purpose is to export double g(double).

Now we want to use both of these libraries in a single process. What happens? With a single call, there might be un-resolvable conflicts between the two modules. If we let each module have its own call, they could coexist peacefully. However then you couldn't pass nontrivial things between the modules, because one wouldn't know how to call things in the other. In turn you could solve that by passing around pairs (call, f) --- i.e. you pass the "function" and the call function that it needs to work. However this is exactly a closure: a code,data pair (:head explodes:).

From this thought experiment I conclude that having multiple separate first-class generic functions is a useful abstraction. However I'm not quite sure when to use them. With this branch we already have the choice of making a new generic function, or overloading call. This is very loosely a templates-vs-classes kind of decision. We have types using call, and doubtless all sorts of "functors" will as well. Perhaps many generic functions should use call instead. The decision to make a separate generic function would have something to do with modularity, but I haven't fully grasped what exactly. It definitely has some of the character of picking between class and instance methods, or templates and classes.

@JeffBezanson
Copy link
Member

call has taken convert's record:

julia> length(methods(convert))
455

julia> length(methods(call))
778

I wouldn't say I'm surprised, but it's still kind of amazing that there are 778 constructors in Base.

@StefanKarpinski
Copy link
Member

That's pretty nuts / cool.

@StefanKarpinski
Copy link
Member

It's also kind of nice to make these all methods of a single generic function. Feels much more first class.

@vtjnash
Copy link
Member Author

vtjnash commented Oct 13, 2014

However this is exactly a closure: a code,data pair

it's almost like the Type should contain an environment field and perhaps a lambda info too that points to the function that should be used when "calling" that typename. oh, wait – that's exactly what we have now. oops, hmm...

@@ -186,6 +186,12 @@ const Nothing = Void
export None
const None = Union()

export apply
function apply(f, args...)
depwarn("apply() is deprecated, use `...` instead", :apply)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apply(f, x) is deprecated, use f(x...) instead

@timholy
Copy link
Member

timholy commented Oct 14, 2014

Recording here that @vtjnash mentioned a good reason to consider merging #8656 before this (#8656 (comment)).

@@ -446,6 +447,17 @@ void jl_module_run_initializer(jl_module_t *m)
}
}

jl_function_t *jl_module_call_func(jl_module_t *m)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you transform x(...) into call(x, ...), shouldn't this always use Base.call?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can't, because there isn't just one Base (during bootstrap). In practice, everybody will use Base.call, like we use Base.convert now, but we at least want to preserve the possibility of a module with a separate call.

@stevengj
Copy link
Member

Since outer constructors always appear after the type definition, it doesn't seem too impractical to detect that the function name is a type and convert them to call methods. Is there a technical difficulty I'm missing?

Besides backwards compatibility, I think it would just be too confusing to require Foo(...) for inner constructors and call(::Type{Foo}, ...) for outer constructors. The latter option should certainly be allowed, as it gives greater flexibility, but I shudder at the thought of explaining such a requirement to newcomers.

@JeffBezanson
Copy link
Member

Yes, I agree. I think this is in pretty good shape now.

I find that the constructor change can be detrimental to type information. For example:

julia> methods(call, (Type{VersionNumber}, Any))
3-element Array{Any,1}:
 call(::Type{VersionNumber},x::Integer) at version.jl:39
 call(::Type{VersionNumber},v::String) at version.jl:82 
 call(args...) at base.jl:10                            

Previously, we knew that the first two methods were the only constructors for VersionNumber, so we could get a sharp type for VersionNumber(::Any). Now we can't, since there might be a different matching call based on the second argument. This can be avoided by not having any "fallback" definition of call, but that doesn't seem realistic. Oh well.

@StefanKarpinski
Copy link
Member

Some of that could be reclaimed with monotonic dispatch as @toivoh once proposed.

@stevengj
Copy link
Member

Why do we need call(args...) = Core.call(args...)?

It seems like the only fallback that we really need is call(f::Function, args...) = f(args...).

@JeffBezanson
Copy link
Member

Core has a separate definition of call for its constructors, to allow Base to be replaced. I don't think call(f::Function, args...) is necessary. However we would like call(T::Type, args...) = convert(T, args...), which also conflicts with the Core.call fallback.

I've considered explicitly "importing" Core's call definitions using reflection; e.g. a loop that defines call(::Type{ASCIIString}, args...) = Core.call(ASCIIString, args...) etc. This would solve it but isn't very satisfying.

@JeffBezanson
Copy link
Member

superseded by #8712.

@JeffBezanson JeffBezanson deleted the call_overload branch October 25, 2014 17:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants